An Implementation for Guaranteed Throughput in NoC using Retracing Wave Pipeline Method

DOI : 10.17577/IJERTV3IS100899


Sachin C N Shetty,

Assistant Professor, Dept of ECE, Sahyadri College of Engineering and Management,

Adyar, Mangalore-575007.

Savidhan Shetty C S,

Assistant Professor, Dept of ECE, Sahyadri College of Engineering and Management,

Adyar, Mangalore-575007.

Abstract-In a network-on-chip, it is challenging to design an on-chip switch/router that dynamically supports (hard) guaranteed throughput under very tight on-chip constraints of power, timing, area, and time-to-market. The proposed work presents the design and implementation of a pipeline circuit-switched switch to support guaranteed throughput. The proposed circuit-switched switch, based on a backtracking probing path setup, operates with a source-synchronous wave-pipeline approach. The switch can support a deadlock- and livelock-free dynamic path-setup scheme and can achieve high bandwidth and high area and energy efficiency. The synthesizable implementation of the proposed switch also results in a cost-effective design, fast development time, and portability.

Key Terms: Backtracking, circuit-switched, dynamic path-setup, guaranteed throughput, network-on-chip (NoC), on-chip switch, source synchronous, wave-pipeline.


    The proposed system presents the design and implementation of a pipeline circuit-switched switch to support guaranteed throughput. The proposed circuit-switched switch, based on a backtracking probing path setup, operates with a source-synchronous wave-pipeline approach. The switch can support a deadlock- and livelock-free dynamic path-setup scheme and can achieve high bandwidth and high area and energy efficiency. The synthesizable implementation of the proposed switch also results in a cost-effective design, fast development time, and portability.

    In a network, communication between the source node and the destination node plays a vital role. Packet switching has the drawback that the queuing buffers grow as the communication channel gets longer, which increases area. Hence, we consider circuit switching. Once a communication channel has been set up and the source node is sending data to the destination node, an interruption may prevent the data from being received by the destination node, and the data may be lost. To transmit data successfully from the source node to the destination node, the source node takes an alternative path (backtracking) so that the data reaches the destination.

    That is, data is transmitted in a hop-by-hop fashion. Using backtracking, a higher system throughput can be achieved.

    This work advocates implementing guaranteed throughput with the pure circuit-switching approach, because of the compact router implementation suitable for the on-chip environment and the intrinsic hard QoS property available after a circuit has been set up. A novel, practical pipeline circuit-switched switch design is proposed, termed the backtracking wave-pipeline switch (or BW switch), to support on-chip hard guaranteed throughput applications. The proposed BW switch can meet the requirements of system flexibility and scalability in managing the circuits by supporting a distributed dynamic path-setup scheme.

    An NoC system consists of routers/switches, IPs (CPUs or other hardware modules), and an interconnection structure (topology) such as a mesh, tree, or torus. The design of a basic 2×2 NoC structure, including the switch, IP core, and interconnections, is shown in Fig. 1.

    Fig 1 Block Diagram of Router.

    Time-division multiplexing (TDM) time slots and logical lanes (virtual circuits with priority) are the solutions used for worst-case scenarios in most practical packet-switched NoCs to provide guaranteed data service. The TDM approach faces difficulty in managing huge time-slot tables, while in the virtual-circuits-with-priority approach the restriction of the routing function needed for deadlock-free data transfer may lead to throughput degradation. The pure circuit-switching approach is favored for providing hard guaranteed throughput because of its attractive QoS property: once a circuit is set up, end-to-end data can be pipelined in order at the full rate of the dedicated links with low delay, no data jitter, and in a lossless manner (i.e., without data dropping), since there are no collisions among the data streams. Importantly, without queuing buffers and complex routing/arbitrating implementation, the circuit-switched router results in a low-cost (i.e., area, power) design suitable for the limited on-chip budget. The proposed technique supports a distributed, dynamic, deadlock- and livelock-free path-setup scheme by backtracking the setup header (probe header), without the need for a central control node or any additional network as in previous studies. The work in this paper provides a low fall-through latency and high multi-Gb/s bandwidth by direct forwarding (i.e., wave-pipelining) of source-synchronous data, which suits end-to-end source-synchronous data transfer.


    The key targets of the proposed BW switch architecture are to support the backtracking probing path-setup scheme and to allow direct forwarding of source-synchronous data transmissions. As for the backtracking feature of the path-setup scheme, some design issues are considered first. Among the varieties of backtracking protocol, namely Exhaustive Misrouting Backtracking (EMB), the k-family protocol, and Exhaustive Profitable Backtracking (EPB), we propose to use EPB to reduce on-chip implementation complexity. Moreover, the proposed EPB-based path setup establishes only minimal paths, which results in energy-efficient data transfer. The EPB-based probing path setup performs a straightforward depth-first search of the network using only profitable links. It does not repeatedly search the same path, and it is guaranteed to find a minimal path if one exists.

    Regarding the design of the non-repeated searching feature, a critical consideration is where to store the history information used for backtracking: in the probe header, or distributed across the switching nodes. The former method significantly increases the probe header size and, consequently, the processing time required to route the probe header through the network. This is particularly a problem when the number of links traversed during the path setup becomes very high. Therefore, the latter method is selected, in which the history information is distributed throughout the switching nodes of the network, to reduce the probe header size.
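
    The search described above can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the switch hardware: it assumes a 2D mesh addressed by (x, y) coordinates, treats a "profitable" link as one that reduces the remaining distance to the destination, and keeps the backtracking history in a per-switch dictionary. The names `profitable_neighbours`, `epb_path_setup`, and `busy_links` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Node = Tuple[int, int]      # (x, y) coordinate of a switch in an assumed 2D mesh
Link = Tuple[Node, Node]    # directed link between two neighbouring switches


def profitable_neighbours(cur: Node, dst: Node) -> List[Node]:
    """Neighbours of `cur` that are exactly one hop closer to `dst`."""
    x, y = cur
    out: List[Node] = []
    if dst[0] != x:
        out.append((x + (1 if dst[0] > x else -1), y))
    if dst[1] != y:
        out.append((x, y + (1 if dst[1] > y else -1)))
    return out


def epb_path_setup(src: Node, dst: Node,
                   busy_links: Set[Link]) -> Optional[List[Node]]:
    """Depth-first probe over profitable links only.

    Returns the minimal path found, or None when every minimal path is blocked.
    """
    # History is kept per switch (not in the probe header): the set of
    # output directions this probe has already tried at that switch.
    history: Dict[Node, Set[Node]] = {}
    path: List[Node] = [src]
    while path:
        cur = path[-1]
        if cur == dst:
            return path
        tried = history.setdefault(cur, set())
        nxt = next((n for n in profitable_neighbours(cur, dst)
                    if n not in tried and (cur, n) not in busy_links), None)
        if nxt is None:
            path.pop()              # no untried free output: backtrack one hop
        else:
            tried.add(nxt)          # remember the choice so it is never repeated
            path.append(nxt)
    return None


if __name__ == "__main__":
    blocked = {((1, 0), (2, 0)), ((1, 0), (1, 1))}
    print(epb_path_setup((0, 0), (2, 1), blocked))
```

    In this model the probe first moves to (1, 0), finds both profitable outputs busy, backtracks to the source, and then succeeds through (0, 1). Because the history persists at each switch, a switch reached again over a different route is not searched a second time.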

    Since the history information for backtracking is stored in the switches, the probe header contains only the destination address, e.g., 6 bits for a 64-node network. The probe header moves forward or backward according to the control signals in the switch-by-switch handshake. Because the setup phase and the data transmission phase are separate, the incoming probe header can be transported through the data path, which saves wiring costs.
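
    The header-size figure follows from the number of addressable nodes. The sketch below shows the arithmetic and one possible packing of a destination into the header; the 8×8 mesh layout and the helper names `probe_header_bits`, `encode_dest`, and `decode_dest` are assumptions made for illustration, not the bit format of the actual switch.

```python
def probe_header_bits(num_nodes: int) -> int:
    """Minimum number of bits needed to address any one of `num_nodes` switches."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return max(1, (num_nodes - 1).bit_length())


# Assumed layout for illustration: an 8x8 mesh with x in the low 3 bits
# and y in the high 3 bits of the 6-bit destination field.
MESH_BITS = 3


def encode_dest(x: int, y: int) -> int:
    """Pack an (x, y) destination into a 6-bit probe header."""
    if not (0 <= x < 8 and 0 <= y < 8):
        raise ValueError("coordinates must lie inside the 8x8 mesh")
    return (y << MESH_BITS) | x


def decode_dest(header: int) -> tuple:
    """Recover (x, y) from a 6-bit probe header."""
    return header & 0b111, (header >> MESH_BITS) & 0b111


if __name__ == "__main__":
    print(probe_header_bits(64))           # header width for a 64-node network
    print(decode_dest(encode_dest(5, 3)))
```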

    A compact switch-by-switch handshake is proposed to support such end-to-end communication with the probing path-setup scheme. Table I defines the bit format used. Fig. 2 illustrates the inter-switch and switch-wrapper interconnections with this handshake. Each switch has five bidirectional ports: four ports are connected to the corresponding neighboring switches, and the remaining port is connected to the on-chip IP through a wrapper. In this handshake scheme, one bit is used for the Request (Req) signal to denote the on-probing state (circuit request) or the circuit idling state. Two bits are used for the Answer (Ans) signal, which takes one of three statuses to direct the backpressure flow control to the upstream switch. An Ans status of 01 denotes that the receiver is ready to accept data from the sender, whereas a status of 10 denotes that the intended path is blocked in the network, forcing the probe header to backtrack to discover possible alternative paths. An Ans status of 11 denotes that the receiver is not ready to receive data (e.g., due to being busy, or having an overflow at the receiving buffer).
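
    The Ans encoding above can be captured in a few lines. This is a minimal sketch: the three Ans codes come from the text, while the action names returned by the hypothetical `upstream_action` function, the polarity of Req (1 taken to mean "on-probing"), and the treatment of the unlisted code 00 as "no answer yet" are assumptions.

```python
from enum import Enum


class Ans(Enum):
    """Two-bit Answer codes as described in the text."""
    READY = 0b01       # receiver ready to accept data
    BLOCKED = 0b10     # intended path blocked: probe must backtrack
    NOT_READY = 0b11   # receiver busy or receiving buffer overflowing


def upstream_action(req: int, ans_bits: int) -> str:
    """Reaction of the upstream switch to the Req/Ans pair (illustrative names)."""
    if req == 0:
        return "idle"                  # assumed: Req = 0 means circuit idling
    if ans_bits == 0b00:
        return "wait_for_answer"       # assumption: 00 is not defined in the text
    ans = Ans(ans_bits)
    if ans is Ans.READY:
        return "send_data"
    if ans is Ans.BLOCKED:
        return "backtrack"
    return "hold"                      # NOT_READY: stall until the receiver recovers


if __name__ == "__main__":
    for bits in (0b01, 0b10, 0b11):
        print(format(bits, "02b"), upstream_action(1, bits))
```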

    Fig 2. Switch-by-switch interconnection scheme.

    Arbiter Design

    Several input ports, acting as requestors, may want to access a common physical channel resource. In this case, an arbiter is required to determine how the physical channel is shared among the requestors. The arbiter module handles all requests for the output ports of the crossbar. When a packet is injected into an input port of a router, it is directed to a FIFO buffer. The FIFO module sends the routing address of the packet to the arbiter as a request event. At each rising clock edge, the arbiter first checks whether any output port is free. If a port is free, the arbiter enables the free-output bit for that port, meaning that the port is ready to operate. The arbiter then checks its request inputs. If a request is active, the arbiter reads the destination address and checks whether the requested output port is free. If it is free, the packet goes through that output port. The arbiter then disables the corresponding bit in the free-output variable, meaning that no other data can be sent through the port. This bit stays disabled until the next clock event. If the output port is not available, the request stays unanswered until the next clock event. On the falling clock edge, the arbiter handles the credit-out signals.
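
    The rising-edge behaviour described above can be modelled as a single function call per clock cycle. The sketch below is an assumption-laden illustration: the text does not state how competing requests for the same output are ordered, so a fixed priority (lowest input port first) is assumed, and the names `arbitrate_cycle`, `requests`, and `busy_outputs` are hypothetical. The falling-edge credit handling is not modelled.

```python
from typing import Dict, Set


def arbitrate_cycle(requests: Dict[int, int],
                    busy_outputs: Set[int],
                    num_outputs: int = 5) -> Dict[int, int]:
    """Model one rising clock edge of the arbiter.

    `requests` maps input port -> requested output port.
    Returns the grants issued in this cycle as input port -> output port.
    """
    # Step 1: set the free-output bit for every output that is not in use.
    free_output = [port not in busy_outputs for port in range(num_outputs)]

    grants: Dict[int, int] = {}
    # Step 2: serve requests; fixed priority by input port is an assumption.
    for in_port in sorted(requests):
        out_port = requests[in_port]
        if free_output[out_port]:
            grants[in_port] = out_port
            free_output[out_port] = False   # bit stays cleared for this cycle
        # Otherwise the request stays unanswered until the next clock event.
    return grants


if __name__ == "__main__":
    # Inputs 0 and 1 both want output 2; input 3 wants output 4.
    print(arbitrate_cycle({0: 2, 1: 2, 3: 4}, busy_outputs=set()))
```

    Calling the function again on the next cycle with the still-pending requests models the "unanswered until next clock event" behaviour.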

    FIFO Buffer

    A simple schematic of a typical SRAM-based FIFO is shown in Fig. 3. Two pointers (read and write) hold the SRAM addresses from which data is read and to which data is written, respectively. In the figure, if a read event occurs, the cell p1, which is pointed to by the read pointer, appears at the output, and the read pointer is then incremented by one (so that it now points to p2). If a write event occurs, the input data is saved at the location addressed by the write pointer, and the write pointer is then incremented by one. The difference between the write pointer and the read pointer determines whether the FIFO is full or empty. A full condition occurs when a write causes the difference between the two pointers to equal the depth of the FIFO, and an empty condition occurs when a read causes the difference to equal zero. Both the read and write pointers advance circularly, as shown in Fig. 3.

    Fig 3 Schematic of SRAM based FIFO.

    When the FIFO is full, no more data may be saved in it. The status count block subtracts the read pointer from the write pointer to generate the full and empty conditions.
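
    The pointer arithmetic above can be sketched in software. This is an illustrative model rather than the SRAM design: the class name `PointerFifo` is hypothetical, and letting each pointer count modulo twice the depth (one extra wrap bit) is an assumed way of making the pointer difference distinguish "full" from "empty".

```python
class PointerFifo:
    """Circular FIFO whose status is derived from the pointer difference."""

    def __init__(self, depth: int) -> None:
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self.depth = depth
        self.mem = [None] * depth
        # Pointers count modulo 2 * depth so that full and empty differ.
        self.rd_ptr = 0
        self.wr_ptr = 0

    def _fill(self) -> int:
        """Status count: write pointer minus read pointer, taken circularly."""
        return (self.wr_ptr - self.rd_ptr) % (2 * self.depth)

    def is_empty(self) -> bool:
        return self._fill() == 0

    def is_full(self) -> bool:
        return self._fill() == self.depth

    def write(self, value) -> bool:
        """Store `value`; returns False (and stores nothing) when full."""
        if self.is_full():
            return False
        self.mem[self.wr_ptr % self.depth] = value
        self.wr_ptr = (self.wr_ptr + 1) % (2 * self.depth)
        return True

    def read(self):
        """Return the oldest stored value; raises IndexError when empty."""
        if self.is_empty():
            raise IndexError("read from empty FIFO")
        value = self.mem[self.rd_ptr % self.depth]
        self.rd_ptr = (self.rd_ptr + 1) % (2 * self.depth)
        return value


if __name__ == "__main__":
    fifo = PointerFifo(depth=2)
    fifo.write("p1")
    fifo.write("p2")
    print(fifo.is_full(), fifo.read())
```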


    The crossbar with internal transceivers is the key component for wave-pipelining the source-synchronous data. Following the layered design concept of the NoC paradigm, the router/switch and the transceiver (with the inter-router link) can be designed independently. They can cooperate in NoCs, provided the interface between them is defined. The design of source-synchronous transceivers (with wave-pipelined links) has become common practice. It is studied to improve energy efficiency and data rate and to combat PVT variations, random mismatch, and crosstalk.

    In these transceivers, the received data can be realigned with the received (source) clock (as in a circuit-switched NoC), or with a local (router) clock (often with a synchronizing first-in first-out (FIFO) buffer, as in a packet-switched NoC). When cooperating with a circuit-switched switch such as the BW switch, the realignment of data to the source clock can be applied. In this scheme, the common interface between the switch and the source-synchronous transceiver consists of the data (from the data registers) and the source clock signals. In the BW switch, the wave-pipeline concept means that the switch directly forwards the source-synchronous data (i.e., data along with the source clock) from its inputs to the corresponding outputs.

    Backtracking Path-Setup Scheme

    The path-setup scheme is essential and directly affects the overall performance of the circuit-switching approach. An analysis at the network level confirmed the good performance of the backtracked-routing circuit-switched NoC with the torus topology under certain communication patterns. In particular, in communications with larger packets, the data transmission duration can be long. It then outweighs the setup delay overhead, which improves the overall network performance. This section focuses on analyzing the properties and the network-level performance of the proposed EPB-based probing path-setup scheme supported by the BW switch. The use of the switch-by-switch handshake is illustrated through examples of end-to-end flow-control operations, as shown in Fig. 4. Fig. 4(a) shows an end-to-end communication example in which a successful path setup occurs without backtracking.
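
    The point about longer transmissions outweighing the setup delay can be made concrete with a back-of-the-envelope model. The sketch below assumes that, once the circuit is established, one data word is delivered per clock cycle; the function name `circuit_efficiency` and the cycle counts used are hypothetical and are not measurements from this work.

```python
def circuit_efficiency(setup_cycles: int, payload_words: int) -> float:
    """Fraction of the total circuit time spent transferring payload.

    Assumes one payload word per cycle after the path has been set up.
    """
    if setup_cycles < 0 or payload_words <= 0:
        raise ValueError("setup_cycles must be >= 0 and payload_words > 0")
    return payload_words / (setup_cycles + payload_words)


if __name__ == "__main__":
    # Illustrative numbers only: a 20-cycle setup with growing payloads.
    for words in (20, 180, 1980):
        print(words, round(circuit_efficiency(20, words), 3))
```

    Under these assumptions the efficiency rises from one half towards one as the payload grows, which is the amortization effect described above.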

    Fig 4. Examples of the end-to-end flow-control operation where the switch-by-switch handshake is used. (a) Successful path-setup without backtracking. (b) Successful path-setup with backtracking. (c) Failed path-setup due to a busy destination, and a retry. (d) Failed path-setup because all possible paths are blocked, and a retry.

    Fig. 4(b) shows an example similar to one in Fig. 4(a), where backtracking occurs in the setup phase.

    Fig. 4(c) and (d) illustrate the use of the Ans signal when a path setup fails and a retry is then attempted. In summary, this section has introduced the concept of the probing path-setup scheme working with the compact switch-by-switch handshake, which is used in the proposed BW switch.


    1. The proposed technique supports a distributed, dynamic, deadlock- and livelock-free path-setup scheme by backtracking the setup header.

    2. The work in this paper provides a low fall-through latency and high multi-Gb/s bandwidth by direct forwarding (i.e., wave-pipelining) of source-synchronous data, which suits end-to-end source-synchronous data transfer.


    The HDL-based implementation of the BW switch presented in this paper, using standard cells, results in a short design time and good portability. There is room for further optimization because the data path is implemented separately from the control part.

    Fig 5 Sixth node is backtracking.


In future work, the BW switch supporting guaranteed throughput can be evaluated with real-world NoC-based applications.


