Implementation of Multicast Router using Mulicast Rotational Routing Algorithm

Download Full-Text PDF Cite this Publication

Text Only Version

Implementation of Multicast Router using Mulicast Rotational Routing Algorithm

Monika A Vijayalakshmi Y

MTech [Electronics] Associate Professor,

E&C Dept., SIR MVIT E&C Dept., SIR MVIT

Bangalore, Karnataka, India Bangalore,Karnataka,India

Abstract-Routers may encounter the situation of hotspot and congestion also does not have function of multicast. And to handle several destination addresses at the same time, recombine and to avoid congestion, we propose a routing arithmetic called Multicast Rotational Routing Algorithm (MRRA). This helps in sending the same data to different destination nodes. Merging NOC to third dimension has been proposed as a promising solution offering lower latency and high throughput. The 3D NoC router is designed in Verilog HDL, simulated by Modelsim, synthesized and prototyped by Synplify on FPGA respectively. The experiment results show that the multicast router of MRRA can achieve a very good performance in terms of throughput and delay.

Keywords: NOC Router, Multicast, Routing Arithmetic, 3D NOC, Rotational Routing

  1. INTRODUCTION

    In computer networks, networked computing devices pass data to each along data connections. Data is transferred in the form of packets. Network packet is a formatted unit data carried by a packet switched network. To support for the network it is necessary to adopt hardware components, makes transmission of data easier. There are many different systems for the network in transmission of data. Data will be of different form, and this can be transformed depending on the requirements. With the development of Integrated Circuit, billions of transistors are integrated on a single chip. Transistors, the basic composing elements of chips, are decreasing in size and increasing in number. This is demonstrated by the Moore's law in 1965. The Moore's law states that every 18 month the number of transistors and thus hardware blocks doubles on the chip. Due to this incident the density of transistors and hardware blocks are increasing on the chip exponentially. This caused to forming a complete system on a chip that called System on Chip (SoC). As a consequence, in today's chip design, communication between on-chip cores is becoming more expensive than on-chip computation [1].

    For the development of better IC industry many researchers want to find new technology. Providing an interconnect system for the chip is becoming more costly than on-chip blocks. Therefore, the designers has proposed to exchange the conventional on-chip interconnect system called new on-chip interconnect, named as Network on Chips (NoC). It provides high possibility for reusability. NoC architecture is composed of routers, Network Interfaces (NI) and links.

    NOC is constructed from multiple point to point data links interconnected, such that messages can be relayed from source to multiple destination addresses at the same time without any delay. Busses/shared buses were main means to connect the components. But due to the increased advancements in SOC and NOC technologies, problems related to bus appeared. The number of interconnection lines increased with the development of very large scale integrated circuits and with design methods of NOC. Due to which there is restricted performance in area, speed and power consumption. So a new technology is needed to break the restrictions and to enhance, improve the chip performance. Hence to uphold for such systems 3D NOC is introduced. Three dimensional NOC is an integrated circuit which consists of several dies stacked by through silicon via. Comparably with 2D NOC it has high throughput, reduced power and improved performance, eliminating many limitations of plane. 3D NOC which considered as new method for high speed growth of chip integration by which degree increases significantly.The major advantage of 3D ICs is the considerable reduction in the length of global interconnections, resulting in an increase in the performance and a decrease in the power consumption and area of wire limited circuits.

    There are Different types of 3D NOC topologic structures; some are 3D Torus, 3D Mesh, 3D Stacking Mesh and so on. In which 3D Mesh topologic structure is quite efficient and caused extensive concern. mesh topology has more efficient than others ,as each and every nodes are interconnected to each other there is no delay in transmission so increases the performance. Other topologic structures are also data enhance providers even they have their own importance with their respective applications [7]. Combining NOC with 3D integration leads for a new architecture called 3D NOC and this architecture increases scalability of the system.So adopting a appropriate routing algorithm to get the balance between the time delay and throughput rate becomes the key problem. In this paper we are concerned about low latency, high throughput, low overhead and high performance for NOC router. Deterministic routing algorithm has been adopted called multicast rotational routing algorithm (MRRA). This MRRA avoids congestion and hot spots, also mainly supports for multicast transmission. At the same time, it can keep the hardware complexity as reasonable as possible.

  2. MULTICAST TRANSMISSION

    Routing is a process of selecting best path in a network. It directs packet transmission through intermediate node. Most of the papers are focused on transmission of data from node to node using unicast. But unicast delays in transmission of information as it is single path. So sending same data from a node to several node is required, which is achieved by multicast as shown in

    Fig 1.(S: Source node, D1, D2, D3: Destination nodes)

    Fig 1: Multicast transmission node path

    If node S wants to send packet data to destinations D1, D2, D3 it needs send thrice separately, in case of unicast. Here with multicast routing same data can be sent to all three destination nodes at a time without any delay. By multicast, firstly, when the packet arrives at node D1, it will be sent to D1s IP core. Secondly, at the same time, it is also copied and sent to node A3 and A2. At last, it will be sent to D3 and D2 through A2 and A3.So it advantage to use multicast routing instead unicast, which improves throughput and latency. Also overcome the problem of delay, improves performance of the respective system.

  3. RELATED WORK

    Network on chip has been a rapidly developed concept in recent years to tackle the crisis with focus on network based communication. Buses and point to point connections were main mean of transmission. But new technologies advances further, problems related to bus appeared the most. First, buses do not scale as the number of communication partners connected become higher. Second, long and global wires and buses become undesirable due to their low and unpredictable performance, high power consumption and noise phenomenon. Third, due to the unpredictability of the communication performance, designing and verifying a large bus based communication networks is very hard. Fourth, every system has a different communication structure, making its reuse difficult [10].

    Micro-architectural configurations of buffers in routers have a significant impact on the overall performance of an on-chip network (NoC). This buffering can be at the inputs

    or the outputs of a router, OBRs(0utput buffer router) are attractive because they have higher throughput and lower queuing delays under high loads than IBRs(input buffer router). However, a direct implementation of OBRs requires a router speedup equal to the number of ports, making such a design prohibitive give the aggressive clocking and power budgets of most NoC applications.so based on distributed shared buffer efficient pipelining and novel flow control is achieved.

    In XY routing algorithm, a packet must always be routed along horizontal or X axis until it reaches the same column as that of destination. Then it should be routed along vertical or Y axis and towards the location of destination resource. Moves only in 2D dimension, so to enhance the system new routing algorithm is introduced with Z direction called 3D routing arithmetic.

    Paper [1] have described an adaptive routing algorithm which is based on deterministic XY routing algorithm. In their model a switch is a context-aware agent and a network is a society of context-aware agents which are ever learning and adapting to distribute the congestion uniformly and isolate the malformed switches (agents). In conventional XY routing, first, the load in the center of a network is much higher rather than total average and this leads to hot spot in the center of network. And second, a malfunction in switches could make part of network out of access. But in their proposed routing, first, all agents are aware from their neighbor's congestion and collaboratively try to route their input packets through less congested route, according to their experiences learned before and second, when a new malfunction takes place in the network, all agents collaborate each other to recognize the malformed agent and learn the best route. The main objective of their routing algorithm is to distribute network load uniformly throughout the network.

    Paper [2] have described a pseudo adaptive routing which is an extension of classic XY routing. They consider mesh topology for evaluating proposed routing. Their switches use Pseudo adaptive XY routing algorithm. The load in the center of a network in ordinary XY routing is much higher rather than total average. This extra load on the center of mesh can cause spot hot. The main objective of their routing algorithm is to distribute network load. Their routing algorithm has two modes that is deterministic and adaptive Packets are routed with classic XY routing (deterministic mode), when congestion in the network is low. When congestion is high, packets will be routed through less congested route adaptively.

    Paper [3] have presents an improved topology called Tmesh, which is based on standard mesh network by inserting four long links to connect the vertices to reduce the communication delay between some remote nodes. They also present a deadlock-free routing algorithm for Tmesh named TXY algorithm. The results of this algorithm show a certain reduction in the average packet delay and routing hops.

    An efficient hierarchical router for large 3D NoCs was presented in [4]. This router is composed of 2 decoupled modules, one for intra-layer communication and other for

    inter-layer communication.so we get improved throughput and latency.

    Paper [5] describes a parallel multicast testing method to avoid hot spot and put forward an idea of multicast router. But here fails to use that multicast router architecture.

    In this paper[6], presents an efficient routing algorithm for 3D-NoC named Look-Ahead-XYZ (LA-XYZ). This algorithm aims to minimize the communication latency and power consumption while enhancing the system throughput. Comparison results with systems adopting two dimensional routing showed that, using JPEG encoder and Matrix-multiplication applications, LA-XYZ reduces the communication latency. In paper [7], there would be advance routing computing tasks. Thus throughput of the

    router could be improved. But it is not suitable for multicast transmission.

    In paper [8], implementation of a router for vertically- partially-connected 3D-NoCs based on stacked 2D-meshes. This router implements the necessary hardware to support a recently introduced routing algorithm called "Elevator- First", which targets topologies with irregularly placed vertical connections in a deadlock free manner, using only two virtual channels in the plane. The micro-architectural design shows that the proposed router requires little additional hardware. Our studies about the practicality of the algorithm and its router implementation demonstrate that it has low overhead compared to a router for fully connected 3D NoCs. As it uses Elevator-first Routing

    proposed algorithm is implemented in our multicast 3D NoC router. This 3D NOC router is designed in Verilog, simulated by Modelsim, synthesized and prototyped by synplify on FPGA respectively. the proposed multicast router supporting MRRA has a very good performance in terms of throughput and delay. The increased area utilization also can be tolerated.

  4. 3D MULTICAST ROTATIONAL ROUTING ARITHMETIC

We propose a low latency, low overhead, high throughput 3D NOC router arithmetic called MULTICAST ROTATIONAL ROUTING ARITHMETIC based on

multicast dimension order routing algorithm, in order to avoid congestion and delay and also to realize multicast function. In traditional dimension order routing algorithm, packages are always transmitted along the same direction until get the destination dimensionality. Our algorithm can avoid network congestion transmit packet through direction of x-axis and y-axis respectively. Algorithm is shown in Fig 2 and process is enlarges for its appropriate transmission of packets.

The rotational routing arithmetic flow for 3D multicast router is shown in Fig 2.

  1. Router first sends packets to their destination layers through the direction of z-axis.

    Down

    Down

    Up

    Up

    Algorithm can transmit packets to their destination nodes in irregular topological structure.

    Paper [9] proposes a new network-on-chip (NoC) that handles accurate localizations of the faulty parts of the NoC. The proposed NoC is based on new error detection mechanisms suitable for dynamic NoCs, where number and position of processor elements or faulty blocks vary during runtime. Indeed, it proposes online detection of data packet and adaptive routing algorithm errors. Both presented mechanisms are able to distinguish permanent and transient errors and localize accurately the position of the faulty blocks in the NoC routers, while preserving the throughput,

    CZ=DZ

    y

    N

    CZ>DZ

    y

    N

    CX>DX

    N

    CY=DY

    y

    N

    Flag Direction 1 North 0East

    Flag Direction 1 North 0East

    N

    Flag Direction 1Southth 0East

    Flag Direction 1Southth 0East

    CY>DY

    Flag Direction 1North

    0Wast

    Flag Direction 1North

    0Wast

    y

    East

    East

    N

    the network load, and the data packet latency.

    Dimension Order Routing (DOR) is a typical minimal turn algorithm. The algorithm determines to what direction packets are routed during every stage of the routing. DOR is used with XYZ algorithm for better efficient performance but does not supports for multicast, uses unicast. Due to the complexity of routing control, 3D NOC often uses fixed priority routers based on the static XYZ

    y

    N

    CX=DX

    y

    N

    N

    CY=DY

    West

    West

    y

    N

    CY>DY

    CY>DY

    Flag Direction 1Southth 0West

    Flag Direction 1Southth 0West

    y

    North

    North

    routing algorithm. Packages are transmitted from source to destination nodes through the direction of z-axis, then depending on the conditions employed over moves towards x-axis and y-axis. Sometimes it is difficult to adopt dynamic changes of network flow, also could cause congestion and hotspot.

    In this paper, we propose a low latency, low overhead, High throughput 3D NoC router arithmetic named Multicast Rotational Routing Algorithm (MRRA). The MRRA supports multicast transmission, and can avoid congestion and hot spot. At the same time, it can keep the hardware complexity as reasonable as possible. The

    CY=DY

    South

    South

    y

    Local

    Local

    y

    Fig 2: 3D Rotational Routing Algorithm

    Start

    Start

    1. ROUTER ARCHITECTURE

      No.of destination address

      Use routing algorithm shown in Fig 2.

      Use routing algorithm shown in Fig 2.

      Split and combine packet based on direction of destion address

      Split and combine packet based on direction of destion address

      Current layer=desti nation layer

      Transmit it to its destination layer

      Transmit it to its destination layer

      Destination nodes in same layer

      The 3D NOC router architecture is based on 2D NOC architecture, where 2D NOC router consist of 5 inputs ports and 5 output ports, they are North, South, East, West and Local. Where as in 3D NOC 2 input ports and 2 output ports that is up and Down ports are included in this paper. The architecture mainly consists of input ports, output ports, Packet Recombining, Routing and virtual memory allocator. Even though both 3D integrated circuits and NOCs are proposed as alternatives for the interconnect scaling demands, the challenges of combining both approaches to design three-dimensional NOCs have not been addressed.

      Use routing algorithm shown in Fig 2.

      Use routing algorithm shown in Fig 2.

      Transmit them to their destination layer

      Transmit them to their destination layer

      Every input buffer port buffer has a 2 queues and every queue has a virtual channel of its own to store intermediate values. The routing computing module fetches the next node address and packet recombining module which recombines the packet for multicast. When a packet arrives one of the routers in 3D NOC, it is stored in the FIFO input buffer. If other packets which were stored before it have been taken out from the input buffer, current packet will be sent to the routing computing.

      Each and every module plays their respective role depending on the respective conditions.

      Use routing algorithm shown in Fig 2.

      Use routing algorithm shown in Fig 2.

      Fig 3: Multicast Rotational Routing Arithmetic

  2. Then sends packets to their destination nodes through the direction of x and y axis:

  3. In order to transmit packet evenly, we add flag(size of flag will be one bit )

  4. If there is only one output can be choose, the packet will be transmitted through this output.

  5. When a packet reaches one of the routers and the directions of x-axis and y-axis output ports are available simultaneously, it will be transmitted through the direction of x-axis if the flag is 0.

    North

    B Recombie

    U F

    F Routing

    Crossbar

    UP

    B U

    F Recombinee

    F

    Routing

    Buffer

  6. If the flag is 1, it will be transmitted through the direction of y-axis.

  7. Reverse the flag.

If packet carries several destination addresses then transmission method shown in Fig 3 is as follows:-

  1. Router will analyze for the addresses which the packet contains and split the packet.

  2. The data will combine with these addresses whose directions of destination nodes are the same.

    west

    Down

    Recom R

    binee o u

    Buffer

    R

    Recom o

    binee u ti

    B Recombin

    U F

    F Routing

    R Rec

    O omb

    U inee

    East

  3. Several separate data packets which will send to their destination nodes based on the rule which has expounded in context are packed or else follow with the Fig 2.

Buffer South

Fig 4: 3D Multicast Router Architecture

Local

  1. REALIZATION OF 3D MULTICAST ROUTER

    3D NOC multicast router is implemented based on multicast rotational routing algorithm, which uses domain routing algorithm and wormhole like switching. The next parts of the paper give overview of router architecture (computing, combining, Routing) and results.

    . This module computes several next-node addresses based on the current-node address and the destination addresses which are carried by the multicast packet. Then the packet is recombined by the packet recombining module based on the different next node addresses. At the same time, the output-ports of several recombined packets are arbitrated based on our 3D multicast rotational routing algorithm.

    New packets and their output-port information will be sent to the crossbar. At last, the crossbar sends them to their destination output-ports.

    B. ROUTING COMPUTING MODULE

    In multicast rotational routing algorithm each one of the seven ports (which are the east, west, south, north, up, down and local ports) is composed of three main elements: input buffer, routing computing module and packet recombining module. Among them, the routing computing module is one of the most important parts in our router.

    The data packet will be sent to the routing computing module from the FIFO input buffer when other packets which were stored before it have been taken out from the buffer. At the same time, the current-node addresses (current x-address, current y-address and current z-address) are also sent to the routing computing module. In order to concurrently compute destination output-port of several different destination addresses, we set several routing comparers.

    B. PACKET RECOMBINING MODULE

    Packet recombining module plays quite important role in multicast router. It also solves the problem of congestion and hotspots. In this module, packet may be split into several separate data packets. When the packet recombining module receives the signal of request from the routing computing module, and if it is not busy, the data packet and several destination output-ports information of different destination addresses will be received.

    The destination output-port data gives the information of direction relative to the current node. The destination addresses with the same direction will be recombining together with the data part. After that, there are several separated data packets in this module. Their destination output-ports will be recomputed through read the flag of rotational routing which is one bit. If the flag is 0 and the direction of x-axis and y-axis output ports are available simultaneously, it will be transmitted through the direction of x-axis. If not, it will be transmitted through the direction of y-axis. Then the flag is reversed.

    These packets and their output-port information are sent to the crossbar. The crossbar transmits these packets to their destination output-port. At last they will be sent to next node through output-port.

    1. EVALUATION

      The system is evaluated by designing Verilog HDL and simulated using modelsim and synplified on FPGA. To verify the validity of our 3D Multicast Rotational Routing Algorithm, we first use the Modelsim which is a very famous commercial simulation tool in the electronics industry. Make clear that the data packet can be transmitted to their destination output-port based on the destination addresses which it takes.

      Fig 5: Simulated waveform

      So our routing algorithm can satisfy the requirements of function and timing. The Fig 5. Shows output for simulated 3D Multicast Rotational Routing Algorithm. Output is based on respective outcome of the nodes. Here in this paper considering for 4*4 Mesh topology results are analyzed. Depending on the usage of slices on FPGA respective module have been created for slice usage which permits to know the results of evaluated structure.

    2. CONCLUSION

In this paper, it is presented 3D Multicast Rotational Routing Algorithm, which is used in our designed 3D NoC. It can handle several destination addresses at the same time, recombine packets, and send the same data to their different destination nodes. With the use of rotational routing algorithm we can avoid the congestion. Hence the proposed multicast router can achieve a very good perfomance in terms of throughput and delay.

REFERENCES

  1. Ruizhe Wu, Yi Wang & Dan Zhao,A Low Cost Deadlock- free Design of Minimal-Table Rerouted XY-Routing for Irregular Wireless NOCs, 2010, Fourth ACM/IEEE International Symposium on Networks-on-Chip, pp. 199 206.

  2. Mohsen Nickray, Masood Dehyadgari & Ali Afzali- kusha, Adaptive Routing Using Context-Aware Agents for Networks on Chips, IEEE, 2009, pp. 1-6.

  3. Masood Dehyadgari, Mohsen Nickray, Ali Afzali- kusha, Zainalabein Navabi, Evaluation of Pseudo Adaptive XY Routing Using an Object Oriented Model for NOC, IEEE, 2005, pp. 204- 206.

[4]F. Fang, Y. H. Han, X. W. Li. A Thermal-Aware Parallel Multicast Testing Method Based on Many-Core Chips, Journal of Computer- Aided Design & Computer Graphics, vol.22, no.5, pp. 845-851, 2010.

  1. A. B. Ahmed, A. B. Abdallah, LA-XYZ: Low Latency, High Throughput Look-Ahead Routing Algorithm for 3D Network-on- Chip (3D-NoC) Architecture, Proceedings of IEEE 6th International Symposium on Embedded Multicore SOCs, pp.168- 174, 2012.

  2. W. Lafi, D. Lattard, A. Jerraya, An efficient hierarchical router for large 3D NoCs, Proceeding of Rapid System Prototyping (RSP), 21st IEEE International Symposium, pp.105-110, 2010.

  3. M. Bahmani, A. Sheibanyrad, etc., A 3D-NoC Router Implementation Exploiting Vertically-Partially-Connected Topologies, Proceeding of IEEE Computer Society Annual Symposium on VLSI, pp.9-14, 2012.

  4. Cédric Killian, Camel Tanougast, Smart Reliable Network on Chip for error detection mechanism, IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 22, NO. 2, FEBRUARY 2014.

  5. I. Loi, S. Mitra, T. H. Lee, S. Fujita, and L. Benini. A low-overhead fault tolerance scheme for tsv-based 3D network on chip links. In Proceedings of International Conference on Computer-Aided Design, pages 598602, Nov. 2008.

  6. V. F. Pavlidis and E. G. Friedman. 3-D topologies for networks on- chip. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 15(10):10811090, 2007.

  7. Y. Tsai, F. Wang, Y. Xie, N. Vijay Krishnan, and M.Irwin. Design space exploration for three- dimensional cache. IEEE Transactions on Very Large Scale Integration Systems, 16(4):444 455, 2008.

  8. Rahmani, A.-M.; Latif, K.; Vaddina, K.R.; Liljeberg, P.; Plosila, J.; Tenhunen, H.; , "Congestion aware, fault tolerant, and thermally efficient inter-layer communication scheme for hybrid NoC-bus 3D architectures," Networks on Chip (NoCS), 2011 Fifth IEEE/ACM International Symposium on , vol., no., pp.65- 72, 1-4 May 2011.

  9. Chia-I Chen; Bau-Cheng Lee; Juinn-Dar Huang; , "Architectural exploration of 3D FPGAs towards a better balance between area and delay," Design, Automation & Test in Europe Conference & Exhibition (DATE), 2011 , vol., no., pp.1-4, 14-18 March 2011.

[14]W. Dally and C. Seitz, Deadlock-free message routing in multiprocessor interconnection networks, IEEE Trans. Comput., vol. C-36, no. 5, pp. 547553, May 1987.

[15] M. Hosseinabady, M. Kakoee, J. Mathew, and D. Pradhan, Low latency and energy efficient scalable architecture for massive NoCs using generalized de Bruijn graph, IEEE Trans. Very Large Scale Integer. (VLSI) Syst., vol. 19, no. 8, pp. 14691480, Aug. 2011.

Leave a Reply

Your email address will not be published. Required fields are marked *