# Introduction

etwork-on-Chip designs consist of a number of interconnected devices (e.g. general or special purpose processors, peripherals, memories etc.) where communication between these system modules is achieved by sending packets over an interconnection network. Packets are transported from one to another module via the interconnection networks [1]. Thus, the interconnections among multiple cores on a chip have a significant impact on communication and performance of the chip design in terms of dynamic energy consumption, latency, throughput etc. So topology selection is an important choice in the design of NoC and in Network on Chip topology design is one of the significant factor that affects the net delay of the system and the dynamic communication energy of the system. Topological parameters, like hop count, wiring layout are closely related to the performance and power dissipation of a network [13]. In other words, it can be said that the NoC performance can be greatly affected by the underlying topology and the routing function followed by the communication cores.

NoC architecture can be designed as regular or application specific network topology. Application-specific network topologies are designed to make the system more application centric [2]. As a result, the performance objectives such as power consumption, latency, area of the chip, number of routers in the system can be optimized effectively with the given design constraints [3]. However the loss of modularity in application-specific NoC generally leads to increased time-to-market in comparison to the modular generic standard NoC topologies.

Regular topologies like 2D mesh are more structured and built not considering much about the application characteristics. A regular NoC topology assumes a homogeneous distribution of routers, which leads among others to lower design time and cost. Therefore regular NoC topologies are suitable for mass production due to lower design cost and effort required. Additionally, regular topologies are highly reusable and impose the minimum re-design effort, in case they are employed to different applications.

However, due to the lack of fast paths between remotely situated nodes and many hops between different communicating nodes regular topology architectures for a given routing methodology may suffer from long packet latencies and higher communication power dissipation. Although the regular structure significantly address the implementation issues, such as floor planning, unequal lengths of wire, hence results in poorly controlled electrical parameters.

In order to meet the communication requirements, accelerate time-to-market and cut down the communication energy consumption, there is a great need to find new design alternatives of customizing the regular architectures as per the communication requirements. We therefore focused our research on exploring and finding an efficient methodology to tailor the regular 2D mesh topology so that the dynamic communication energy can be optimized. The methodology will optimize the dynamic communication energy consumption by reducing the hop count or by reducing the number of short links according to the traffic characteristics of the application for a given routing function. Tailoring the regular topology shall reduce the design time in comparison to the design of application specific NoC from stretch with reasonable performance advantages.
1 Year 2016 ( ) E Samta Jain ? , Vaishali Sodani ? & Naveen Choudhary ?
In [4], the authors introduce additional random links to the mesh and torus network and investigate the performance of these networks. In [5], authors present a methodology to automatically synthesize an architecture where a few application-specific long-range links are introduced on a regular mesh network. Drawing inspiration from such work this paper proposes the methodology to insert energy conscious shortcut links with practical design constraints such as link length, port constraints. The proposed design also strives to add minimum number of shortcuts so as not to distinguish the original design in terms of area and floor-planning.

In section 2, NoC communication energy and routing functions are discussed. In section 3, the proposed methodology is given. In the next section, the experimental results are discussed and in the last section the conclusion is presented based on the results.


# II.


# Noc Communication Energy and Routing Function a) Dynamic Communication Energy Model

The dynamic communication energy model [7] for the network on chip can be defined as:
E bit (ti, tj) = n hops × E rbit + ( n hops-1) × E Lbit (1)
Where E bit (ti, tj)

= the average energy consumption for sending one bit of data from t i to tile t j , n hops = the number of routers the bit traverses from tile t i to tile t j , E rbit = the energy consumed by the router for transporting one bit of data, E Lbit = the energy consumed by unit link/channel for transporting one bit of data. For the NoC networks with unequal link length, the ((nhops -1) x E Lbit ) can be replaced as the summation of bit energy consumed by each link in the path/route, the bit follows from source tile to the destination tile [21].

Hop count is the number of routers the bit traverses from source tile to tile destination tile. The communication energy model [7] for the network on chip clearly shows that the latency and power consumption linearly relates to the average number of hops the packet traverse, hence to the network topology. So we will optimize the latency and power consumption by reducing the hop count between the source tile and destination tile for a given routing function.


# b) Routing Function in NoC

Routing function determines how the data is transmitted from source tile to destination tile. Routing algorithm can be grouped according to different criteria. According to the place where routing decision is made, routing algorithm can be classified as centralized, source and distributed routing algorithm. In centralized the route is chosen by a central controller, in source the route is chosen by a source router prior to a sending the packet, in distributed the route is chosen by an intermediary routers [16,17]. According to how a route is defined to route the packets, routing algorithm can be deterministic or adaptive. In deterministic routing, one path is calculated between source and destination and routing is done through that fixed path only.

Deterministic routing algorithm always routes the packets from source to destination along that same path [15]. Deterministic algorithm does not take into account network traffic situation or the network condition before choosing a path from source to destination. In adaptive routing algorithms multiple paths are calculated between source and destination but routing is done in one selected path depending upon the network traffic. In adaptive algorithm network traffic situation (such as network load, traffic condition) are taken into consideration. Limitations of adaptive routing algorithm are its implementation cost, complexity and more power consumption [16]. Deterministic routing algorithm is popular in NoC due to simplicity, low latency, simple routing logic, packet reach destination in correct order and reordering is not necessary [7]. It is a greedy algorithm because it always chooses a shortest path to deliver a packet from source to destination [17].

The design space of efficient deadlock-free routing is an important aspect which is more implicitly affected by the routing algorithm which determines which routes packets can take from source to destination. A deadlock is a situation where packets are involved in a circular wait for resources. Deadlock freedom is an important property for networks, since a packet deadlock can destroy possibilities of communication [18]. Deadlocks in a network may block communication between cores and may even lead to a complete failure of network. Therefore it is essential to have algorithms which have ability to handle them. The deterministic routing [7] (e.g. XY routing [14,19] and odd-even routing [15,19]) are the most widely used deadlock free routing algorithms for the popular 2Dmesh based on chip networks.


# III.


# Proposed Methodology

In the presented work, an energy conscious topology augmentation methodology is proposed for application specific on-chip interconnection networks. In the proposed methodology a customized NoC topology is synthesized by tailoring the regular 2D mesh topology to achieve energy efficient communication.  1clearly shows that dynamic communication energy linearly relates to the number of routers/hops the packet traverses from the source tile to the destination tile i.e. as the routers in communication path increases, the communication energy also increases. In the regular grid like NoC architecture, there is lack of fast path between two distant nodes so the packets have to traverse many hops between any two remotely situated nodes (source and destination), leading to poor dynamic communication energy dissipation. If we can able to reduce the number of hops in the communication path for a chosen routing function, the considerable amount of energy saving can be achieved in the regular NoC architecture. The reduction in the number of hops in communication path can be done by augmenting the regular mesh by inducing significant shortcut links according to the application characteristics and also constraining parameters such as length of the shortcut link and the port availability per tile. The length of the shortcut link is constrained because in NoC communication generally needs to be clock synchronized and to avoid the long wiring complexity and area overhead short links are generally preferred. The port availability per tile means add the shortcut link to a tile only if there free port is available. The augmentation of regular topology can be done by adding the significant shortcut links to the communication path segments which are carrying maximum traffic load for all the communication/routing paths for a routing function while taking the above mentioned constraints into consideration. In the proposed methodology these significant shortcut links are find out with the help of modified KMP string matching algorithm [12].

Figure 1 gives architecture of the proposed methodology. The proposed methodology augments the regular 2D mesh topology which is customized using any of the deterministic routing function. In the proposed methodology the routing paths (decided according to chosen deterministic routing function) are input to the modified KMP [12] string matching algorithm. Here this algorithm actually dealing with the routing paths as a string. This algorithm find out the path segments of 3 consecutive and 2 consecutive links (in other words of 4 hops and 3 hops respectively) which are common in other routing paths and also gathers non common path segments. In the next step the proposed methodology arranges all these segments (common sub paths of 4 & 3 hops and non-common segments) in the descending order of total communication energy consumed by these segments. This ordering has been done so that the segments with higher dynamic communication energy consumption will be considered earlier or in other words the segments which carry the maximum traffic load can be addressed earlier, so that maximum benefits can be achieved. After that, it will add the shortcut link to the routing paths which contains that segment in the regular 2D mesh topology according to the order by keeping in check the input constraints (port availability per tile). Here adding the shortcut link to segment of 3 consecutive links or to segment of 2 consecutive links means skipping 2 hops or 1 hop in the corresponding routing path respectively. Hence by doing this we are constraining the length of additional link (shortcut link) in the augmented topology. This augmented topology is referred as shortcut inclusive 2D mesh (SI-2D mesh). For the analysis of various performance parameters SI-2D mesh is validated on NoC simulator NC-G-SIM [20].


# IV.


# Experimental Results

The performance of SI-2D mesh derived from the regular 2D mesh is validated on a network on chip simulator NC-G-SIM. It is a discrete event cycle accurate simulator for Network-on-Chip performance simulation [20]. NC-G-SIM supports regular as well as irregular topology framework with source and table based routing. E Lbit is calculated according to the analytical energy model presented in [8] [9]; E rbit for a router is calculated the help of power simulator Orion [10] for 0.18?m t echnology. The c ommunication is as sumed t o be of constant bit rate. Number of application (traffic) characteristics data sets were randomly generated using TGFF [11], with diverse bandwidth requirement of the IP Cores and randomly generated communication bandwidth requirement. The simulation is run for 10000 clock cycles. The average total communication energy per flit received at destination and average per flit latency are taken as performance metric. The methodology is applied to 6 set of different 2D mesh topology, for each topology 10 different test cases were generated and executed. After having experimented on number of topology sets, the average of obtained results is represented to demonstrate the effectiveness of the proposed methodology. The configuration files for a given traffic pattern are generated using the proposed methodology and supplied to the simulator as an input. As shown in Figure 2, the short cut inclusive 2D mesh (SI-2D mesh) generated using the proposed methodology saves about on average 32.79% of dynamic communication energy consumption compared to regular 2D mesh because SI-2D mesh needs less number of hops to traverse from the source tile to the destination tile, leading to reduced dynamic communication energy and latency. Figure 3 shows comparison in terms of average per flit latency between 2D mesh and SI-2D mesh for various topologies and it depicts that the average latency per flit gets reduced by on average of 16.22% in SI-2D mesh compared to the regular 2D mesh . It is clear that we are getting marginal reduction in latency as compared to the dynamic communication energy; the reason behind this is the occurrence of the congestion due to the generation of hotspot at the induced routers which are heavily loaded as they are bearing the traffic load which was earlier getting spread over multiple routers. However still there is gain in latency as the switching delay in path from source to destination for various application characteristics is reduced due to decrease in number of routers in the corresponding paths.

V.


# Conclusion

In this paper, a greedy energy conscious topology augmentation methodology for on-chip interconnection networks is proposed. Using the proposed methodology a shortcut inclusive topology is designed according to the traffic characteristics. The proposed methodology is greedy but fast and uses fast modified KMP string matching algorithm [12] for exploring the effective and efficient shortcuts to be augmented in the given regular architecture. The experimental result clearly shows that shortcut inclusive topology generated using proposed methodology is able to reduce dynamic communication energy leading to significant energy savings and also reduction in the average per flit latency as compared to the regular topology.
1![Figure 1 : Architecture of Proposed Methodology Equation1clearly shows that dynamic communication energy linearly relates to the number of routers/hops the packet traverses from the source tile to the destination tile i.e. as the routers in communication path increases, the communication energy also increases. In the regular grid like NoC architecture, there is lack of fast path between two distant nodes so the packets have to traverse many hops between any two remotely situated nodes (source and destination), leading to poor dynamic communication energy dissipation. If we can able to reduce the number of hops in the communication path for a chosen routing function, the considerable amount of energy saving can be achieved in the regular NoC architecture. The reduction in the number of hops in communication path can be done by augmenting the regular mesh by inducing significant shortcut links according to the application characteristics and also constraining parameters such as length of the shortcut link and the port availability per tile. The length of the shortcut link is constrained because in NoC communication generally needs to be clock synchronized and to avoid the long wiring complexity and area overhead short links are generally preferred. The port availability per tile means add the shortcut link to a tile only if there free port is available. The augmentation of regular topology can be done by adding the significant shortcut links to the communication path segments which are carrying maximum traffic load for all the communication/routing paths for a routing function while taking the above](image-2.png "Figure 1 :")
2![Figure 2 : Average total dynamic communication energy per flit comparative results between 2D mesh and SI-2D mesh topology](image-3.png "Figure 2 :")
3![Figure 3 : Average per flit latency comparative results between 2D mesh and SI-2D mesh](image-4.png "Figure 3 :")
		
		
* 
	
		Route packets, Not wires: On-chip interconnection networks
		
			WJDally
		
		
			BTowles
		
	
		Design Automation Conference, Proceedings
				
			IEEE
			2001
			
		
* 
	
		Energy, Throughput and Area Evaluation of Regular and Irregular Network on Chip Architectures
		
			SUmamaheswari
		
		
			PRajapaul
		
		
			J
		
	
		Journal of Distributed and Parallel Systems
		
			2
			5
			2011
		
	
	IJDPS


* 
	
		Floorplanning and Topology Generation for Application-Specific Network-on-Chip
		
			YuBei
		
		
			DongSheqin
		
	
		Design Automation Conference (ASP-DAC), 2010 15th Asia and South Pacific
				
			IEEE
			2010
			
		
* 
	
		Performance of data networks with random links
		
			HFuks
		
		
			ALawniczak
		
	
		Mathematics and Computers in Simulation
		
			51
			1999
		
	
* 
	
		
			YOgras Umit
		
		
			RMarculescu
		
		
			2005
		
	
* 
	
		Application Specific Network-on-Chip Architecture Customization via Long Range Link Insertion
		
	
* 
	
		It's a Small World After All: NoC Performance Optimization via Long Range Link Insertion
		
			YOgras Umit
		
		
			RMarculescu
		
	
		IEEE
		
			14
			7
			2006
		
	
* 
	
		Energy and performance-aware mapping for regular NoC architectures
		
			JHu
		
		
			RMarculescu
		
	
		Computer-Aided Design of Integrated Circuits and Systems
		
			24
			
			2005
		
	
* 
	
		and South Pacific Design Automation Conference
				
			
* 
	
		GA based congestion aware topology generation for application specific NoC
		
			NChoudhary
		
		
			MSGaur
		
		
			VLaxmi
		
		
			VSingh
		
	
		Sixth IEEE International Symposium Electronic Design, Test and Application (DELTA)
				
			2011
			
		
* 
	
		
			ABKahng
		
		
			BLi
		
		
			LSPeh
		
		
			KSamadi
		
		
			2009
		
	
* 
	
		Orion 2.0: A fast and accurate NoC power and area model for early-stage design space exploration
	
	
		Proceedings of the conference on Design, Automation and Test in Europe
				the conference on Design, Automation and Test in Europe
		
			
* 
	
		
			RPDick
		
		
			DLRhodes
		
		
			WWolf
		
		
			1998
		
	
* 
	
		TGFF: task graphs for free
	
	
		IEEE Computer Society, Proceedings of the 6th international workshop on Hardware/software co-design
				
			
* 
	
		Introduction to Algorithms
		
			HThomas
		
		
			CharlesECormen
		
		
			RonaldLLeiserson
		
		
			CliffordRivest
		
		
			Stein
		
		
			2009
			The MIT Press
			Cambridge, Massachusetts London, England
		
	
	3 rd edition


* 
	
		Energy consumption in networks on chip: efficiency and scaling
		
			BGeorge
		
	
		IEEE 15 th International Symposium
				
			2009
			
		
* 
	
		Evaluation of Pseudo Adaptive XY Routing Using an Object Oriented Model for NOC
		
			MDehyadgari
		
		
			MNickray
		
		
			AKusha
		
		
			ZNavabi
		
		
			2005
		
	
* 
	
		Interconnection Networks: An Engineering Approach
		
			JDuato
		
		
			SYalamanchili
		
		
			LNi
		
		
			2003
			Morgan Kaufmann Publishers
			San Francisco
		
	
* 
	
		Performance evaluation of different routing algorithm in network on chip master dissertation National Institute of Technology Rourkella
		
			JKSingh
		
		
			Odesha
		
	
* 
	
		
			GAdamu
		
		
			PChejara
		
		
			AGarko
		
		
			B
		
		Review of deterministic routing algorithm for network-onchip. International conference on science, technology and management
				
			2015
		
	
* 
	
		Deadlock Free Routing Algorithms for Mesh Topology NoC Systems with Regions
		
			RHolsmark
		
		
			MPalesi
		
		
			SKumar
		
	
		Digital System Design: Architectures, Methods and Tools, 9th EUROMICRO Conference
				Dubrovnik
		
			2006
			
		
* 
	
		Network-on-Chip: A New SoC Communication Infrastructure Paradigm
		
			NChoudhary
		
	
		International Journal of Soft Computing and Engineering (IJSCE)
		2231-2307
		
			1
			
			2012
		
	
* 
	
		NC-G-SIM: A Parameterized Generic Simulator for 2D-Mesh, 3D Mesh & Irregular On-chip Networks with Table-based Routing
		
			KVyas
		
		
			NChoudhary
		
		
			DSingh
		
	
		Global Journal of Computer Science and Technology (GJCST-E) on Network, Web & Security
		
			13
			
			2013
		
	
* 
	
		Energy Efficient Mapping in 3D Mesh Communication Architecture for NoC
		
			PWadhwani
		
		
			NChaudhary
		
		
			DSingh
		
	
		Global Journal of Computer Science, and Technology (GJCST-E) on Network, Web & Security
		
			13
			
			2013
		
	
* 
	
		Energy-aware mapping for tile-based NoC architectures under performance constraints
		
			JHu
		
		
			RMarculescu
		
	
		Proceedings of the Asia
				the Asia
		
			2003