Architecture

SMART Crossbar

The SMART crossbar is the primary building block in a SMART NoC that enables straight and turning paths within the network.

The idea is to insert a crossbar between the Rx and Tx components of each repeater.

Data sent on the link is first converted to full-swing (Rx), traverses the full-swing crossbar, and is then converted back to low-swing (Tx) and forwarded to the next hop.

Router Microarchitecture

The SMART router has three primary components:

Buffer Write enable (BW_ena): determines whether the input signal is latched or not

Bypass Mux select (BM_sel): chooses between the local buffered flit and the bypassing flit on the link

Crossbar select (XB_sel): selects which input is connected to each output of the crossbar
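
To make the roles of BW_ena and BM_sel concrete, here is a minimal behavioral sketch of one input port for one cycle. The function name `input_port_cycle`, the string values "bypass" and "local", and the use of a deque as the input buffer are assumptions for illustration; XB_sel is not modeled, and this is not the actual circuit.

```python
from collections import deque

def input_port_cycle(link_flit, buffer, bw_ena, bm_sel):
    """Model one cycle of a single input port (hypothetical helper).

    link_flit: flit arriving on the incoming link, or None
    buffer:    deque of locally buffered flits
    bw_ena:    if True, the incoming flit is latched into the buffer
    bm_sel:    "bypass" forwards the link flit, "local" forwards a buffered flit
    Returns the flit presented to the crossbar this cycle (or None).
    """
    # Bypass mux: pick what goes to the crossbar before latching anything,
    # so a flit latched this cycle cannot also leave this cycle.
    if bm_sel == "bypass":
        to_crossbar = link_flit
    else:
        to_crossbar = buffer.popleft() if buffer else None

    # Buffer write enable: latch the incoming flit if asked to.
    if bw_ena and link_flit is not None:
        buffer.append(link_flit)
    return to_crossbar

# A bypassing flit goes straight through and is not stored.
buf = deque()
print(input_port_cycle("C", buf, bw_ena=False, bm_sel="bypass"), list(buf))  # C []

# A stopped flit is latched while a buffered flit is sent instead.
buf = deque(["A"])
print(input_port_cycle("B", buf, bw_ena=True, bm_sel="local"), list(buf))    # A ['B']
```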

Routing

Since the routes are static, we adopt source routing and encode the route in 2 bits for each router.

At the source router, the 2 bits correspond to the East, South, West and North output ports, while at all other routers, the bits correspond to Left, Right, Straight and Core.

The directions Left, Right and Straight are relative to the input port of the flit.
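
The encoding above can be sketched as a small decoder. The text does not say which 2-bit value maps to which direction, so the code order in `SOURCE_PORTS` and `RELATIVE_TURNS` is an assumption, as are all function names. The sketch also assumes Left and Right are taken from the point of view of the flit's direction of travel (a flit entering through the West input port is travelling East).

```python
SOURCE_PORTS = ("East", "South", "West", "North")       # assumed order for codes 0..3
RELATIVE_TURNS = ("Left", "Right", "Straight", "Core")  # assumed order for codes 0..3

# Compass headings in clockwise order, used to resolve relative turns.
CLOCKWISE = ("North", "East", "South", "West")

def resolve_turn(heading, turn):
    """Map a relative turn to an absolute output port, given the travel heading."""
    i = CLOCKWISE.index(heading)
    if turn == "Straight":
        return heading
    if turn == "Right":
        return CLOCKWISE[(i + 1) % 4]
    if turn == "Left":
        return CLOCKWISE[(i - 1) % 4]
    return "Core"  # eject to the local core

def decode_route(codes):
    """Expand a list of 2-bit codes (one per router) into output ports."""
    heading = SOURCE_PORTS[codes[0]]  # the first code is absolute
    ports = [heading]
    for code in codes[1:]:
        out = resolve_turn(heading, RELATIVE_TURNS[code])
        ports.append(out)
        if out == "Core":
            break  # the flit has reached its destination
        heading = out
    return ports

# East out of the source, straight, turn right (now heading South), then eject.
print(decode_route([0, 2, 1, 3]))  # ['East', 'East', 'South', 'Core']
```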

In this work, we avoid network deadlocks by enforcing a deadlock-free turn model across the routes for all flows.

Flow control

A router needs to keep track of free VCs at the endpoint of an arbitrary SMART route, though it does not know the SMART route till runtime.

We solve this problem by using a reverse credit mesh network, similar to the forward data mesh network that delivers flits.

The only overhead of the credit mesh network is a [log2(# VCs) + 1 (valid)]-bit SMART crossbar added at each router.

For example, with 2 VCs, the credit network adds 2-bit-wide crossbars. If a forward route is preset, the reverse credit route is preset as well.
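
The width of the credit crossbar follows directly from the number of VCs. A small sketch, where the function name is hypothetical and rounding the VC-id field up for VC counts that are not powers of two is an assumption (the text only gives the 2-VC case):

```python
def credit_crossbar_width(num_vcs):
    """Bits per credit: a VC id field plus one valid bit."""
    if num_vcs < 1:
        raise ValueError("num_vcs must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float rounding.
    vc_id_bits = (num_vcs - 1).bit_length()
    return vc_id_bits + 1

print(credit_crossbar_width(2))  # 2, matching the example in the text
print(credit_crossbar_width(4))  # 3
```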

A credit that traverses multiple hops does not enter the intermediate routers; it goes directly to the SMART crossbar, which redirects it in the correct direction.

Low-swing signaling

In general, the low-swing technique can lower energy consumption and propagation delay at the cost of a reduced noise margin.

The heart of our SMART NoC is a novel low-swing, clockless repeated link circuit (asynchronous repeaters built from a pair of inverters) embedded within the router crossbars. It allows packets to potentially bypass all the way from the source core to the destination core within a single clock cycle, without being latched at any intermediate router.

This is done by replacing clocked link drivers with asynchronous repeaters at every hop.

HPC_max

The maximum number of bypass hops, or maximum hops-per-cycle (HPC_max), is a design-time parameter constrained by the clock period of the system, the tile size, and the wire delay of the data links between routers.
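
The notes name the constraints on HPC_max but give no formula. As a rough sketch only, assuming each hop costs a fixed wire delay proportional to the tile size plus a fixed crossbar delay, HPC_max is the number of such hops that fit in one clock period. The function, its parameters, and the numbers in the usage line are illustrative assumptions, not measurements.

```python
import math

def estimate_hpc_max(clock_period_ps, tile_size_mm, wire_delay_ps_per_mm,
                     crossbar_delay_ps=0.0):
    """Estimate hops per cycle under a simple linear delay model (assumption)."""
    per_hop_ps = tile_size_mm * wire_delay_ps_per_mm + crossbar_delay_ps
    if per_hop_ps <= 0:
        raise ValueError("per-hop delay must be positive")
    return math.floor(clock_period_ps / per_hop_ps)

# Illustrative values: 1 GHz clock, 1 mm tiles, 100 ps/mm wire, 25 ps per crossbar.
print(estimate_hpc_max(1000, 1.0, 100, 25))  # 8
```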

SMART router pipeline

SA-L (Switch Allocation Local): every start router chooses a winner for each output port from among its buffered (local) flits.

SSR: SA-L winners broadcast a SMART-hop setup request (SSR) via dedicated repeated wires to routers up to HPC_max hops away; the SSR carries the length (in hops) that the winning flit wishes to travel.

SSR = min(HPC_max, H_remaining)

SA-G: all intermediate routers arbitrate among the SSRs they receive, to set the BW_ena, BM_sel and XB_sel signals.

Arbitration policies:

Prio=Local: Local flits have higher priority than bypass flits, i.e. Priority = 1/(hops_from_start_router).

Prio=Bypass: Bypass flits have higher priority than local flits, i.e. Priority = (hops_from_start_router).

Implementation of SA-G at W_in and E_out

The SA-G SSR priority arbiter arbitrates among the SSRs received along the W->E dimension and chooses the nearest one.

The SA-G output port checks whether there is a request from a local buffered flit. If not, the XB signal is asserted.

At the SA-G input port stage, if no packet is being transmitted, the bypass request is granted.

ST+LT: SA-L winners that also won SA-G at their start routers traverse the crossbar and links over multiple hops until they are stopped by BW_ena at some router.

In summary, a SMART NoC works as follows:

  • Buffered flits at injection/start routers arbitrate locally to choose input/output port winners during SA-L.
  • SA-L winners broadcast SSRs along their chosen routes, and each router arbitrates among these SSRs during SA-G.
  • SA-G winners traverse multiple crossbars and links asynchronously within a cycle, till they are explicitly stopped and buffered at some router along their route.
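
The steps above can be sketched as a toy model of one cycle on a single row of routers, with all flits travelling east. This is a simplified illustration under stated assumptions, not the actual router logic: routers are numbered 0, 1, 2, ... from west to east, each start router has at most one SA-L winner, and a flit competes for the east output of every router it wants to pass through. The names `SSR`, `ssr_length`, `sa_g_winner` and `stop_router` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SSR:
    start: int   # index of the start router in the row
    length: int  # requested hops, i.e. min(HPC_max, H_remaining)

def ssr_length(hpc_max, hops_remaining):
    """Length carried by an SSR, as given in the SSR stage."""
    return min(hpc_max, hops_remaining)

def sa_g_winner(router, ssrs, policy="local"):
    """Return the SSR that wins the east output of `router`, or None."""
    # Only flits that want to travel beyond this router contend for its output.
    contenders = [s for s in ssrs if s.start <= router < s.start + s.length]
    if not contenders:
        return None
    hops_from_start = lambda s: router - s.start
    if policy == "local":   # Prio=Local: the nearest start router wins
        return min(contenders, key=hops_from_start)
    return max(contenders, key=hops_from_start)  # Prio=Bypass: the farthest wins

def stop_router(flit, ssrs, policy="local"):
    """Router at which `flit` is buffered at the end of this cycle."""
    for r in range(flit.start, flit.start + flit.length):
        if sa_g_winner(r, ssrs, policy) != flit:
            return r  # lost arbitration here: stopped by BW_ena (or never left)
    return flit.start + flit.length  # travelled the full requested length

# Router 0 wants 4 hops; router 2 has a local flit that wants 2 hops.
a = SSR(start=0, length=ssr_length(hpc_max=8, hops_remaining=4))
b = SSR(start=2, length=ssr_length(hpc_max=8, hops_remaining=2))
print(stop_router(a, [a, b], "local"), stop_router(b, [a, b], "local"))    # 2 4
print(stop_router(a, [a, b], "bypass"), stop_router(b, [a, b], "bypass"))  # 4 2
```

Under Prio=Local the long-distance flit is stopped at router 2 and must re-arbitrate there in a later cycle, while under Prio=Bypass the local flit at router 2 stays in its buffer for this cycle.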

Original article: https://www.cnblogs.com/cpsmile/p/8324692.html