
A flexible tree-based platform for design space exploration of hierarchical FPGA architectures

Michiel De Wilde, Joni Dambre and Dirk Stroobandt

Abstract: For many years, research on FPGA-type programmable hardware architectures has focused mainly on optimising regular non-hierarchical architectures. In the exploration of their design space, some design parameters have a significant impact on the layout area, which is directly related to interconnect delay. An estimation of this impact can be derived from a prediction of the area of the basic FPGA building blocks. For non-hierarchical FPGA architectures, reliable area predictions are easily achieved. However, recent FPGAs are becoming ever more complex: the architectures are often hierarchical and contain embedded higher-level components. Here, a priori area estimation is much more complicated. In this paper, we present the development of a generic tree representation for hierarchical FPGA architecture layouts. This representation can be used to derive reliable area estimations. Integrated in a partial design flow, this provides a framework for the exploration of different architectural parameters as well as layout options.

Keywords: hierarchical FPGA architectures, architecture modelling, design space exploration

Michiel De Wilde is a Research Assistant of the Fund for Scientific Research – Flanders (Belgium) (F.W.O.).

I. Introduction

Field programmable gate arrays (FPGAs) are chips that can be programmed in situ to structurally emulate a digital circuit. For many years, FPGA research has focused mainly on optimising non-hierarchical island-style or row-based architectures. These architectures consist of a regular two-dimensional array of functional blocks (figure 1a), each containing some LUTs (lookup tables, which realise a logic function in hardware) and flip-flops (memory elements). In island-style FPGAs, horizontal and vertical programmable interconnect lines are available between tiles, while in row-based FPGAs, basic tiles are organised in rows separated by programmable interconnect.

For these very repetitive types of FPGA architectures, exploration of design parameters (such as the number of LUTs, flip-flops or interconnect resources) is relatively simple, since the number of different options is limited. The most common approach is to freeze the basic tile design and vary each parameter separately. For each alternative, some typical applications are implemented in the FPGA architecture. The architectural variants are then compared based on average results for delay, LUT utilisation and routing resource utilisation. The first is a performance measure, while the last is an indication of area efficiency. When routing resource utilisation is low, the number of interconnects can be reduced, resulting in a decrease of the total chip area. Since layout area affects wire capacitances, it also has a direct impact on interconnect delay. An estimation of this impact can be derived from a prediction of the basic tile area. For standard non-hierarchical FPGA architectures, this can be achieved simply by summing the area estimates of all components in a basic tile and the routing resources.

Fig. 1 (a) Schematic representation of a (very small) island-style FPGA architecture and (b) Excess area inside a step-repeated tile of a hierarchical FPGA architecture layout
[Figure labels: configurable interconnections, functional block, I/O blocks, embedded memory block, excess area.]

However, FPGAs are becoming ever more complex. State-of-the-art architectures are often hierarchical and contain embedded higher-level components, such as the embedded memory blocks and multipliers available in commercial FPGAs [1], [2]. In such hierarchical architectures, a priori area estimation is much more complicated. For instance, figure 1b represents a single large square tile, which is step-repeated to create a full architecture. If the area needed to implement the smaller basic tile (including connection resources) is less than a quarter of that of an embedded memory block, some excess area will be allocated to each of the smaller tiles so that four of them fit exactly in the spaces between the memory blocks.

We have established a generic tree representation for hierarchical FPGA architecture layouts. This representation can be translated to a standard geometric programming formulation to derive area efficiency estimations. Integrated in a partial design flow, it provides a framework for the exploration of architectural parameters and layout options. In the next section, we briefly describe our tree representation and its place in a partial design flow. In section III, we discuss how different architectural parameters and layout options can be explored. We finish with some conclusions and intentions for future work.

II. Tree representation of hierarchical FPGA architectures

Because non-hierarchical FPGAs are highly regular, their layout can be decomposed into rectangular subregions, corresponding to the basic elements in the FPGA architecture. In hierarchical FPGAs, such a decomposition can generally be done at each hierarchy level, with larger rectangles representing collections of smaller tiles. This decomposition can be represented in a tree structure. The nodes in the tree are called tiles because of their rectangular shape, while the leaf nodes are called basic tiles. Our tree representation also includes the way the different tiles within a hierarchy level are interconnected, as well as their relative positions within the two-dimensional layout plane (figure 2).

Fig. 2 Tree representation of a hierarchical FPGA layout

For each basic tile, an estimate of its minimal layout area has to be provided, and constraints on its aspect ratio can be given (e.g., to aim at similar delays for horizontal and vertical interconnect). However, exact rectangle dimensions are not required. This leaves some flexibility to optimise area by keeping the empty space that has to be introduced to a minimum. To obtain rectangle dimensions that minimise chip area, our tree representation can be translated into a generic geometric programming formulation, for which efficient solution methods exist [3]. These dimensions can then serve as input constraints for the full-custom physical design of the tile hierarchy. In addition, we have allowed several tile properties to be parameterised, to support the exploration of different tile parameters using a single tree topology. In the next section, we explain how this platform can be used to explore and evaluate different design alternatives without the need to go through the entire design cycle.

III. Rapid exploration of FPGA design space

To obtain an area estimate for a given FPGA architecture, the following steps have to be taken (figure 3).

Fig. 3 Flowchart for design space exploration of hierarchical FPGA architectures
[Flowchart boxes: architectural style definition, preliminary floorplanning, parameterised tree representation, choice of design parameters, area estimation, evaluation of results; iteration loops A, B and C.]

First, an architectural style has to be defined. This consists of the architecture’s structural hierarchy and its interconnect pattern. For instance, if large memory blocks are introduced, the first stage fixes their frequency and their number of I/O ports. However, in this stage, neither their exact size nor their I/O bandwidth needs to be fixed.
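As a rough illustration of the tree representation from section II, the following Python sketch stores tiles in a tree and compares the area that the basic tiles need with the area of a simple layout. It is not the authors' implementation. The `Tile` class, the grid placement, all area values and the diagonal arrangement of the memory blocks are assumptions made for this example. The function `uniform_grid_area` is a crude stand-in for the geometric program: it ignores aspect-ratio constraints and interconnect area.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tile:
    """Node of the layout tree; a tile without children is a basic tile."""
    name: str
    min_area: Optional[float] = None   # minimal layout area, basic tiles only
    children: List["Tile"] = field(default_factory=list)
    rows: int = 1                      # children fill a rows x cols grid
    cols: int = 1

    def is_basic(self) -> bool:
        return not self.children


def required_area(tile: Tile) -> float:
    """Sum of the minimal basic-tile areas: a lower bound on the layout area."""
    if tile.is_basic():
        return tile.min_area
    return sum(required_area(child) for child in tile.children)


def uniform_grid_area(tile: Tile) -> float:
    """Area when every grid cell is as large as the largest child needs.

    Crude upper bound that stands in for the geometric program; it ignores
    aspect-ratio constraints and interconnect area.
    """
    if tile.is_basic():
        return tile.min_area
    cell = max(uniform_grid_area(child) for child in tile.children)
    return tile.rows * tile.cols * cell


def cluster() -> Tile:
    """Four small basic tiles (hypothetical area 3.0 each) in a 2 x 2 grid."""
    blocks = [Tile(f"block{i}", min_area=3.0) for i in range(4)]
    return Tile("cluster", children=blocks, rows=2, cols=2)


# Step-repeated tile loosely modelled on figure 1b: memory blocks on one
# diagonal, clusters of four small tiles on the other (assumed arrangement).
top = Tile(
    "top",
    children=[Tile("mem0", min_area=16.0), cluster(),
              cluster(), Tile("mem1", min_area=16.0)],
    rows=2, cols=2,
)

needed = required_area(top)        # 56.0
layout = uniform_grid_area(top)    # 64.0
excess = layout - needed           # 8.0 of empty space
efficiency = needed / layout       # 0.875
print(needed, layout, excess, efficiency)
```

With these made-up numbers, each cluster needs less area than a memory block, so every cluster cell carries some empty space, which mirrors the excess area discussed for figure 1b.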
Second, the relative position of the different tiles (e.g., memory block versus array of functional blocks) on the layout medium has to be decided (floorplanning). After these two stages, a parameterised tree representation of the architecture can be derived. Before an area estimate can be obtained, the final details have to be filled in to fix minimal area estimates for the basic tiles and the interconnect width at tile I/O ports.

For exploring design alternatives such as the size of the memory blocks, the complexity of the functional blocks or the number of interconnect resources, only lowest-level iterations in this flow are needed (loop C in figure 3). Alternatives that result in different hierarchical structures require global iterations (loop A). Finally, intermediate iterations (loop B) can be used to explore and optimise the relative tile positions within a hierarchy level (global layout optimisation).

Whatever the option explored, each iteration leads to a set of rectangle dimensions. These immediately result in an estimate of the chip area and of the area efficiency (the amount of empty space that is introduced). Furthermore, they also lead to wire length estimations for inter-tile interconnections as well as maximal lengths for wires within a tile. These can be used either in a quick average delay estimator or in a timing-driven tool for implementing test applications. If a performance measure can be attached to a given design parameter, this exploration flow can also produce Pareto curves to explore area-performance trade-offs.

IV. Conclusions and future work

We have presented the development of a generic tree representation for hierarchical FPGA architecture layouts, modelling the interconnections between different hierarchical FPGA layout constituents and their relative position. We have indicated that our representation can be translated into a feasible optimisation problem to derive FPGA layout area and wire length estimations. This prediction system can be integrated in a partial design flow, providing a framework for the exploration of different architectural parameters as well as layout options. In the near future, we will develop an interactive application implementing our models and predictions, to become part of a forthcoming tool chain for hierarchical FPGA design space exploration.
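To illustrate the area-performance trade-off exploration mentioned in section III, the following Python sketch filters a set of design alternatives down to its Pareto front. It is an illustration only. The `pareto_front` helper and the (area, delay) values are made up for this example and do not come from the paper. Both quantities are assumed to be minimised.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # (chip area, delay), both to be minimised


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that are not dominated by any other point.

    A point is dominated when another point is no worse in both area and
    delay and strictly better in at least one of them.
    """
    front: List[Point] = []
    for point in sorted(set(points)):            # ascending area, then delay
        # Keep the point only if it improves on the best delay seen so far.
        if not front or point[1] < front[-1][1]:
            front.append(point)
    return front


# Made-up (area, delay) results, one per design alternative from loop C.
candidates = [(64.0, 9.0), (70.0, 7.5), (72.0, 8.0), (90.0, 7.5), (100.0, 6.0)]
front = pareto_front(candidates)
print(front)   # [(64.0, 9.0), (70.0, 7.5), (100.0, 6.0)]
```

Sorting by area first means a single pass suffices: a later point can only stay on the front if its delay is strictly lower than that of every point with a smaller or equal area.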
References

[1] Xilinx Inc., “Virtex-II platform FPGA data sheet,” 2001.
[2] Altera Corporation, “APEX II programmable logic device family data sheets,” 2001.
[3] J. Rajgopal and D.L. Bricker, “An algorithm for posynomial geometric programming, based on generalized linear programming,” Tech. Rep. 95-10, Department of Industrial Engineering, University of Pittsburgh, 1995.