Kia's group

Current / Past Research

Last updated: Jun 29, 2005


[2005] 3D Mixed-Signal Chip (more details)
[Figure: 3D view of the chip]
In collaboration with Prof. Harjani's group, we designed a mixed-signal chip based on MIT Lincoln Lab's 3D IC technology. Our chip has three active device layers, as shown in the figure (click on the image for a more detailed view). Tiers A and B contain digital logic, and Tier C has a mixture of digital and analog blocks. (A tier consists of one active device layer and three metal layers.) The digital logic is a regular fabric of programmable multiply-accumulate (MAC) units, plus ring oscillators and PN junctions for temperature sensing. The analog part contains two LNA circuits, one of which is shielded using a Faraday cage, while the other is isolated through traditional 2D techniques.

The chip was taped out in early April 2005. The goals are to study the performance benefits of the 3D technology (vias connect logic between Tiers A, B and C), to study the effect of task scheduling on the thermal profile of the chip, and to observe the effectiveness of the isolation opportunities that the 3D technology provides for the analog circuits.
More details
Related Publications 

[2005] HARP: HArd-wired Routing Pattern FPGAs  (more details)
[Figure: HARP conceptual figure]
Traditionally, FPGA architects provide users with ample routing resources so that a large range of circuit families can be mapped to the FPGA device. Such great flexibility comes at a high price in terms of area, power and delay. An alternative approach is CMU's VPGAs. However, via programmability offers neither in-system programmability nor very low non-recurring engineering costs. Our solution is to provide an architecture that takes away some of the flexible switches of an FPGA and replaces them with hard-wired patterns (see the figure on the right). We have implemented a placement and routing tool on top of Steve Wilton's power model tool. Our modified tool can perform placement and routing for HARP FPGAs and produce power consumption estimates for the architecture. We placed and routed circuits with HARP patterns embedded in the switch boxes and observed savings of about 25%, 31% and 5% in circuit delay, energy and area, respectively.

More details
Related Publications 
Download Tool

[2004-2005] TPR: Three-Dimensional Placement and Routing for FPGAs  (more details)
Three-dimensional chips are becoming a viable option. Wafer-stacked or 3D-grown dies create multiple active layers with tight interconnects. We have modified the VPR tool to perform placement and routing for the 3D FPGAs of the future. The tool can be modified to represent various architectural options (e.g., 3D switchbox configuration). It uses efficient heuristics, such as partitioning the design between layers and generating a high-quality initial solution through a linear placement optimization that rearranges the layers for optimal cutsize and wire length.
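
The layer-rearrangement idea can be illustrated with a minimal Python sketch. It is not the TPR implementation: the cost function (the total number of layers each net spans), the exhaustive search over layer orders, and all names and data below are assumptions chosen for illustration. Given a design already partitioned into groups, it looks for the stacking order that minimizes the vertical span of the nets.

```python
from itertools import permutations

def vertical_span(nets, layer_of):
    """Sum over all nets of (highest layer - lowest layer) touched by the net."""
    return sum(
        max(layer_of[b] for b in net) - min(layer_of[b] for b in net)
        for net in nets
    )

def best_layer_order(partitions, nets):
    """Try every stacking order of the partitions (feasible only for a few layers).

    partitions: list of sets of block names, one set per partition
    nets: list of lists of block names
    Returns (cost, order), where order[layer] is the partition put on that layer.
    """
    best = None
    for order in permutations(range(len(partitions))):
        layer_of = {
            block: layer
            for layer, part in enumerate(order)
            for block in partitions[part]
        }
        cost = vertical_span(nets, layer_of)
        if best is None or cost < best[0]:
            best = (cost, order)
    return best

# Hypothetical example: three partitions, four two-pin nets.
partitions = [{"a", "b"}, {"c"}, {"d", "e"}]
nets = [["a", "d"], ["b", "e"], ["b", "d"], ["c", "e"]]
cost, order = best_layer_order(partitions, nets)
print(cost, order)
```

In this example the first and third partitions share three nets, so the search places them on adjacent layers. A real tool would need a faster heuristic than enumerating all orders once the number of layers grows.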

More details
Related Publications 
Download Tool

[2003] PPFF: Partitioning-Based Placement For FPGAs  (more details)
We developed a partitioning-based placement tool for FPGAs which is about 4 times faster than TVPR (the annealing-based, de facto academic / research FPGA placement and routing tool for two decades), with the same quality. The tool first profiles the routing behavior by analyzing routing resource usage after detailed routing of selected circuits. It employs an effective "terminal alignment" heuristic during top-down partitioning to improve wire length and delay. The main novelty of this work is that it breaks with the traditional indirect quality metric of wire length and instead models the delay as a function of the number of switches used in an FPGA route.

More details
Related Publications 
Download Tool

[2002] Design of a Hard-Disk Read Channel Simulator on an FPGA (details)
We implemented a hard-disk read channel simulator that models various noise sources. The design was implemented on a Xilinx Virtex 1000E chip and occupied 99% of the chip area. The generator simulates in hardware the noise processes and distortions observed in hard drives. It uses embedded nonuniform random number generators to simulate the random characteristics of various disturbances in the read/write process. The implementation operates at a 70-MHz clock speed, but even at such a low clock speed it is approximately three orders of magnitude faster than the software implementation. The generator can be reconfigured in real time to give the user flexibility and increase the capacity of the FPGA device. The readback-signal generator can be integrated into an FPGA read channel simulator or serve as a test bench for data-recovery circuits.

[2000] Simultaneous Static Scheduling and Placement in Reconfigurable Computing Systems (RCS) (details)
We have proposed a greedy scheduling method for statically reconfigurable systems. The method first estimates how useful each module in the IP library is, and adds the one with the maximum gain. A look-ahead ordering of the data flow graph (DFG) nodes is used with a list scheduling algorithm to schedule the operations on the selected resources. A placement is built simultaneously to ensure that the allocated resources fit on the reconfigurable functional unit (RFU). We used Xilinx's Virtex Coregen IP modules for our experiments, and tested our algorithms within Xilinx's commercial tool flow (i.e., we added one extra step to Xilinx's placement and routing flow) to ensure that our optimization process can cater to the complexities of commercial FPGA architectures.
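
For readers unfamiliar with list scheduling, here is a minimal Python sketch of the generic, resource-constrained version of the technique. It is not our scheduler: the priority function (longest latency-weighted path to a sink), the example DFG, the latencies and the resource counts are all assumptions, and neither the look-ahead ordering nor the simultaneous placement step is modeled.

```python
from collections import defaultdict

def list_schedule(dfg, op_type, latency, resources):
    """Generic resource-constrained list scheduling on an acyclic DFG.

    dfg: node -> list of successor nodes (sinks map to an empty list)
    op_type: node -> resource type
    latency: resource type -> execution time in cycles
    resources: resource type -> number of available units (at least 1 each)
    Returns node -> start cycle.
    """
    if any(resources[op_type[n]] < 1 for n in dfg):
        raise ValueError("every used resource type needs at least one unit")

    preds = defaultdict(list)
    for u, succs in dfg.items():
        for v in succs:
            preds[v].append(u)

    # Priority of a node: longest latency-weighted path from it to any sink.
    priority = {}
    def path_len(n):
        if n not in priority:
            tail = max((path_len(s) for s in dfg[n]), default=0)
            priority[n] = latency[op_type[n]] + tail
        return priority[n]
    for n in dfg:
        path_len(n)

    start, finish = {}, {}
    cycle = 0
    while len(start) < len(dfg):
        # Count the units still occupied in this cycle.
        busy = defaultdict(int)
        for n, s in start.items():
            if s <= cycle < finish[n]:
                busy[op_type[n]] += 1
        # A node is ready once all of its predecessors have finished.
        ready = [
            n for n in dfg
            if n not in start
            and all(p in finish and finish[p] <= cycle for p in preds[n])
        ]
        # Highest priority first; ties broken by node name.
        for n in sorted(ready, key=lambda n: (-priority[n], n)):
            t = op_type[n]
            if busy[t] < resources[t]:
                start[n] = cycle
                finish[n] = cycle + latency[t]
                busy[t] += 1
        cycle += 1
    return start

# Hypothetical DFG: three multiplications feeding two additions.
dfg = {"m1": ["a1"], "m2": ["a1"], "m3": ["a2"], "a1": ["a2"], "a2": []}
op_type = {"m1": "mul", "m2": "mul", "m3": "mul", "a1": "add", "a2": "add"}
latency = {"mul": 2, "add": 1}
resources = {"mul": 1, "add": 1}
schedule = list_schedule(dfg, op_type, latency, resources)
print(schedule)
```

With a single multiplier, the three multiplications are serialized in priority order, so the final addition cannot start before cycle 6.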

More details
Related Publications 
 
[1999-2000] Fast Placement Methods for Reconfigurable Computing Systems (details)
One of the serious hurdles in realizing efficient reconfigurable computing systems (RCS) is the placement and routing of the RFUOPs (reconfigurable functional unit operations) on the RFU. Physical design CAD tools for RCS are virtually non-existent. In this project, we try to find fast methods for placing RFUOP modules on the RFU as compactly as possible.
 
Online Placement (details)
If the flow of the program cannot be predicted accurately at compilation time, then we need a runtime support system to manage the RFU configuration. We have proposed algorithms for the placement engine of such a system. We have devised two categories of placement methods: one that gives high quality but is slower, and another that is much faster with little quality degradation.
 
Offline 3D Placement (details)
If the flow of a program can be accurately predicted at compile time, there are many opportunities for optimizing the configuration of the RFU. By knowing the order in which RFUOPs start and finish operation, we can explore different subsets of RFUOPs to be placed on the RFU, as well as their locations.

In this project, we have tried annealing and greedy methods for offline placement. We have proposed a new method which is much faster and generates better placements. 
 
Using Firm Templates in 3D Placement (details)
We have observed that keeping more than one shape in the library for each module considerably improves placement quality. If a module cannot be placed with its first shape, the other shapes of the same module in the library are tried, and whichever shape can be placed is chosen.
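
The shape fallback described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the RFU is modeled as a 2D occupancy grid (the time dimension of the 3D placement is ignored), positions are scanned in row-major order with a first-fit policy, and the function names and module shapes are hypothetical.

```python
def fits(grid, x, y, w, h):
    """True if a w-by-h rectangle at column x, row y is inside the grid and free."""
    rows, cols = len(grid), len(grid[0])
    if x + w > cols or y + h > rows:
        return False
    return all(not grid[r][c] for r in range(y, y + h) for c in range(x, x + w))

def place_module(grid, shapes):
    """Try each (width, height) shape in library order, first fit per shape.

    Marks the chosen cells as occupied and returns (x, y, w, h),
    or returns None if no shape fits anywhere.
    """
    for w, h in shapes:
        for y in range(len(grid)):
            for x in range(len(grid[0])):
                if fits(grid, x, y, w, h):
                    for r in range(y, y + h):
                        for c in range(x, x + w):
                            grid[r][c] = True
                    return (x, y, w, h)
    return None

# Hypothetical 4x4 RFU and three modules.
rfu = [[False] * 4 for _ in range(4)]
a = place_module(rfu, [(4, 2)])          # a single shape
b = place_module(rfu, [(2, 3), (3, 2)])  # first shape is too tall, second fits
c = place_module(rfu, [(2, 2)])          # no room left for this one
print(a, b, c)
```

Module b shows the benefit: its first shape needs three free rows and fails, while its alternative shape fits in the two rows that remain.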

More details
Related Publications
Presentation Slides 
 

[1998-1999] Floorplanning Under Uncertainty for ASIC Designs (details)
The design cycle of complex VLSI circuits, such as processors, can be as long as 4-5 years. Some of the time-consuming stages of design, such as clock tree routing and heat dissipation estimation, cannot start before a floorplan is given. On the other hand, floorplanning cannot be done until all the modules in the system are designed (floorplanning needs shape, connectivity, pin location, etc. for each module).

The fact that designers reuse their previous designs extensively can help us compile a list of probabilistic information on physical properties of the modules in the new design. For example, suppose we had used a filter in our previous design with area A, and we are going to use a more complex filter in the new project. We might be able to estimate the area of the new filter as 1.2A with 70% probability and 1.3A with 30% probability.
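
To make the filter example concrete, the expected area is 0.7 x 1.2A + 0.3 x 1.3A = 1.23A. The short Python sketch below computes the mean and standard deviation of such a discrete area distribution. The function name and the value A = 100 are assumptions for illustration, and this is not how Nostradamus represents its data (it works with width/height distributions).

```python
def area_stats(base_area, outcomes):
    """Mean and standard deviation of a discrete area distribution.

    base_area: area A of the module in the previous design
    outcomes: list of (scale_factor, probability) pairs; probabilities sum to 1
    """
    total_p = sum(p for _, p in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    mean = sum(p * scale * base_area for scale, p in outcomes)
    variance = sum(p * (scale * base_area - mean) ** 2 for scale, p in outcomes)
    return mean, variance ** 0.5

# The filter example from the text, with a hypothetical A = 100 area units.
mean, std = area_stats(100.0, [(1.2, 0.7), (1.3, 0.3)])
print(f"expected area: {mean:.1f}, standard deviation: {std:.2f}")
```

A floorplanner that only used the mean would reserve 1.23A for the filter; the spread indicates how much slack is needed to tolerate the less likely outcome.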

Based on the above idea, we have proposed a floorplanner called Nostradamus, which takes a list of probabilistic width/height data for each module, and tries to generate a floorplan which is tolerant to changes in the library.

More details
Related Publications
Presentation Slides 
 
[1999-2001] Flexibility-Based Partitioning (details)
We propose a top-down, partitioning-based floorplanning method that tries to balance the flexibility of the modules, as well as their area, across partitions while minimizing the cut cost. A flexible module is one that can take many shapes. We define some heuristic functions to quantify the flexibility of the modules. Distributing flexibility across partitions makes the sizing problem almost trivial. The quality of our method is comparable to that of the annealing method (5% worse on average), and it is about 1000 times faster.

More details
Related Publications
