TCAD Newsletter - Feb 2011 Issue
Placing you one click away from the best new CAD research!


SPECIAL SECTION ON THE 2010 INTERNATIONAL SYMPOSIUM ON PHYSICAL DESIGN
==============

Guest Editorial

Saxena, P.  Chang, Y.-W. Guest Editorial
The six selected papers in the Special Section exemplify the broad scope of
modern physical design (PD), covering system-level physical synthesis, layout
optimization, analog circuit placement, beyond-die routing, biochip design
automation, and emerging device performance optimization.

Wang, R.  Zhang, Y.  Chou, N.-C.  Young, E. F. Y.  Cheng, C.-K.  Graham,
R. “Bus Matrix Synthesis Based on Steiner Graphs for Power Efficient
System-on-Chip Communications”

Power consumption and the thermal wall have become the major factors limiting
the speed of very-large-scale integration (VLSI) circuits, while interconnect
is becoming a primary power consumer. These factors bring new demands on the
communication architecture of system-on-chips (SoCs). High bandwidth is desired
to enhance parallelism for better performance, and the power efficiency on this
bandwidth is critical to the overall SoC power consumption. Current bus
architectures such as AMBA, Coreconnect, and Avalon are convenient for
designers but not efficient on power. This paper proposes a physical synthesis
scheme for on-chip buses and bus matrices to minimize the power consumption,
without changing the interface or arbitration protocols. By using a bus gating
technique, data transactions can take shortest paths on chip, reducing the
power consumption of bus wires to minimal. Routing resource and bandwidth
capacity are also optimized by the construction of a shortest-path Steiner
graph, wire sharing among multiple data transactions, and wire reduction
heuristics on the Steiner graph. Experiments indicate that the gated bus from
our synthesis flow can save more than 90% dynamic power on average data
transactions in current AMBA bus systems, which is about 5–10% of total SoC
power consumption, based on comparable amount of chip area and routing
resources.

Eick, M.  Strasser, M.  Lu, K.  Schlichtmann, U.  Graeb, H. E. “Comprehensive
Generation of Hierarchical Placement Rules for Analog Integrated Circuits” 

This paper presents a new method to automatically generate hierarchical
placement rules, which are crucial for a successful analog placement. The
method is based on a novel symmetry computation method, introducing the
structural signal flow graph. Five types of proximity, matching and symmetry
constraints are determined. According to the priority of the constraint types,
a constraint requirement graph and a hierarchical partitioning of the circuit
into matching, proximity and symmetry groups is then automatically computed.
Based on experimental results with a state-of-the-art placement tool, we show
that the new approach generates more placement rules and can lead to better
circuit performance and parametric yield according to post-layout simulation.

Ajwani, G.  Chu, C.  Mak, W.-K. “FOARS: FLUTE Based Obstacle-Avoiding
Rectilinear Steiner Tree Construction”

In this paper, we present an algorithm called FOARS for obstacle-avoiding
rectilinear Steiner minimal tree (OARSMT) construction. FOARS applies a
top-down approach which first partitions the set of pins into several subsets
uncluttered by obstacles. Then an obstacle-avoiding Steiner tree is generated
for each subset by an obstacle aware version of the rectilinear Steiner minimal
tree algorithm FLUTE. Finally, the trees are merged and refined to form the
OARSMT. To guide the partitioning of pins, we propose a novel algorithm to
construct a linear-sized obstacle-avoiding spanning graph which guarantees to
contain a rectilinear minimum spanning tree if there is no obstacle.
Experimental results show that FOARS is among the best algorithms in terms of
both wirelength and runtime for testcases both with and without obstacles.

Luo, L.  Yan, T.  Ma, Q.  Wong, M. D. F.  Shibuya, T. “A New Strategy for Simultaneous Escape Based on Boundary Routing”

Simultaneous escape routing on dense circuit boards is a very challenging task
and a great amount of manual effort is still needed in order to achieve high
routability. In this paper, we present a new simultaneous escape routing
algorithm which is based upon a novel boundary routing approach. Our algorithm
can solve complicated escape problems in a very short time. For a set of
industrial escape problems, our algorithm successfully solved all of them while
Cadence Allegro PCB router was only able to complete the routing of half of the
problems. In addition, we propose a clustering strategy targeting at large
escape routing problems. Experimental results show that this clustering
strategy can significantly cut down the runtime of our router when solving
large problems.

Huang, T.-W.  Ho, T.-Y. “A Two-Stage Integer Linear Programming-Based Droplet Routing Algorithm for Pin-Constrained Digital Microfluidic Biochips”

With the increasing design complexities, the design of pin-constrained digital
microfluidic biochips (PDMFBs) is of practical importance for the emerging
marketplace. However, solutions of current pin-count reduction are inevitably
limited by simply adopting it after the droplet routing stage. In this paper,
we propose the first droplet routing algorithm for PDMFBs that can integrate
pin-count reduction with droplet routing stage. Furthermore, our algorithm is
capable of minimizing the number of control pins, the number of used cells, and
the droplet routing time. We first present a basic integer linear programming
(ILP) formulation to optimally solve the droplet routing problem for PDMFBs
with simultaneous multiobjective optimization. Due to the complexity of this
ILP formulation, we also propose a two-stage technique of global routing
followed by incremental ILP-based routing to reduce the solution space. To
further reduce the runtime, we present a deterministic ILP formulation that
casts the original routing optimization problem into a decision problem, and
solve it by a binary solution search method that searches in logarithmic time.
Extensive experiments demonstrate that in terms of the number of the control
pins, the number of the used cells, and the routing time, we obtain much better
achievement than all the state-of-the-art algorithms in any aspect.

Lin, Y.-W.  Marek-Sadowska, M.  Maly, W. P. “On Cell Layout-Performance
Relationships in VeSFET-Based, High-Density Regular Circuits”

In this paper, we study circuits implemented using high-density arrays composed
of vertical slit field effect transistors. This layout style could dramatically
increase transistor density and, therefore, reduce fabrication cost. However,
its geometrical restrictions, imposed by the super-regular transistor
arrangement and strictly parallel metal tracks, pose new design challenges. Our
experiments reveal that very dense cell-level interconnect pattern may be
responsible for unnecessary 15% increase of the circuit level, critical path
delays. We demonstrate that these extra delays can be avoided by constructing
appropriate cell interconnect layouts and by more flexible usage of available
metal layers for intra-cell routing. To balance the performance and metal layer
usage, we propose a linear programming-based technique for critical net
re-routing.

REGULAR PAPERS
EMERGING TECHNOLOGIES

Ben-Jamaa, M. H.  Mohanram, K.  De Micheli, G. “An Efficient Gate Library for
Ambipolar CNTFET Logic”

Recently, several emerging technologies have been reported as potential
candidates for controllable ambipolar devices. Controllable ambipolarity is a
desirable property that enables the on-line configurability of n-type and
p-type device polarity. In this paper, we introduce a new design methodology
for logic gates based on controllable ambipolar devices, with an emphasis on
carbon nanotubes as the candidate technology. Our technique results in
ambipolar gates with a higher expressive power than conventional complementary
metal-oxide-semiconductor (CMOS) libraries. We propose a library of static
ambipolar carbon nanotube field effect transistor (CNTFET) gates based on
generalized NOR-NAND-AOI-OAI primitives, which efficiently implements XOR-based
functions. Technology mapping of several multi-level logic benchmarks that
extensively use the XOR function, including multipliers, adders, and linear
circuits, with ambipolar CNTFET logic gates indicates that on average, it is
possible to reduce the number of logic levels by 42%, the delay by 26%, and the
power consumption by 32%, resulting in a energy-delay-product (EDP) reduction
of 59% over the same circuits mapped with unipolar CNTFET logic gates. Based on
the projections in , where it is stated that defect-free CNTFETs will provide a
5times performance improvement over metal-oxide-semiconductor field effect
transistors, the ambipolar library provides a performance improvement of
7times, a 57% reduction in power consumption, and a 20times improvement in EDP
over the CMOS library.

MODELING AND SIMULATION

Rizzoli, V.  Masotti, D.  Mastri, F.  Montanari, E. “System-Oriented
Harmonic-Balance Algorithms for Circuit-Level Simulation”

This paper discusses a self-consistent set of modern computational concepts
providing an effective approach to the circuit-level harmonic-balance (HB)
simulation of nonlinear microwave systems of complex topology. The system is
automatically split into the interconnection of a near-optimal number of
nonlinear blocks at run time. The resulting structure is then exploited by the
domain-partitioning concept. A block-wise constant spectrum is used rather than
a common spectrum by considering for each block only the set of spectral lines
that are relevant to its electrical function, which leads to a very significant
reduction in the number of problem unknowns. System simulation under digitally
modulated RF drive is reduced to a sequence of modified multitone HB analyses
that are backward coupled through the envelope dynamics. Besides providing high
numerical efficiency, this set of techniques opens the way to an effective
co-simulation of RF and baseband transceiver sections.

Gong, M.  Zhou, H.  Li, L.  Tao, J.  Zeng, X. “Binning Optimization for
Transparently-Latched Circuits”

With increasing process variation, binning has become an important technique to
improve the values of fabricated chips, especially in high performance
microprocessors where transparent latches are widely used. In this paper, we
formulate and solve the binning optimization problem that decides the bin
boundaries and their testing order to maximize the profit (considering the test
cost) for a transparently-latched circuit. The problem is decomposed into four
sub-problems. First, to compute the clock period distribution of the
transparently-latched circuit, a sample-based statistical static timing
analysis (SSTA) approach is developed which is based on the generalized
stochastic collocation method with the sparse grid technique. The minimal clock
period on each sample point is found by solving a minimal cycle ratio problem
in the constraint graph. Second, a greedy method is proposed to maximize profit
considering both the sales revenue and the test cost by iteratively assigning
each boundary to its optimal position. Third, an optimal algorithm of O(nlog n)
runtime is used to generate the optimal testing order to minimize the test
cost, based on alphabetic tree. Last, a simple approach is presented to decide
the optimal number of bins, which helps to complete the whole binning scheme
with maximal profit. Experiments on all the ISCAS'89 sequential benchmarks with
65 nm technology show 10.68% profit improvement in average. Some comparisons
with other methods suggest the advantage of our method. The results also
demonstrate that the proposed SSTA method achieves an error of 0.70% and
speedup of 110 X in average compared with the Monte Carlo simulation.

PHYSICAL DESIGN

Hsu, C.-H.  Chang, Y.-W.  Nassif, S. R. “Simultaneous Layout Migration and
Decomposition for Double Patterning Technology”

Double patterning technology (DPT) and layout migration (LM) are two closely
related problems on design for manufacturability in the nanometer era. DPT
decomposes a layout into two masks and applies double exposure patterning to
increase pitch size and, thus, printability. In this paper, we present the
first algorithm in the literature for the simultaneous layout migration and
decomposition (SMD) problem. Our algorithm first constructs a potential
conflict graph and DPT-aware constraint graphs, and then applies integer linear
programming (ILP) corresponding to the graphs to obtain a decomposed and
migrated layout. We further present an effective graph-based reduction
technique to prune the ILP solution space, which maintains the same DPT
conflicts. We also present a new DPT-aware objective for the SMD problem to
minimize the difference between the original and migrated layouts while
considering the DPT effects. In addition, we present an approach to generate
DPT-aware standard cells by considering the DPT effects on the cell boundaries;
this technique improves the layout printability and facilitates electronic
design automation tools to consider DPT. Experimental results show that our
algorithms can effectively generate conflict-free migrated layouts with 11%
smaller layout areas and 21% smaller layout changes, compared with the
traditional method of layout decomposition followed by LM. In particular, our
reduction technique reduces the ILP variables by 45.7%, the ILP constraints by
58.5%, and the DPT edges by 79.9% over the basic ILP formulation, leading to a
substantial speedup. For example, it can reduce the runtimes for the test cases
from more than one day to only seconds.

SYSTEM-LEVEL DESIGN

Wang, F.  Chen, Y.  Nicopoulos, C.  Wu, X.  Xie, Y.  Vijaykrishnan,
N. “Variation-Aware Task and Communication Mapping for MPSoC Architecture”

As technology scales, the delay uncertainty caused by process variations has
become increasingly pronounced in deep submicrometer designs. As a result, a
paradigm shift from deterministic to statistical design methodology at all
levels of the design hierarchy is inevitable. In this paper, we propose a
variation-aware task and communication mapping methodology for multiprocessor
system-on-chips that uses network-on-chip communication architecture so that
the impact of parameter variations can be mitigated. Our mapping scheme
accounts for variability in both the processing cores and the communication
links to ensure a complete and accurate model of the entire system. A new
design metric, called performance yield and defined as the probability of the
assigned schedule meeting the predefined performance constraints, is used to
guide both the task scheduling and the routing path allocation procedure. An
efficient yield computation method for this mapping complements and
significantly improves the effectiveness of the proposed variation-aware
mapping algorithm. Experimental results show that our variation-aware mapper
achieves significant yield improvements over worst-case and nominal-case
deterministic mapper.

SHORT PAPERS

Zhou, T. Y.  Liu, H.  Zhou, D.  Tarim, T. “A Fast Analog Circuit Analysis
Algorithm for Design Modification and Verification”

This paper presents a fast analog circuit analysis algorithm, fundamental
circuit-based circuit analysis, for circuits being repeatedly modified and
verified in product development. The algorithm reuses previous circuit
simulation result on successive changed circuit analysis to achieve simulation
operation reduction. The algorithm is implemented with SPICE simulator on
linear and nonlinear circuit applications with the proposed device delta
models. The experiments show that the algorithm increases the speed of the
circuit simulation five to ten times over directly simulations under the same
simulation accuracy.

Yu, G.  Li, P. “Hierarchical Analog/Mixed-Signal Circuit Optimization Under
Process Variations and Tuning”

A hierarchical optimization methodology is presented to achieve robust
analog/mixed-signal circuit design with consideration of process variations.
Hierarchical optimization using building circuit block Pareto models is an
efficient approach for optimizing nominal performances of large analog
circuits. However, yield-aware system optimization, as dictated by the need for
safeguarding chip manufacturability in scaled technologies, is completely
nontrivial. Two fundamental difficulties are addressed for achieving such a
methodology: yield-aware Pareto performance characterization at the building
block level and yield-aware optimization problem formulation at the system
level. In addition, postsilicon tuning in complex mixed-signal system designs
is investigated and the proposed optimization framework is extended for such
systems. The presented methodology is demonstrated by hierarchical optimization
of a phased-locked loop consisting of multiple building blocks and self-tuning
function blocks.

Yao, C.  Saluja, K. K.  Ramanathan, P. “Power and Thermal Constrained Test
Scheduling Under Deep Submicron Technologies”

Conventional power constrained test scheduling methods do not guarantee a
thermal-safe solution. In this paper, we propose a test scheduling algorithm
that satisfies the resource, power, and thermal constraints. First, in contrast
to existing schemes, the proposed algorithm exploits superposition principle to
perform fast and accurate thermal simulation, which, in turn, allows the
algorithm to search for solutions which introduce cooling periods between tests
to reduce the overall test length. Second, we propose a test partition-based
method to further improve the performance of the test scheduling. We apply our
test scheduling algorithm to ITC'02 SoC benchmarks and the results show
considerable improvement in the total test length over existing methods.