September 2012 Newsletter 
Placing you one click away from the best new CAD research!
Plain-text version at http://www.umn.edu/~tcad/newsletter/2012-09.txt 

REGULAR PAPERS

FPGAS AND RECONFIGURABLE COMPUTING

Ravishankar, C.; Anderson, J. H.; Kennings, A. 
FPGA Power Reduction by Guarded Evaluation Considering Logic Architecture
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269965
Guarded evaluation is a power reduction technique that involves identifying
subcircuits (within a larger circuit) whose inputs can be held constant
(guarded) at specific times during circuit operation, thereby reducing
switching activity and lowering dynamic power. The concept is rooted in the
property that under certain conditions, some signals within digital designs are
not ÒobservableÓ at design outputs, making the circuitry that generates such
signals a candidate for guarding. Guarded evaluation has been demonstrated
successfully for application-specific integrated circuits (ASICs); in this
paper, we apply the technique to field-programmable gate arrays (FPGAs). In
ASICs, guarded evaluation entails adding additional hardware to the design,
increasing silicon area and cost. Here, we apply the technique in a way that
imposes minimal area overhead by leveraging existing unused circuitry within
the FPGA. The primary challenge in guarded evaluation is in determining the
specific conditions under which a subcircuit's inputs can be held constant
without impacting the larger circuit's functional correctness. We propose a
simple solution to this problem based on discovering gating inputs using
ÒnoninvertingÓ and Òpartial noninvertingÓ paths in a circuit's AND-inverter
graph representation. Experimental results show that guarded evaluation can
reduce switching activity on average by as much as 32% and 25% for 6-input
look-up table (6-LUT) and 4-LUT architectures, respectively. Dynamic power
consumption in the FPGA interconnect is reduced on average by as much as 24%
and 22% for 6-LUT and 4-LUT architectures, respectively. The impact to critical
path delay ranges from 1% to 43%, depending on the guarding scenario and the
desired power/delay tradeoff.

LOGIC SYNTHESIS

Liu, H.-Y.; Chou, Y.-C.; Lin, C.-H.; Jiang, J.-H. R. 
Automatic Decoder Synthesis: Methods and Case Studies
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269972
Upon receiving the output sequence streaming from a sequential encoder, a
decoder reconstructs the corresponding input sequence that streamed to the
encoder. Such an encoding and decoding scheme is commonly encountered in
communication, cryptography, signal processing, and other applications. Given
an encoder specification, decoder design can be error-prone and time consuming.
Its automation may help designers improve productivity and justify encoder
correctness. Though recent advances showed promising progress, there is still
no complete method that decides whether a decoder exists for a finite state
transition system. The quest for completely automatic decoder synthesis
remains. This paper presents a complete and practical approach to automating
decoder synthesis via incremental Boolean satisfiability solving and Craig
interpolation. Experiments show that, for decoder-existent cases, our method
synthesizes decoders effectively; for decoder-nonexistent cases, our method
concludes the nonexistence instantly while prior methods may fail. Case studies
are also conducted in synthesizing decoders for linear error-correcting codes.

MODELING AND SIMULATION

Mukherjee, P.; Fang, G. P.; Burt, R.; Li, P. 
Efficient Identification of Unstable Loops in Large Linear Analog Integrated Circuits
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269974
Stability analysis is one of the key challenges in analog circuit design. As
feature sizes continue to shrink and the effect of parasitics becomes more
dominant, we are forced to deal with stability analysis of increasingly complex
multiloop structures with potentially hundreds of loopsÑa task that can no
longer be dealt with using traditional methods. An automated stability checker
tool that detects sources of potential ringing behavior within a reasonable
turnaround time has thus been made necessary. Such a tool would not just help
in debug but could also serve as a postlayout validation tool. We thus present
an efficient loop finder algorithm to identify sources of ringing in large
linear analog circuits. At the heart of our automated stability checker are two
newly developed computationally efficient algorithmsÑthe first to detect all
poles within a given region of interest with a high degree of confidence and
the second to extract second-order approximations of node impedance transfer
functions given these pole locations. In this paper, we discuss these
algorithms in detail, propose various optimization heuristics to further speed
up the pole discovery algorithm, and then go on to develop a parallel
implementation of both these underlying algorithms. It is demonstrated that
these approaches together allow us to outperform the original loop finder
algorithm based on direct eigen methods by two to four orders of magnitude and
thus enable stability analysis of even larger extracted industrial designs than
was previously possible while providing reasonable turnaround time.

Brambilla, A.; Gruosso, G.; Gajani, G. S. 
MTFS: Mixed TimeÐFrequency Method for the Steady-State Analysis of Almost-Periodic Nonlinear Circuits
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269967
Periodic circuits driven by multitone signals are still a challenging
simulation problem despite several numerical methods being presented in the
literature. In this paper, a mixed timeÐfrequency method for the solution of
this problem and suitable for both autonomous and nonautonomous circuits is
presented. The method is based on an extension of the envelope following
method, which allows us to reduce the number of unknowns involved in the
steady-state problem with respect to previous mixed timeÐfrequency approaches,
and a suitable reformulation of the periodicity constraint that allows us to
obtain a significant acceleration in the determination of the solution by
reducing the time interval along which the envelope analysis must be performed.
The method is first presented for nonautonomous circuits and then extended to
autonomous ones.

PHYSICAL DESIGN

Ma, Q.; Wong, M. D. F. 
NP-Completeness and an Approximation Algorithm for Rectangle Escape Problem With Application to PCB 
Routing
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269973
In this paper, we introduce and study the rectangle escape problem (REP), which
is motivated by printed circuit board (PCB) bus escape routing. Given a
rectangular region R and a set S of rectangles within R, the REP is to
choose a direction for each rectangle to escape to the boundary of R, such
that the resultant maximum density over R is minimized. We prove that the REP
is NP-complete, and show that it can be formulated as an integer linear
programming (ILP). A provably good approximation algorithm for the REP is
developed by applying linear programming (LP) relaxation and a special rounding
technique to the ILP. In addition, an iterative refinement procedure is
proposed as a postprocessing step to further improve the results. Our
approximation algorithm is also shown to work for more general versions of REP:
weighted REP and simultaneous REP. Our approach is tested on a set of
industrial PCB bus escape routing problems. Experimental results show that the
optimal solution can be obtained within several seconds for each of the test
cases.

Hsu, M.-K.; Chang, Y.-W. 
Unified Analytical Global Placement for Large-Scale Mixed-Size Circuit Designs
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269969
A modern chip often contains large numbers of predesigned macros (e.g.,
embedded memories, IP blocks) and standard cells, with very different sizes.
The fast-growing design complexity with large-scale mixed-size macros and
standard cells has caused significant challenges to modern circuit placement.
Analytical algorithms have been shown to be most effective for standard-cell
placement, but the problems with the rotation and legalization of large macros
impose intrinsic limitations for analytical placement. Consequently, most
recent works on mixed-size placement resort to combinatorial macro placement.
Instead, this paper presents the first attempt to resolve the intrinsic
problems with a unified analytical approach. Unlike traditional analytical
placement that uses only wire and density forces to optimize the positions of
circuit components, we present a new force, the rotation force, to handle macro
orientation for analytical mixed-size placement. The rotation force tries to
rotate each macro to its desired orientation based on the wire connections on
this macro. A cross potential model is also proposed to increase the rotation
freedom during placement. The final orientation of each macro with legalization
consideration is then determined by mathematical programming. A macro flipping
force is also proposed to determine the flipping orientation of each macro at
the end of global placement. Compared with start-of-the-art mixed-size
placement approaches (such as FLOP, CG, and MP-tree), our approach achieves the
best average wirelength efficiently.

Ghaida, R. S.; Gupta, P. 
DRE: A Framework for Early Co-Evaluation of Design Rules, Technology Choices, and Layout 
Methodologies
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269964
Design rules have been the primary contract between technology developers and
designers and are likely to remain so to preserve abstractions and
productivity. While current approaches for defining design rules are largely
unsystematic and empirical in nature, this paper offers a novel framework for
early and systematic evaluation of design rules and layout styles in terms of
major layout characteristics of area, manufacturability, and variability. The
framework essentially creates a virtual standard-cell library and performs the
evaluation based on the virtual layouts.  Due to the focus on the exploration
of rules at an early stage of technology development, we use first-order models
of variability and manufacturability (instead of relying on accurate
simulation) and layout topology/congestion- based area estimates (instead of
explicit and slow layout generation). Such a framework can be used to
co-evaluate and cooptimize design rules, patterning technologies, layout
methodologies, and library architectures.

Shih, X.-W.; Chang, Y.-W. 
Fast Timing-Model Independent Buffered Clock-Tree Synthesis
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269968
In high-performance synchronous chip design, a buffered clock tree with small
clock skew is essential for improving clocking speed. Due to the insufficient
accuracy of timing models for modern chip design, embedding simulation into a
clock-tree synthesis flow becomes inevitable. Consequently, the running time
for clock-tree synthesis becomes prohibitively huge as the complexity of chip
designs grows rapidly. To construct a buffered clock tree efficiently, we
propose an efficient timing-model independent approach to perform skew
minimization by structural optimization. To achieve the goal, a novel
clock-tree structure, called symmetrical structure, is presented. At each level
of a symmetrical clock tree, the number of branches, the wirelength, and the
inserted buffers are almost the same. It is natural that the clock skew could
be minimized if the configurations of all paths from the clock source to sinks
are similar. By symmetrically constructing a clock tree, the clock skew can be
minimized without referring to simulation information. Experimental results
show that our approach can not only efficiently construct a buffered clock tree
but also effectively minimize clock skew with marginal wiring overheads. Based
on a set of commonly used IBM benchmarks, e.g., a state-of-the-art work without
(with) ngspice simulation results in averagely 10.04X (3.44X) clock skew and
requires 163X (61906X) running time over our approach.

TEST

Yu, X.; Blanton, R. D. 
Diagnosis-Assisted Adaptive Test
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269977
This paper describes a method for improving the test quality of digital
circuits on a per-design basis by: 1) monitoring the defect behaviors that
occur through volume diagnosis; and 2) changing the test patterns to match the
identified behaviors. To characterize the behavior of a defect (i.e., the
conditions when a defect is activated), physically-aware diagnosis is employed
to extract the set of signal lines relevant to defect activation. Then, based
on the set of signal lines derived, the defect is attributed to one of several
behavior categories. Our defect level model uses the behavior-attribution
results of the current failing population to guide test-set customization to
minimize defect level for a given constraint on test costs, or alternatively,
ensure that defect level does not exceed some predetermined threshold.
Circuit-level simulation involving various types of defects shows that defect
level can be reduced by 30% using this method. Simulation experiment on actual
chips also demonstrates quality improvement.

Yang, J.-S.; Touba, N. A. 
X-Canceling MISR Architectures for Output Response Compaction With Unknown Values
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269976
In this paper, an X-tolerant multiple-input signature register (MISR)
compaction methodology that compacts output responses containing unknown X
values is described. Each bit of the MISR signature is expressed as a linear
combination in terms of Xs by symbolic simulation. Linearly dependent
combinations of the signature bits are identified with Gaussian elimination and
xored to remove X values and yield deterministic values. Two X-canceling
MISR architectures are proposed and analyzed with industrial designs. This
paper also shows the correlation between the estimated result based on
idealized modeling and the actual data for real circuits for error coverage,
hardware overhead, and other metrics. Experimental results indicate that high
error coverage can be achieved with X-canceling MISR configurations and it
highly correlates with actual results.

Pomeranz, I. 
Multicycle Tests With Constant Primary Input Vectors for Increased Fault Coverage
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269975
Test generation procedures for n-detection test sets improve the quality of
a test set by adding tests that increase the numbers of detections of target
faults. A different approach to n-detection test generation increases the
numbers of detections of target faults within the bounds of the number of tests
of a single-detection test set.  Multicycle tests provide the flexibility of
improving the quality of a test set by increasing the number of clock cycles in
each test, without increasing the number of tests. Improved test quality is
thus achieved with limited increases in test application time and test data
volume due to the larger numbers of clock cycles in each test. This paper
describes a procedure that starts from a compact one-detection single-cycle
test set for single stuck-at faults and produces a multicycle test set with the
same number of tests, but increased numbers of clock cycles and improved test
quality.  The procedure uses only one-detection fault simulation of single
stuck-at faults. A similar procedure is applied starting from a two-cycle test
set and considering transition faults. The procedures produce tests with
constant primary input vectors to accommodate tester limitations.


VERIFICATION

Banerjee, A. 
Verifying Coalitions in 3-Party Systems
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269966
Multiplayer games are played by a set of agents, where, at each round, all
players give their moves, and the combination of their moves defines the
successor states for the game. In such games, the players pursue certain goals
with their moves and in that pursuit, they can form coalitions to fulfill a
common objective. In this paper, we adopt this model of multiplayer coalition
games in the context of 3-party systems, consisting of a module, its
environment, and a controller. The winning objective of our game is given as a
specification in linear temporal logic and the module and controller together
attempt to satisfy the specification for all possible strategies of the
environment. In this paper, we analyze the problem for various degrees of
observability of the moduleÐcontroller pair, and provide quantified Boolean
formula-based methods for investigating the existence of a winning coalition
strategy. The coalition strategy can only be fulfilled on carefully
implementing both the module and the controller such that they can cooperate.
We also consider the problem where the implementations of the module and the
controller are available and we present a dynamic simulation-based approach for
verifying whether these implementations conform to the coalition requirements.
In case, they do not, we provide algorithms for designing the winning strategy
for an intelligent environment to drive the coalition to a refutation.

SHORT PAPERS

Ghosh, J.; Mukhopadhyay, S.; Patra, A.; Culpepper, B.; Mei, T.
Estimation of dc Performance of a Lateral Power MOSFET Using Distributed Cell Model
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269971
This paper presents a technique to study and estimate the dc performance of
lateral power MOSFET switches used in on-chip dcÐdc converters. The dc
performance is characterized by the on-resistance and current distribution
profile in the switch layout. In the proposed approach, a netlist is generated
that consists of the parasitic resistances extracted from the metal
interconnects along with the MOS device fingers present in the layout. This
approach is modular and exploits repetitive patterns of power MOSFET layouts to
expedite the extraction process. The extracted resistance values are computed
from the geometrical and technological parameters of the metal polygons without
the requirement of solving complex electromagnetic (EM) field equations.
Comparison of results with an industry standard EM solver tool as well as
experimental measurements amply demonstrates the computational efficiency and
accuracy of the approach.


Filiol, H.; O'Connor, I.; Morche, D.
Analog IC Variability Bound Estimation Using the CornishÐFisher Expansion
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269970
In nanoscale integrated circuit technologies, process parameter fluctuations
gain increasingly in importance.  Efficient methods are thus required during
the design phase to evaluate the resulting variability. In this letter, we
propose a new method to estimate the variation bounds of analog circuit
performance. This method combines design of experiment techniques with the
CornishÐFisher expansion: process parameter variations are first mapped to
circuit performance metrics by a quadratic model, and then an analytical
approximation of the performance distribution's quantiles enables the enclosure
of the performance variations. The proposed method demonstrates a better
accuracy/efficiency ratio than Monte-Carlo-based methods.