September 2012 Newsletter Placing you one click away from the best new CAD research! Plain-text version at http://www.umn.edu/~tcad/newsletter/2012-09.txt REGULAR PAPERS FPGAS AND RECONFIGURABLE COMPUTING Ravishankar, C.; Anderson, J. H.; Kennings, A. FPGA Power Reduction by Guarded Evaluation Considering Logic Architecture http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269965 Guarded evaluation is a power reduction technique that involves identifying subcircuits (within a larger circuit) whose inputs can be held constant (guarded) at specific times during circuit operation, thereby reducing switching activity and lowering dynamic power. The concept is rooted in the property that under certain conditions, some signals within digital designs are not ÒobservableÓ at design outputs, making the circuitry that generates such signals a candidate for guarding. Guarded evaluation has been demonstrated successfully for application-specific integrated circuits (ASICs); in this paper, we apply the technique to field-programmable gate arrays (FPGAs). In ASICs, guarded evaluation entails adding additional hardware to the design, increasing silicon area and cost. Here, we apply the technique in a way that imposes minimal area overhead by leveraging existing unused circuitry within the FPGA. The primary challenge in guarded evaluation is in determining the specific conditions under which a subcircuit's inputs can be held constant without impacting the larger circuit's functional correctness. We propose a simple solution to this problem based on discovering gating inputs using ÒnoninvertingÓ and Òpartial noninvertingÓ paths in a circuit's AND-inverter graph representation. Experimental results show that guarded evaluation can reduce switching activity on average by as much as 32% and 25% for 6-input look-up table (6-LUT) and 4-LUT architectures, respectively. Dynamic power consumption in the FPGA interconnect is reduced on average by as much as 24% and 22% for 6-LUT and 4-LUT architectures, respectively. The impact to critical path delay ranges from 1% to 43%, depending on the guarding scenario and the desired power/delay tradeoff. LOGIC SYNTHESIS Liu, H.-Y.; Chou, Y.-C.; Lin, C.-H.; Jiang, J.-H. R. Automatic Decoder Synthesis: Methods and Case Studies http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269972 Upon receiving the output sequence streaming from a sequential encoder, a decoder reconstructs the corresponding input sequence that streamed to the encoder. Such an encoding and decoding scheme is commonly encountered in communication, cryptography, signal processing, and other applications. Given an encoder specification, decoder design can be error-prone and time consuming. Its automation may help designers improve productivity and justify encoder correctness. Though recent advances showed promising progress, there is still no complete method that decides whether a decoder exists for a finite state transition system. The quest for completely automatic decoder synthesis remains. This paper presents a complete and practical approach to automating decoder synthesis via incremental Boolean satisfiability solving and Craig interpolation. Experiments show that, for decoder-existent cases, our method synthesizes decoders effectively; for decoder-nonexistent cases, our method concludes the nonexistence instantly while prior methods may fail. Case studies are also conducted in synthesizing decoders for linear error-correcting codes. MODELING AND SIMULATION Mukherjee, P.; Fang, G. P.; Burt, R.; Li, P. Efficient Identification of Unstable Loops in Large Linear Analog Integrated Circuits http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269974 Stability analysis is one of the key challenges in analog circuit design. As feature sizes continue to shrink and the effect of parasitics becomes more dominant, we are forced to deal with stability analysis of increasingly complex multiloop structures with potentially hundreds of loopsÑa task that can no longer be dealt with using traditional methods. An automated stability checker tool that detects sources of potential ringing behavior within a reasonable turnaround time has thus been made necessary. Such a tool would not just help in debug but could also serve as a postlayout validation tool. We thus present an efficient loop finder algorithm to identify sources of ringing in large linear analog circuits. At the heart of our automated stability checker are two newly developed computationally efficient algorithmsÑthe first to detect all poles within a given region of interest with a high degree of confidence and the second to extract second-order approximations of node impedance transfer functions given these pole locations. In this paper, we discuss these algorithms in detail, propose various optimization heuristics to further speed up the pole discovery algorithm, and then go on to develop a parallel implementation of both these underlying algorithms. It is demonstrated that these approaches together allow us to outperform the original loop finder algorithm based on direct eigen methods by two to four orders of magnitude and thus enable stability analysis of even larger extracted industrial designs than was previously possible while providing reasonable turnaround time. Brambilla, A.; Gruosso, G.; Gajani, G. S. MTFS: Mixed TimeÐFrequency Method for the Steady-State Analysis of Almost-Periodic Nonlinear Circuits http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269967 Periodic circuits driven by multitone signals are still a challenging simulation problem despite several numerical methods being presented in the literature. In this paper, a mixed timeÐfrequency method for the solution of this problem and suitable for both autonomous and nonautonomous circuits is presented. The method is based on an extension of the envelope following method, which allows us to reduce the number of unknowns involved in the steady-state problem with respect to previous mixed timeÐfrequency approaches, and a suitable reformulation of the periodicity constraint that allows us to obtain a significant acceleration in the determination of the solution by reducing the time interval along which the envelope analysis must be performed. The method is first presented for nonautonomous circuits and then extended to autonomous ones. PHYSICAL DESIGN Ma, Q.; Wong, M. D. F. NP-Completeness and an Approximation Algorithm for Rectangle Escape Problem With Application to PCB Routing http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269973 In this paper, we introduce and study the rectangle escape problem (REP), which is motivated by printed circuit board (PCB) bus escape routing. Given a rectangular region R and a set S of rectangles within R, the REP is to choose a direction for each rectangle to escape to the boundary of R, such that the resultant maximum density over R is minimized. We prove that the REP is NP-complete, and show that it can be formulated as an integer linear programming (ILP). A provably good approximation algorithm for the REP is developed by applying linear programming (LP) relaxation and a special rounding technique to the ILP. In addition, an iterative refinement procedure is proposed as a postprocessing step to further improve the results. Our approximation algorithm is also shown to work for more general versions of REP: weighted REP and simultaneous REP. Our approach is tested on a set of industrial PCB bus escape routing problems. Experimental results show that the optimal solution can be obtained within several seconds for each of the test cases. Hsu, M.-K.; Chang, Y.-W. Unified Analytical Global Placement for Large-Scale Mixed-Size Circuit Designs http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269969 A modern chip often contains large numbers of predesigned macros (e.g., embedded memories, IP blocks) and standard cells, with very different sizes. The fast-growing design complexity with large-scale mixed-size macros and standard cells has caused significant challenges to modern circuit placement. Analytical algorithms have been shown to be most effective for standard-cell placement, but the problems with the rotation and legalization of large macros impose intrinsic limitations for analytical placement. Consequently, most recent works on mixed-size placement resort to combinatorial macro placement. Instead, this paper presents the first attempt to resolve the intrinsic problems with a unified analytical approach. Unlike traditional analytical placement that uses only wire and density forces to optimize the positions of circuit components, we present a new force, the rotation force, to handle macro orientation for analytical mixed-size placement. The rotation force tries to rotate each macro to its desired orientation based on the wire connections on this macro. A cross potential model is also proposed to increase the rotation freedom during placement. The final orientation of each macro with legalization consideration is then determined by mathematical programming. A macro flipping force is also proposed to determine the flipping orientation of each macro at the end of global placement. Compared with start-of-the-art mixed-size placement approaches (such as FLOP, CG, and MP-tree), our approach achieves the best average wirelength efficiently. Ghaida, R. S.; Gupta, P. DRE: A Framework for Early Co-Evaluation of Design Rules, Technology Choices, and Layout Methodologies http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269964 Design rules have been the primary contract between technology developers and designers and are likely to remain so to preserve abstractions and productivity. While current approaches for defining design rules are largely unsystematic and empirical in nature, this paper offers a novel framework for early and systematic evaluation of design rules and layout styles in terms of major layout characteristics of area, manufacturability, and variability. The framework essentially creates a virtual standard-cell library and performs the evaluation based on the virtual layouts. Due to the focus on the exploration of rules at an early stage of technology development, we use first-order models of variability and manufacturability (instead of relying on accurate simulation) and layout topology/congestion- based area estimates (instead of explicit and slow layout generation). Such a framework can be used to co-evaluate and cooptimize design rules, patterning technologies, layout methodologies, and library architectures. Shih, X.-W.; Chang, Y.-W. Fast Timing-Model Independent Buffered Clock-Tree Synthesis http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269968 In high-performance synchronous chip design, a buffered clock tree with small clock skew is essential for improving clocking speed. Due to the insufficient accuracy of timing models for modern chip design, embedding simulation into a clock-tree synthesis flow becomes inevitable. Consequently, the running time for clock-tree synthesis becomes prohibitively huge as the complexity of chip designs grows rapidly. To construct a buffered clock tree efficiently, we propose an efficient timing-model independent approach to perform skew minimization by structural optimization. To achieve the goal, a novel clock-tree structure, called symmetrical structure, is presented. At each level of a symmetrical clock tree, the number of branches, the wirelength, and the inserted buffers are almost the same. It is natural that the clock skew could be minimized if the configurations of all paths from the clock source to sinks are similar. By symmetrically constructing a clock tree, the clock skew can be minimized without referring to simulation information. Experimental results show that our approach can not only efficiently construct a buffered clock tree but also effectively minimize clock skew with marginal wiring overheads. Based on a set of commonly used IBM benchmarks, e.g., a state-of-the-art work without (with) ngspice simulation results in averagely 10.04X (3.44X) clock skew and requires 163X (61906X) running time over our approach. TEST Yu, X.; Blanton, R. D. Diagnosis-Assisted Adaptive Test http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269977 This paper describes a method for improving the test quality of digital circuits on a per-design basis by: 1) monitoring the defect behaviors that occur through volume diagnosis; and 2) changing the test patterns to match the identified behaviors. To characterize the behavior of a defect (i.e., the conditions when a defect is activated), physically-aware diagnosis is employed to extract the set of signal lines relevant to defect activation. Then, based on the set of signal lines derived, the defect is attributed to one of several behavior categories. Our defect level model uses the behavior-attribution results of the current failing population to guide test-set customization to minimize defect level for a given constraint on test costs, or alternatively, ensure that defect level does not exceed some predetermined threshold. Circuit-level simulation involving various types of defects shows that defect level can be reduced by 30% using this method. Simulation experiment on actual chips also demonstrates quality improvement. Yang, J.-S.; Touba, N. A. X-Canceling MISR Architectures for Output Response Compaction With Unknown Values http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269976 In this paper, an X-tolerant multiple-input signature register (MISR) compaction methodology that compacts output responses containing unknown X values is described. Each bit of the MISR signature is expressed as a linear combination in terms of Xs by symbolic simulation. Linearly dependent combinations of the signature bits are identified with Gaussian elimination and xored to remove X values and yield deterministic values. Two X-canceling MISR architectures are proposed and analyzed with industrial designs. This paper also shows the correlation between the estimated result based on idealized modeling and the actual data for real circuits for error coverage, hardware overhead, and other metrics. Experimental results indicate that high error coverage can be achieved with X-canceling MISR configurations and it highly correlates with actual results. Pomeranz, I. Multicycle Tests With Constant Primary Input Vectors for Increased Fault Coverage http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269975 Test generation procedures for n-detection test sets improve the quality of a test set by adding tests that increase the numbers of detections of target faults. A different approach to n-detection test generation increases the numbers of detections of target faults within the bounds of the number of tests of a single-detection test set. Multicycle tests provide the flexibility of improving the quality of a test set by increasing the number of clock cycles in each test, without increasing the number of tests. Improved test quality is thus achieved with limited increases in test application time and test data volume due to the larger numbers of clock cycles in each test. This paper describes a procedure that starts from a compact one-detection single-cycle test set for single stuck-at faults and produces a multicycle test set with the same number of tests, but increased numbers of clock cycles and improved test quality. The procedure uses only one-detection fault simulation of single stuck-at faults. A similar procedure is applied starting from a two-cycle test set and considering transition faults. The procedures produce tests with constant primary input vectors to accommodate tester limitations. VERIFICATION Banerjee, A. Verifying Coalitions in 3-Party Systems http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269966 Multiplayer games are played by a set of agents, where, at each round, all players give their moves, and the combination of their moves defines the successor states for the game. In such games, the players pursue certain goals with their moves and in that pursuit, they can form coalitions to fulfill a common objective. In this paper, we adopt this model of multiplayer coalition games in the context of 3-party systems, consisting of a module, its environment, and a controller. The winning objective of our game is given as a specification in linear temporal logic and the module and controller together attempt to satisfy the specification for all possible strategies of the environment. In this paper, we analyze the problem for various degrees of observability of the moduleÐcontroller pair, and provide quantified Boolean formula-based methods for investigating the existence of a winning coalition strategy. The coalition strategy can only be fulfilled on carefully implementing both the module and the controller such that they can cooperate. We also consider the problem where the implementations of the module and the controller are available and we present a dynamic simulation-based approach for verifying whether these implementations conform to the coalition requirements. In case, they do not, we provide algorithms for designing the winning strategy for an intelligent environment to drive the coalition to a refutation. SHORT PAPERS Ghosh, J.; Mukhopadhyay, S.; Patra, A.; Culpepper, B.; Mei, T. Estimation of dc Performance of a Lateral Power MOSFET Using Distributed Cell Model http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269971 This paper presents a technique to study and estimate the dc performance of lateral power MOSFET switches used in on-chip dcÐdc converters. The dc performance is characterized by the on-resistance and current distribution profile in the switch layout. In the proposed approach, a netlist is generated that consists of the parasitic resistances extracted from the metal interconnects along with the MOS device fingers present in the layout. This approach is modular and exploits repetitive patterns of power MOSFET layouts to expedite the extraction process. The extracted resistance values are computed from the geometrical and technological parameters of the metal polygons without the requirement of solving complex electromagnetic (EM) field equations. Comparison of results with an industry standard EM solver tool as well as experimental measurements amply demonstrates the computational efficiency and accuracy of the approach. Filiol, H.; O'Connor, I.; Morche, D. Analog IC Variability Bound Estimation Using the CornishÐFisher Expansion http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6269970 In nanoscale integrated circuit technologies, process parameter fluctuations gain increasingly in importance. Efficient methods are thus required during the design phase to evaluate the resulting variability. In this letter, we propose a new method to estimate the variation bounds of analog circuit performance. This method combines design of experiment techniques with the CornishÐFisher expansion: process parameter variations are first mapped to circuit performance metrics by a quadratic model, and then an analytical approximation of the performance distribution's quantiles enables the enclosure of the performance variations. The proposed method demonstrates a better accuracy/efficiency ratio than Monte-Carlo-based methods.