November 2012 Newsletter
Placing you one click away from the best new CAD research!
Plain-text version at http://www.umn.edu/~tcad/newsletter/2012-11.txt

REGULAR PAPERS

ANALOG, MIXED-SIGNAL, AND RF CIRCUITS

Levi, T.; Lewis, N.; Tomas, J.; Renaud, S.
Application of IP-Based Analog Platforms in the Design of Neuromimetic Integrated Circuits
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331647
Reuse methodologies are now widely used to design digital circuits. They are based on the concept of intellectual property (IP), or virtual block of computing, characterized by a behavioral model, synthesizable or not. Design reuse for analog integrated systems is much less natural and less standardized. This paper addresses the issue of an analog design flow based on reuse, focusing on three key issues: the formal content of the IP block, the design of a reusable analog IP, and the organization of a design flow centered on an IP library. After a conceptual overview, this paper presents the methodological principles and details examples with a tutorial intention. The objective is to guide the designer involved in the process of developing analog IPs and the corresponding design flow. This method is inspired by platform-based design and adapted here to an original case study: the design of full-custom neuromimetic integrated circuits, built from specific analog computational blocks. The development of reusable IPs represents an additional effort, mainly for behavioral modeling and characterization. Nevertheless, the steps illustrated in this case study show that the extra time provides a definite advantage to future design projects.

EMBEDDED SYSTEMS

Fennibay, D.; Yurdakul, A.; Sen, A.
A Heterogeneous Simulation and Modeling Framework for Automation Systems
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331654
Recently, new technologies have emerged in industrial automation platforms.
A rapid modeling and simulation environment is required to integrate these new technologies with existing devices and platforms to reduce the design effort and time to market. System-level modeling is a popular design technique that provides early simulation, verification, and architectural exploration. However, integration of real devices with system models is quite challenging due to synchronization and hard real-time constraints in industrial automation. SystemC is the most commonly used system-level language in hardware-software codesign. However, SystemC lacks interfaces for the integration of system (virtual) models with real (physical) devices. We introduce the hybrid channel concept to clearly define the integration interface. The hybrid channel incorporates both real-to-virtual and virtual-to-real communication functions by solving synchronization issues while satisfying the real-time constraints. We successfully demonstrated the usability of our framework in industrial systems that utilize BACNet and Ethernet. We also developed a mathematical model that correctly estimates the results of our experiments. To the best of our knowledge, this is the first framework and mathematical model for SystemC in the industrial automation domain.

EMERGING TECHNOLOGIES

Hsieh, Y.-L.; Ho, T.-Y.; Chakrabarty, K.
A Reagent-Saving Mixing Algorithm for Preparing Multiple-Target Biochemical Samples Using Digital Microfluidics
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331644
Recent advances in digital microfluidics have led to the promise of miniaturized laboratories, with the associated advantages of high sensitivity and fewer human-induced errors. Front-end operations such as sample preparation play a pivotal role in biochemical laboratories, and in applications in biomedical engineering and life science. For fast and high-throughput biochemical applications, preparing samples of multiple target concentrations sequentially is inefficient and time-consuming.
Therefore, it is critical to concurrently prepare samples of multiple target concentrations. In addition, since reagents used in biochemical reactions are expensive, reagent-saving has become an important consideration in sample preparation. Prior work in this area does not address the problem of reagent-saving and concurrent sample preparation for multiple target concentrations. In this paper, we propose the first reagent-saving mixing algorithm for biochemical samples of multiple target concentrations. The proposed algorithm not only minimizes the consumption of reagents, but it also reduces the number of waste droplets and the sample preparation time by preparing the target concentrations concurrently. The proposed algorithm is evaluated on both real biochemical experiments and synthetic test cases to demonstrate its effectiveness and efficiency. Compared to prior work, the proposed algorithm can achieve up to 41% reduction in the number of reagent droplets and waste droplets, and up to 50% reduction in sample preparation time.

MODELING AND SIMULATION

Li, B.; Chen, N.; Schlichtmann, U.
Statistical Timing Analysis for Latch-Controlled Circuits With Reduced Iterations and Graph Transformations
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331648
Level-sensitive latches are widely used in high-performance designs. For such circuits, efficient statistical timing analysis algorithms are needed to take increasing process variations into account. The existing methods for solving this problem are still computationally expensive and can only provide the yield at a given clock period. In this paper, we propose a method combining reduced iterations and graph transformations. The reduced iterations extract setup time constraints and identify a subgraph for the following graph transformations handling the constraints from nonpositive loops.
The combined algorithms are very efficient, more than ten times faster than other existing methods, and result in a parametric minimum clock period, which, together with the hold-time constraints, can be used to compute the yield at any given clock period very easily.

Jung, J.; Kim, T.
Variation-Aware False Path Analysis Based on Statistical Dynamic Timing Analysis
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331645
In recent years, there has been a lot of research into statistical static timing analysis (SSTA) to compute the critical path delay of a circuit under timing variation. In order to compute the true critical path delay, however, false paths that cannot be sensitized by any input vector must be identified first. Since SSTA is unable to capture the dynamic timing behavior of a circuit, it is completely blind to false paths, and thus it overestimates the circuit timing. In this paper, we propose a new timing analysis approach called statistical dynamic timing analysis (SDTA), which is able to precisely express the statistical behavior of dynamic transitions at the output of a gate in a compact form and to directly evaluate and propagate the expressions throughout the circuit, by which the false paths can be cleaned effectively. In addition, to be practical, we propose a couple of techniques that enable a fast computation of the SDTA. We tested the proposed approach on ISCAS benchmarks and carry skip adders under timing variation to show its accuracy in computing the distribution of the true critical path delay of a circuit. In summary, compared to the previous approach of false path-aware statistical timing analysis, our timing analysis technique is able to reduce the accuracy error in the mean and standard deviation of true critical path delay distribution from 9.8% to 1.9% and from 29.4% to -3.4%, respectively.

Xu, C.; Srivastava, N.; Suaya, R.; Banerjee, K.
Fast High-Frequency Impedance Extraction of Horizontal Interconnects and Inductors in 3-D ICs With Multiple Substrates
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331642
We present a high-frequency impedance extraction method for horizontal interconnects as needed in 3-D integrated circuits (ICs), where the horizontal interconnects are sandwiched between substrate layers of possibly different electromagnetic parameters. In particular, for the first time, we develop an extension of the discrete complex images method based on a 2D, or alternatively, 3D magneto-quasi-static (MQS) vector potential Green's functions to extract analytical solutions to the series impedance (resistance and inductance) matrix elements for wire filaments. We then follow standard methods to extract the port impedance. Using the 2D approach, the series impedance per unit length of horizontal wire loops is obtained, which shows excellent accuracy (<1% error to Maxwell SV) and significantly improved computational cost (two orders faster than Maxwell SV). Using our 3D approach and combining the series impedance matrix from the MQS extraction engine with the capacitance matrix from an electrostatic extraction engine, we produce an electro-magneto-quasi-static impedance matrix extraction engine, which is used to extract the input impedance of a spiral inductor. In the frequency range spanning near dc to a high frequency cutoff given by four times the frequency of the maximum in the quality factor, we show that our results agree to within less than 5% and 11% deviation to the full-wave simulator HFSS, for the self and mutual loop impedance, respectively. The CPU time using our approach is 18-25x faster than HFSS. These results provide a reasonable foundation for circuit block-level impedance extraction for interconnects and inductors in 3-D integrated systems.

PHYSICAL DESIGN

Mak, W.-K.; Lin, Y.-C.; Chu, C.; Wang, T.-C.
Pad Assignment for Die-Stacking System-in-Package Design
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331649
Wire bonding is currently the most popular method for connecting signals between dies in system-in-package (SiP) design. Pad assignment, which assigns inter-die signals to die pads so as to facilitate wire bonding, is an important physical design problem for SiP design because the quality of a pad assignment solution affects both the cost and the performance of an SiP design. In this paper, we study a pad assignment problem, which prohibits the generation of illegal crossings and aims to minimize the total signal wirelength, for die-stacking SiP design. We first consider the two-die cases and die-stacks with a bridging die, and present a minimum-cost flow-based approach to optimally solve them in polynomial time. We then describe an approach, which uses a modified left-edge algorithm and an integer linear programming technique, for pyramid die-stacks with no bridging die. Finally, we discuss extensions of the two approaches to handle additional design constraints. Encouraging experimental results are shown to support our approaches.

Ho, K.-H.; Jiang, J.-H. R.; Chang, Y.-W.
TRECO: Dynamic Technology Remapping for Timing Engineering Change Orders
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331643
Due to increasing integrated circuit design complexity, engineering change orders (ECOs) have become a necessary technique to resolve late-found functional errors and/or performance deficiencies. To fix timing violations, gate sizing and buffer insertion are commonly used in postmask ECO. These techniques, however, may not be powerful enough, especially when spare cells are inserted to balance between functional and timing repair capabilities. We propose a postmask ECO technique, called TRECO, to remedy timing violations based on technology remapping, which also supports functional ECO.
Unlike conventional technology mapping, TRECO performs technology mapping with respect to a limited set of spare cells and confronts dynamic changes of wiring cost incurred by selection of different spare cells. With a precomputed lookup table of representative circuit templates, TRECO iteratively performs technology remapping to restructure timing critical subcircuits until no timing violation can be further removed. Experimental results on five industrial designs show the effectiveness of TRECO in ECO timing optimization and in timing-aware functional ECO.

TEST

Pomeranz, I.
A Metric for Identifying Detectable Path Delay Faults
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331653
Path delay faults are used for modeling small delay defects. Due to the large numbers of paths and the large numbers of undetectable path delay faults, test generation procedures for path delay faults use path selection procedures and procedures for the identification of undetectable faults to facilitate test generation. To complement these procedures, this paper describes a metric for assessing the likelihood that a path delay fault is detectable. Path selection procedures should prefer such faults in order to yield sets of target faults that are detectable even if not all the undetectable faults are identified prior to test generation. The metric is defined such that it allows all the path delay faults with the same value of the metric (the same likelihood of being detectable) to be enumerated together. The metric is computed based on the numbers of detections of transition faults under a test set for such faults, and requires N-detection fault simulation of transition faults for a sufficiently large value of N. The results of test generation for path delay faults confirm that faults with higher values of the metric are more likely to be detectable.

Liu, X.; Xu, Q.
On X-Variable Filling and Flipping for Capture-Power Reduction in Linear Decompressor-Based Test Compression Environment
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331656
Excessive test power consumption and growing test data volume are both serious concerns for the semiconductor industry. Various low-power X-filling techniques and test data compression schemes were developed accordingly to address the above problems. These methods, however, often exploit the very same "don't-care" bits in the test cubes to achieve different objectives and hence may contradict each other. In this paper, we propose novel techniques to reduce scan capture power in a linear decompressor-based test compression environment, by employing algorithmic solutions to fill and flip X-variables supplied to the linear decompressor. Experimental results on benchmark circuits demonstrate that our proposed techniques significantly outperform existing solutions.

Kavousianos, X.; Chakrabarty, K.; Jain, A.; Parekhji, R.
Test Schedule Optimization for Multicore SoCs: Handling Dynamic Voltage Scaling and Multiple Voltage Islands
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331646
In order to provide high performance with low power consumption, many multicore chips employ dynamic voltage scaling and voltage islands that operate at multiple power-supply voltage levels. Effective defect screening for such chips requires test applications at different operating voltages, which leads to higher test time and test cost compared to systems-on-a-chip (SoCs), which operate at only a single voltage level. We propose test scheduling techniques to minimize the testing time for multicore chips when each core is tested at multiple voltage levels and when it is tested for state retention when the core switches between two voltage levels. The proposed techniques include exact optimization based on integer linear programming and fast heuristic methods.
Experimental results for two test-case SoCs from the industry highlight the effectiveness of the proposed method.

SHORT PAPERS

Oh, D.; Chen, C. C. P.; Hu, Y. H.
Efficient Thermal Simulation for 3-D IC With Thermal Through-Silicon Vias
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331652
A novel virtual power source (VPS) method is proposed that can significantly speed up 3-D integrated circuit (IC) thermal simulation with sparsely deployed thermal TSVs by leveraging a Green function-based analytical spectral method. Specifically, it is shown that the impact of TSVs with nonhomogeneous thermal conductivity on thermal simulation may be modeled as VPSs located at mesh grids inside the thermal TSVs. Using this VPS model, the temperature distribution over the entire 3-D IC may be obtained by solving for the thermal distribution of a thermally homogeneous substrate in the presence of these VPSs. As such, the fast spectral method may be applied to accelerate the simulation. Significant (3-100 times) speedup over a baseline finite difference method implementation has been observed in preliminary simulations.

Mukherjee, S.; Dasgupta, P.
Assertion Aware Sampling Refinement: A Mixed-Signal Perspective
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331650
Sampling has been one of the key issues in simulation-based verification of analog and mixed signal (AMS) systems. Recent attempts toward extending assertion languages to the AMS domain have brought forward an obvious question: in what way should sampling be done to ensure that assertions are evaluated correctly? Increasing sampling granularity often comes with substantial simulation time overhead. On the other hand, interpolation of the analog signals between consecutive samples introduces inaccuracies in the signal values and, hence, in the truth of the assertions. This paper explores how temporal assertions are handled for inadequately sampled signals.
We propose a three-valued semantics (true, false, and unknown) for AMS assertions to address the uncertainty caused by the inadequacy of samples. The evaluation algorithm reports the time intervals where additional samples are required to resolve the uncertainty, thereby paving the way for adaptive sampling refinement in assertion aware AMS simulation.

Mukherjee, S.; Dasgupta, P.
Computing Minimal Debugging Windows in Failure Traces of AMS Assertions
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331651
There has been considerable recent research on developing assertion checking capability with analog and mixed-signal (AMS) simulators. Such tools must be able to detect failures of assertions in simulation traces and report the windows in which failures have been detected. Due to the dense real-time semantics of AMS assertions, the task of identifying the minimal debugging window for each failure is not a trivial problem. This paper addresses the problem of computing the minimal debugging window in failure traces for AMS assertions and presents an algorithm which is linear in the size of the assertion and the size of the trace.

Shelar, R. S.
A Fast and Near-Optimal Clustering Algorithm for Low-Power Clock Tree Synthesis
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6331655
Clocks are known to be a major source of power consumption in digital circuits. In this paper, we propose a clustering algorithm for the minimization of power in a local clock tree. Given a set of sequentials and their locations, clustering is performed to determine the clock buffers that are required to synchronize the sequentials, where a cluster implies that a clock buffer drives all the sequentials in the cluster. The results produced by the algorithm are often within 1.3x of the lower bound and have 32% lower costs, on average, than those due to an approximation algorithm with 2.5x faster runtimes.
Compared to a competitive heuristic from a vendor tool, the results due to the algorithm on several blocks in microprocessor designs in advanced nanometer technologies show 14% reduction, on average, in clock tree power while meeting skew or slew constraints. The algorithm has been employed for clock tree synthesis for several microprocessor designs across process generations due to consistently significant clock tree power savings over the results due to competitive alternatives.