July 2013 Newsletter
Placing you one click away from the best new CAD research!

REGULAR PAPERS

ANALOG, MIXED-SIGNAL, AND RF CIRCUITS

Yin, L. ; Deng, Y. ; Li, P.
Simulation-Assisted Formal Verification of Nonlinear Mixed-Signal Circuits With Bayesian Inference Guidance
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532381
The pressing need for the verification of analog and mixed-signal (AMS) designs is driven by increased design complexity and the integration of such circuits into SoCs. However, verification of AMS circuits remains a significant challenge. This paper proposes a simulation-assisted formal verification methodology that leverages SMT-based satisfiability techniques to tackle the challenges arising from the inherent analog and/or hybrid nature of AMS systems. Although state-of-the-art SMT solvers, in the worst-case scenario, still have exponential complexity in the number of constraints, the main focus of this paper is to first formally formulate the verification task into an SMT problem, then accelerate the verification by using simulation assistance. To verify the nonlinear dynamics, randomly sampled simulations are first applied to quickly explore the reachable state space, and then a nonlinear SMT solver is invoked to ensure the conservativeness. To achieve optimal efficiency, the tradeoff between the runtime costs of simulation and SMT solving is analyzed by means of a Bayesian inference-based technique that dynamically learns from the simulation history. This paper demonstrates the feasibility and efficacy of the proposed methodology on conservative verification of dynamic properties of nonlinear AMS circuits.

Lin, M.P.-H. ; He, Y.-T. ; Hsiao, V.W.-H. ; Chang, R.-G. ; Lee, S.-Y.
Common-Centroid Capacitor Layout Generation Considering Device Matching and Parasitic Minimization
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532364
In analog layout design, the accuracy of capacitance ratios correlates closely with both the matching properties among the ratioed capacitors and the induced parasitics due to interconnecting wires. However, most of the previous works only emphasized the matching properties of a common-centroid placement, but ignored the induced parasitics after it is routed. This paper addresses the parasitic issue in addition to device matching during common-centroid capacitor layout generation. To effectively minimize the routing-induced parasitics, a novel common-centroid placement style, distributed connected unit capacitors, is presented. Based on the placement style, the ratioed capacitor layout generation flow and algorithms are proposed to simultaneously optimize the matching properties of a common-centroid placement and minimize the induced parasitics. Experimental results show that the proposed approach can greatly reduce area, wirelength, and routing-induced parasitics, and guarantee the best matching quality after routing.

EMBEDDED SYSTEMS

Xie, Q. ; Wang, Y. ; Kim, Y. ; Pedram, M. ; Chang, N.
Charge Allocation in Hybrid Electrical Energy Storage Systems
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532424
A hybrid electrical energy storage (HEES) system consists of multiple banks of heterogeneous electrical energy storage (EES) elements placed between a power source and some load devices and providing charge storage and retrieval functions.
For an HEES system to perform its desired functions of 1) reducing electricity costs by storing electricity obtained from the power grid at off-peak times when its price is lower, for use at peak times instead of electricity that must be bought then at higher prices, and 2) alleviating problems, such as excessive power fluctuation and undependable power supply, which are associated with the use of large amounts of renewable energy on the grid, appropriate charge management policies must be developed in order to efficiently store and retrieve electrical energy while attaining performance metrics that are close to the respective best values across the constituent EES banks in the HEES system. This paper is the first to formally describe the global charge allocation problem in HEES systems, namely, distributing a specified level of incoming power to a subset of destination EES banks so that maximum charge allocation efficiency is achieved. The problem is formulated as a mixed integer nonlinear program with the objective function set to the global charge allocation efficiency and the constraints capturing key requirements and features of the system such as the energy conservation law, power conversion losses in the chargers, the rate capacity, and self-discharge effects in the EES elements. A rigorous algorithm is provided to obtain near-optimal charge allocation efficiency under a daily charge allocation schedule. A photovoltaic array is used as an example of the power source for the charge allocation process and a heuristic is provided to predict the solar radiation level with a high accuracy. Simulation results using this photovoltaic cell array and a representative HEES system demonstrate up to 25% gain in the charge allocation efficiency by employing the proposed algorithm.

EMERGING TECHNOLOGIES

Shin, D. ; Kim, Y. ; Chang, N. ; Pedram, M.
Dynamic Driver Supply Voltage Scaling for Organic Light Emitting Diode Displays
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532434
Organic light emitting diode (OLED) display is a self-illuminating device that is supposed to be more power efficient than liquid crystal display (LCD). However, OLED display panels consume as much power as LCD panels due to total internal reflection. As the power consumption of the OLED panel depends on the pixel colors, most of the earlier power saving methods alter the pixel colors. In practice, such OLED power saving techniques can hardly accommodate photo viewers and movie players. This paper introduces the first OLED power saving technique that dynamically changes the supply voltage of the panel. Reduced supply voltage results in both power saving and decreased pixel luminance, but model-based color correction restores the decreased luminance with minimum color distortion. This technique is similar to dynamic backlight scaling of LCDs but is based on the unique characteristics of the OLED drivers. We provide an online color compensation algorithm using the luminance histogram. Luminance quantization in the histogram also achieves resource minimization. We develop a prototype and demonstrate the proposed OLED dynamic voltage scaling (DVS). Experimental results show that the proposed OLED DVS saves up to 74.7% of the display power for still images and up to 35.9% for movie clips.

MODELING AND SIMULATION

Mohan, V. ; Bunker, T. ; Grupp, L. ; Gurumurthi, S. ; Stan, M.R. ; Swanson, S.
Modeling Power Consumption of NAND Flash Memories Using FlashPower
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532423
Flash is the most popular solid-state memory technology used today. A range of consumer electronics products, such as cell-phones and music players, use flash memory for storage and flash memory is increasingly displacing hard disk drives as the primary storage device in laptops, desktops, and servers.
There is a rich microarchitectural design space for flash memory, and there are several architectural options for incorporating flash into the memory hierarchy. Exploring this design space requires detailed insights into the power characteristics of flash memory. In this paper, we present FlashPower, a detailed power model for the two most popular variants of NAND flash, namely, the single-level cell (SLC) and 2-bit Multi-Level Cell (MLC) based flash memory chips. FlashPower is built on top of CACTI, a widely used tool in the architecture community for studying various memory organizations. FlashPower takes several parameters like the device technology, microarchitectural layout, bias voltages and workload parameters as input to estimate the power consumption of a flash chip during its various operating modes. We validate FlashPower against chip power measurements from several different manufacturers and show that our results are comparable to the actual chip measurements. We illustrate the versatility of the tool in a design space exploration of power optimal flash memory array configurations.

Xu, C. ; Kolluri, S.K. ; Endo, K. ; Banerjee, K.
Analytical Thermal Model for Self-Heating in Advanced FinFET Devices With Implications for Design and Reliability
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532366
A rigorous analytical thermal model has been formulated for the analysis of self-heating effects in FinFETs, under both steady-state and transient stress conditions. 3-D self-consistent electrothermal simulations, tuned with experimentally measured electrical characteristics, were used to understand the nature of self-heating in FinFETs and calibrate the proposed model. The accuracy of the model has been demonstrated for a wide range of multifin devices by comparing it against finite element simulations.
The model has been applied to carry out a detailed sensitivity analysis of self-heating with respect to various FinFET parameters and structures, which are critical for improving circuit performance and electrical overstress/electrostatic discharge (ESD) reliability. The transient model has been used to estimate the thermal time constants of these devices and predict the sensitivity of power-to-failure to various device parameters, for both long and short pulse ESD situations. Suitable modifications to the model are also proposed for evaluating the thermal characteristics of production level FinFET (or Tri-gate FET) structures involving metal-gates, body-tied bulk FinFETs, and trench contacts.

Xiong, X. ; Wang, J.
Verifying RLC Power Grids With Transient Current Constraints
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532406
Vectorless power grid verification is a powerful method that evaluates worst-case voltage noises without detailed current waveforms using optimization techniques. It is extremely challenging when considering RLC power grids because inductors are difficult to tackle and multiple time steps should be evaluated after the discretization of the system equation. In this paper, we study integrated RLC power grids with both VDD and GND networks, and introduce transient constraints to restrict the waveform of each current source for sign-off verification. We rigorously prove that the vectorless verification can be decomposed into two subproblems, namely the well-studied power grid transient analysis problem and a linear programming (LP) problem that optimizes an affine function of currents under current constraints, and propose to verify the power grid by transient simulation and noise optimization. A variable reduction algorithm is further proposed to generate reduced-size LP problems with a user-specified error tolerance, so that the conservative bounds of voltage noises can be computed efficiently.
Experimental results show that the proposed algorithm achieves significant speedup (e.g., up to more than 100x with 5 mV error) over the standard LP solver in solving the LP problems, and the proposed transient constraints make the noise estimations more realistic.

Zhang, W. ; Balakrishnan, K. ; Li, X. ; Boning, D.S. ; Saxena, S. ; Strojwas, A. ; Rutenbar, R.A.
Efficient Spatial Pattern Analysis for Variation Decomposition Via Robust Sparse Regression
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532376
In this paper, we propose a new technique to achieve accurate decomposition of process variation by efficiently performing spatial pattern analysis. We demonstrate that the spatially correlated systematic variation can be accurately represented by the linear combination of a small number of templates. Based on this observation, an efficient sparse regression algorithm is developed to accurately extract the most adequate templates to represent spatially correlated variation. In addition, a robust sparse regression algorithm is proposed to automatically remove measurement outliers. We further develop a fast numerical algorithm that may reduce the computational time by several orders of magnitude over the traditional direct implementation. Our experimental results based on both synthetic and silicon data demonstrate that the proposed sparse regression technique can capture spatially correlated variation patterns with high accuracy and efficiency.

PHYSICAL DESIGN

Rahman, M. ; Tennakoon, H. ; Sechen, C.
Library-Based Cell-Size Selection Using Extended Logical Effort
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532419
Given a synthesized digital integrated circuit comprising interconnected library cells, and assuming arbitrary (continuous) sizes for the cells, experimentally, we have achieved global minimization of the total transistor sizes needed to achieve a delay goal, thus minimizing dynamic power (and reducing leakage power).
An accurate table-lookup delay model was developed from the precharacterized industrial standard cell library data by making a formal extension to the concept of logical effort that enables optimization of nMOS and pMOS sizes of a cell separately. To the best of our knowledge, this is the first continuous-cell sizing technique exhibiting optimality based upon a table-lookup delay model. We then developed a new delay-bounded dynamic programming-based algorithm that maps the continuous sizes to the discrete sizes available in the standard cell library, which achieves, for the first time, active area versus delay results close to the continuous results. Parallelism was incorporated into the algorithm to enhance efficiency by leveraging multicore processors. After using state-of-the-art commercial synthesis, the application of our cell-size selection tool results in an active area (the sum of all transistor widths) reduction of 36% (on average) for large contemporary industrial designs.

Lung, C.-L. ; Su, Y.-S. ; Huang, H.-H. ; Shi, Y. ; Chang, S.-C.
Through-Silicon Via Fault-Tolerant Clock Networks for 3-D ICs
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532431
Clock network synthesis is one of the most important and challenging problems in 3-D ICs. The clock signals have to be delivered by through-silicon vias (TSVs) to different tiers with minimum skew. While there are a few related works in literature, none consider the reliability of TSVs in a clock tree. Accordingly, the failure of any TSV in the clock tree yields a bad chip. The naive solution using double-TSV can alleviate the problem, but the significant area overhead renders it less practical for large designs. In this paper, we propose a novel TSV fault-tolerant unit (TFU) to provide tolerance against TSV failures. The TFU makes use of the existing 2-D redundant trees designed for prebond testing, and thus has minimum area overhead.
In addition, the number of TSVs in a TFU is also adjustable to allow flexibility during clock network synthesis. Compared with the conventional double TSV technique, the 3-D clock network constructed by TFUs can achieve 58% area overhead reduction with similar yield rate on an industrial case. To the best of the authors' knowledge, this is the first work in the literature that considers the fault tolerance of a 3-D clock network. It can be easily integrated with any bottom-up clock network synthesis algorithm.

SYSTEM-LEVEL DESIGN

Sharifi, S. ; Krishnaswamy, D. ; Rosing, T.S.
PROMETHEUS: A Proactive Method for Thermal Management of Heterogeneous MPSoCs
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532438
In this paper, we propose PROMETHEUS, a framework for proactive temperature aware scheduling of embedded workloads on single instruction set architecture heterogeneous multiprocessor systems-on-chip. It systematically combines temperature aware task assignment, task migration, and dynamic voltage and frequency scaling. PROMETHEUS is based on our novel low overhead temperature prediction technique, Tempo. In contrast to previous work, Tempo allows accurate estimation of potential thermal effects of future scheduling decisions without requiring any runtime adaptation. It reduces the maximum prediction error by up to an order of magnitude. Using Tempo, the PROMETHEUS framework provides two temperature aware scheduling techniques that proactively avoid power states leading to future thermal emergencies while matching the performance needs to the workload requirements. The first technique, TempoMP, integrates Tempo with an online multiparametric optimization method to guide decisions on task assignment, migration, and setting core power states in a temperature aware fashion. Our second scheduling technique, TemPrompt, uses Tempo in a heuristic algorithm that provides comparable efficiency at lower overhead.
On average, these two techniques reduce the lateness of the tasks by 2.5x and energy-lateness product (ELP) by 5x compared to the previous work.

TEST

Ramdas, A. ; Sinanoglu, O.
Testing Chips With Spare Identical Cores
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6532345
Scalability, power efficiency, and shorter time to market due to design reuse have favored the adoption of homogeneous multicore chips with identical processing units (cores) integrated together, offering enhanced computational power. Furthermore, chips with identical cores help cope with increasing defect rates in delivering reasonable yield levels via the utilization of spare cores. In this paper, we propose a comparison-based test access mechanism (TAM) that is capable of handling spare identical cores. The proposed TAM guarantees the test of a chip through minimum bandwidth in minimum test time, while ensuring zero yield loss in the presence of spare identical cores, as its design is driven by the number of spare cores on the chip. The proposed solution also enables the identification of all the good cores in usable chips, supporting models where chips are priced based on the number of good cores. Furthermore, we provide a tradeoff analysis that enables the designers to make an informed decision regarding yield loss versus area cost. We also extend the proposed TAM by adding efficient diagnostic features, and adapting it for low-power test.