TCAD Newsletter - April 2010 Issue
Placing you one click away from the best new CAD research!

KEYNOTE PAPER
=============

Jhaveri, T.; Rovner, V.; Liebmann, L.; Pileggi, L.; Strojwas, A. J.; Hibbeler, J. D.;
"Co-Optimization of Circuits, Layout and Lithography for Predictive Technology Scaling Beyond Gratings"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433742&isnumber=5433734
Abstract: The financial backbone of the semiconductor industry is based on doubling the functional density of integrated circuits every two years at fixed wafer costs and die yields. The increasing demands for 'computational' rather than 'physical' lithography to achieve the aggressive density targets, along with the complex device-engineering solutions needed to maintain the power density objectives, have caused a rapid escalation in systematic yield limiters that threaten scaling. Specifically, the traditional contract between design and manufacturing based solely on design rules is no longer sufficient to guarantee functional silicon and instead requires a convoluted set of restrictions that force complex modifications to the already costly design flows. In this paper, we claim that a far superior result can be achieved by moving the design-to-manufacturing interface from design rules to a higher level of abstraction based on a defined set of pre-characterized layout templates. We will demonstrate how this methodology can simplify optical proximity correction and lithography processes for sub-32 nm technology nodes, along with various digital block design examples for synthesized intellectual property (IP) cores. Furthermore, with a cost-per-good-die analysis we will show that this methodology will extend economical scaling to sub-32 nm technology nodes.

Regular Papers
==============

Analog, Mixed-Signal, and RF Design
-----------------------------------

Brambilla, A.; Gruosso, G.; Gajani, G. S.;
"FSSA: Fast Steady-State Algorithm for the Analysis of Mixed Analog/Digital Circuits"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433754&isnumber=5433734
Abstract: The shooting method is largely employed to determine the steady-state working condition of both autonomous and nonautonomous circuits. In general, the conventional shooting method employs the Newton algorithm to estimate a better approximation of the steady-state working condition. The Newton algorithm requires the computation of the Jacobian matrix and this seriously limits the use of the conventional shooting method to solve medium/large scale circuits. In this paper, an approach to efficiently determine the shooting matrix is presented. It is shown that the approach is also adequate to deal with mixed analog/digital circuits.

Embedded Systems
----------------

Kumar, A.; Mesman, B.; Corporaal, H.; Ha, Y.;
"Iterative Probabilistic Performance Prediction for Multi-Application Multiprocessor Systems"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433755&isnumber=5433734
Abstract: Modern embedded devices are increasingly becoming multiprocessor with the need to support a large number of applications to satisfy the demands of users. Due to a huge number of possible combinations of these multiple applications, it becomes a challenge to predict their performance. This becomes even more important when applications may be dynamically started and stopped in the system. Since modern embedded systems allow users to download and add applications at run-time, a complete design-time analysis is not always possible. This paper presents a new technique to accurately predict the performance of multiple applications mapped on a multiprocessor platform. Iterative probabilistic analysis is used to estimate the time spent by tasks during their contention phase, and thereby predicting the performance of applications.
The approach is scalable with the number of applications and processors in the system. As compared to earlier techniques, this approach is much faster and scalable, while still improving the accuracy. The analysis takes 300 μs on a 500 MHz processor for ten applications. Since multimedia applications are increasingly becoming more dynamic, results of a case-study with applications with varying execution times are also presented. In addition, results of a case-study with real applications executing on a field-programmable gate array multiprocessor platform are shown.

Emerging Technologies
---------------------

Xu, T.; Chakrabarty, K.; Pamula, V. K.;
"Defect-Tolerant Design and Optimization of a Digital Microfluidic Biochip for Protein Crystallization"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433748&isnumber=5433734
Abstract: Protein crystallization is a commonly used technique for protein analysis and subsequent drug design. It predicts the 3-D arrangement of the constituent amino acids, which in turn indicates the specific biological function of a protein. Protein crystallization experiments are typically carried out in well-plates in the laboratory. As a result, these experiments are slow, expensive, and error-prone due to the need for repeated human intervention. Recently, droplet-based digital microfluidics have been used for executing protein assays on a chip. Protein samples in the form of nanoliter-volume droplets are manipulated using the principle of electrowetting-on-dielectric. We present the design of a multi-well-plate microfluidic biochip for protein crystallization; this biochip can transfer protein samples, prepare candidate solutions, and carry out crystallization automatically. To reduce the manufacturing cost of such devices, we present an efficient algorithm to generate a pin-assignment plan for the proposed design.
The resulting biochip enables control of a large number of on-chip electrodes using only a small number of pins. Based on the pin-constrained chip design, we present an efficient shuttle-passenger-like droplet manipulation method and test procedure to achieve high-throughput and defect-tolerant well loading.

High-Level Synthesis
--------------------

Kundu, S.; Lerner, S.; Gupta, R. K.;
"Translation Validation of High-Level Synthesis"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433744&isnumber=5433734
Abstract: The growing complexity of systems and their implementation into silicon encourages designers to look for ways to model designs at higher levels of abstraction and then incrementally build portions of these designs - automatically or manually - from these high-level specifications. Unfortunately, this translation process itself can be buggy, which can create a mismatch between what a designer intends and what is actually implemented in the circuit. Therefore, checking if the implementation is a refinement of, or equivalent to, its initial specification is of tremendous value. In this paper, we present an approach to automatically validate the implementation against its initial high-level specification using insights from translation validation, automated theorem proving, and relational approaches to reasoning about programs. In our experiments, we first focus on concurrent systems modeled as communicating sequential processes and show that their refinements can be validated using our approach. Next, we have applied our validation approach to a realistic scenario: a parallelizing high-level synthesis framework called Spark. We present the details of our algorithm and experimental results.

Modeling
--------

Liu, J.-H.; Tsai, M.-F.; Chen, L.; Chen, C. C.-P.;
"Accurate and Analytical Statistical Spatial Correlation Modeling Based on Singular Value Decomposition for VLSI DFM Applications"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433751&isnumber=5433734
Abstract: With the significant advancement of statistical timing and yield analysis algorithms, there is a strong need for accurate and analytical spatial correlation models. In this paper, we propose a novel spatial correlation modeling method that can not only capture the general spatial correlation relationship but also generate highly accurate and analytical models. Our method, based on singular value decomposition, can generate sequences of polynomials weighted by the singular values. Experimental results from foundry measurement data show that our proposed approach achieves a 3x accuracy improvement over several distance-based spatial correlation modeling methods.

Shin, S.; Kim, K.; Kang, S.-M.;
"Compact Models for Memristors Based on Charge-Flux Constitutive Relationships"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433745&isnumber=5433734
Abstract: This paper introduces compact models for memristors. The models are developed based on the fundamental constitutive relationships between charge and flux of memristors. The modeling process, with a few simple steps, is introduced. For memristors with limited resistance ranges, a simple method to find their constitutive relationships is discussed, and examples of compact models are shown for both current-controlled and voltage-controlled memristors. Our models satisfy all of the memristor properties such as frequency-dependent hysteresis behaviors and also unique boundary assurance to simulate memristors whether they behave memristively or resistively. Our models are implementable in circuit simulators, including SPICE, Verilog-A, and Spectre.

Li, X.; McAndrew, C. C.; Wu, W.; Chaudhry, S.; Victory, J.; Gildenblat, G.;
"Statistical Modeling With the PSP MOSFET Model"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433756&isnumber=5433734
Abstract: PSP and the backward propagation of variance (BPV) method are used to characterize the statistical variations of metal-oxide-semiconductor field effect transistors (MOSFETs). BPV statistical modeling of NMOS and PMOS devices is, for the first time, coupled by including self-consistent modeling of ring oscillator gate delays. Parasitic capacitances are included in the analysis. The proposed technique is validated using Monte-Carlo simulations and by comparison to experimental data from two technologies.

Physical Design
---------------

Ma, Q.; Young, E. F. Y.;
"Multivoltage Floorplan Design"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433752&isnumber=5433734
Abstract: Energy efficiency has become a very important issue to be addressed in today's system-on-a-chip (SoC) designs. One way to lower power consumption is to reduce the supply voltage. Multisupply voltage (MSV) is thus introduced to provide flexibility in controlling the power and performance tradeoff. In region-based MSV, circuits are partitioned into "voltage islands" where each island occupies a contiguous physical space and operates at one voltage level. These tasks of island partitioning and voltage level assignment should be done simultaneously in the floorplanning process in order to take this important physical information into consideration. In this paper, we consider this core-based voltage island driven floorplanning problem including islands with power down mode, and propose a method to solve it. Given a candidate floorplan solution represented by a normalized Polish expression, we are able to obtain optimal voltage assignment and island partitioning (including islands with power down mode) simultaneously to minimize the total power consumption.
Simulated annealing is used as the basic searching engine. By using this approach, we can achieve significant power saving (up to 50%) for all datasets, without any significant increase in area and wire length. We compared our approach with the most recent previous work on the same problem, and results show that our approach is much more efficient and is able to save more power in most cases. We have also studied two other approaches to solve the same problem, a simple dynamic programming approach and a lowest possible power consumption approach. Experimental results show that ours performs best among these three approaches. Our floorplanner can also be extended to minimize the number of level shifters, to address a minVdd version of the problem and to simplify the power routing step by placing islands close to their corresponding power pins.

Test
----

Li, K. S.-M.;
"Multiple Scan Trees Synthesis for Test Time/Data and Routing Length Reduction Under Output Constraint"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433749&isnumber=5433734
Abstract: A synthesis methodology for multiple scan trees that considers output pin limitation, scan chain routing length, test application time, and test data compression rate simultaneously is proposed in this paper. Multiple scan trees, also known as a scan forest, greatly reduce test data volume and test application time in system-on-chip testing. However, previous research on scan tree synthesis rarely considered issues such as routing length and output port limitation, and hence created scan trees with a large number of scan output ports and excessively long routing paths. The proposed algorithm provides a mechanism that effectively reduces test time, test data volume, and routing length under output port constraint. As a result, very few or no output compressors are required, which significantly reduces the hardware overhead.

Short Papers
============

Thakker, R. A.; Sathe, C.; Baghini, M. S.; Patil, M. B.;
"A Table-Based Approach to Study the Impact of Process Variations on FinFET Circuit Performance"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433746&isnumber=5433734
Abstract: This paper presents a novel table-based approach for efficient statistical analysis of fin field-effect transistor (FinFET) circuits. The proposed approach uses a new scheme for interpolation of look-up tables (LUTs) with respect to process parameters. The effect of various process parameters, viz., channel length, fin width, and effective oxide thickness is studied for three circuits: buffer chain, static random access memory cell, and high-gain low-voltage op-amp. Compared to mixed-mode (device-circuit) simulation, the proposed LUT-based approach is shown to be much faster, thus making it a practically feasible and attractive option for variability analysis, especially for emerging technologies where compact models are not available for circuit simulation.

Rak, A.; Cserey, G.;
"Macromodeling of the Memristor in SPICE"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433753&isnumber=5433734
Abstract: In this paper, we present a new simulation program with integrated circuit emphasis (SPICE) macromodel of the recently physically implemented memristor. This macromodel could be a powerful tool for electrical engineers to design and experiment with new circuits with memristors. Our simulation results show similar behavior to the already published measurements of the physical implementation. Our approach provides a solution for the modeling of boundary conditions following exactly the published mathematical model of HP Labs. The functionality of our macromodel is demonstrated with computer simulations. The source code of our macromodel is provided.
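
For readers unfamiliar with the HP Labs memristor model mentioned in the abstract above, the following is a minimal numerical sketch of the commonly cited linear ion drift formulation, in which the memristance is M(x) = R_on * x + R_off * (1 - x) and the normalized state x = w/D evolves as dx/dt = (mu_v * R_on / D^2) * i(t). It is not the authors' SPICE macromodel. The function name, the parameter values, the forward-Euler integration, and the hard clipping of the state to [0, 1] are all assumptions chosen for illustration.

```python
import math


def simulate_memristor(amplitude=1.0, freq=1.0, r_on=100.0, r_off=16e3,
                       d=10e-9, mu_v=1e-14, x0=0.1, steps=20000, cycles=2):
    """Forward-Euler integration of the linear ion drift memristor model.

    Returns lists (t, v, i, x), where x = w/D is the normalized state.
    All default parameter values are illustrative assumptions.
    """
    dt = cycles / (freq * steps)
    k = mu_v * r_on / (d * d)              # dx/dt = k * i(t)
    x = x0
    ts, vs, cs, xs = [], [], [], []
    for n in range(steps):
        t = n * dt
        v = amplitude * math.sin(2.0 * math.pi * freq * t)  # sinusoidal drive
        m = r_on * x + r_off * (1.0 - x)   # memristance M(x)
        i = v / m
        ts.append(t)
        vs.append(v)
        cs.append(i)
        xs.append(x)
        # Advance the state, then clip to [0, 1]. Hard clipping is an
        # assumed, simplistic boundary treatment for this sketch only.
        x = min(1.0, max(0.0, x + k * i * dt))
    return ts, vs, cs, xs


if __name__ == "__main__":
    ts, vs, cs, xs = simulate_memristor()
    print("state range:", min(xs), max(xs))
```

Plotting `cs` against `vs` should give the pinched hysteresis loop associated with memristors, since the current is zero whenever the voltage is zero. How the state is handled at its boundaries is the part this sketch treats most crudely.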

Cauley, S.; Balakrishnan, V.; Koh, C.-K.;
"A Parallel Direct Solver for the Simulation of Large-Scale Power/Ground Networks"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433747&isnumber=5433734
Abstract: An algorithm is presented for the fast and accurate simulation of power/ground mesh structures. Our method is a direct (non-iterative) approach for simulation based upon a parallel matrix inversion algorithm. The new dimension of flexibility provided by our algorithm allows for a more accurate analysis of power/ground mesh structures using resistance, inductance, capacitance (RLC) interconnect models. Specifically, we offer a method that employs a sparse approximate inverse technique to consider more reluctance coupling terms for increased accuracy of simulation. Our algorithm shows substantial computational improvement over the best known direct and iterative numerical techniques that are applicable to these large-scale simulation problems.

Kim, T.; Liu, X.;
"A Functional Unit and Register Binding Algorithm for Interconnect Reduction"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433743&isnumber=5433734
Abstract: This paper describes a simultaneous register and functional unit (FU) binding algorithm in high-level synthesis. Our algorithm targets the reduction of multiplexer inputs, shortening the total length of global interconnects. Specifically, our algorithm maximizes the interconnect sharing among FUs and registers by considering flow dependences, common primary inputs, and common register inputs among operations. Experimental results have shown that our scheme achieves more than 20% multiplexer input count reduction, on average, over previously proposed algorithms. Our approach delivers an 18% wirelength reduction of global interconnects with minor area overhead.

Chou, H.-Z.; Chang, K.-H.; Kuo, S.-Y.;
"Accurately Handle Don't-Care Conditions in High-Level Designs and Application for Reducing Initialized Registers"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5433750&isnumber=5433734
Abstract: Don't-care conditions are utilized by many synthesis tools because such conditions provide additional flexibility for logic optimization. However, most techniques only focus on the gate level because it is difficult to handle such conditions accurately at behavior and register transfer levels. This is problematic since the trend is to move toward high-level synthesis. In this paper, we propose innovative methods to handle such conditions accurately at high-level designs. In addition, we propose three novel algorithms based on our new methods to minimize the number of registers that need to be initialized, which can reduce the routing resources used by the reset signals and alleviate the routing problem. We applied our techniques to a five-stage pipelined processor and successfully reduced the number of control registers that need to be initialized by 53%, demonstrating the effectiveness of our approach.