TCAD Newsletter - July 2010 Issue
Placing you one click away from the best new CAD research!

Keynote Paper
==============

Chakrabarty, K.; Fair, R. B.; Zeng, J., "Design Tools for Digital Microfluidic Biochips: Toward Functional Diversification and More Than Moore"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487469&isn...
Abstract: Microfluidics-based biochips enable the precise control of nanoliter volumes of biochemical samples and reagents. They combine electronics with biology, and they integrate various bioassay operations, such as sample preparation, analysis, separation, and detection. Compared to conventional laboratory procedures, which are cumbersome and expensive, miniaturized biochips offer the advantages of higher sensitivity, lower cost due to smaller sample and reagent volumes, system integration, and less likelihood of human error. This paper first describes the droplet-based "digital" microfluidic technology platform and emerging applications. The physical principles underlying droplet actuation are described next. Finally, the paper presents computer-aided design tools for simulation, synthesis, and chip optimization. These tools target modeling and simulation, scheduling, module placement, droplet routing, pin-constrained chip design, and testing.

Regular Papers
==============

Embedded Systems

Lee, J.; Shrivastava, A., "A Compiler-Microarchitecture Hybrid Approach to Soft Error Reduction for Register Files"
URL:
Abstract: For embedded systems where neither energy nor reliability can be easily sacrificed, we present an energy-efficient soft error protection scheme for register files (RFs). Unlike previous approaches, our method explicitly optimizes for energy efficiency and can exploit the fundamental tradeoff between reliability and energy.
While even a simple compiler-managed RF protection scheme can be more energy efficient than hardware schemes, this work formulates and solves further compiler optimization problems to significantly enhance the energy efficiency of RF protection schemes by an additional 30% on average, as demonstrated in our experiments on a number of embedded application benchmarks.

Dey, S.; Sarkar, D.; Basu, A., "A Tag Machine Based Performance Evaluation Method for Job-Shop Schedules"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487465&isn...
Abstract: This paper proposes a methodology for performance evaluation of schedules for job-shops modeled using tag machines. The most general tag structure for capturing dependences is shown to be inadequate for the task, and a new tag structure is proposed. Comparison with existing methods reveals that the proposed method's modeling efficiency does not depend on schedule length, while its complexity is of the same order as that of existing approaches. Moreover, the proposed method shows promise of applicability to other models of computation, and hence to heterogeneous system models built from such constituent models.

Modeling and Simulation

Acary, V.; Bonnefon, O.; Brogliato, B., "Time-Stepping Numerical Simulation of Switched Circuits Within the Nonsmooth Dynamical Systems Approach"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487478&isn...
Abstract: The numerical integration of switched circuits is known to be a difficult problem when the number of switches is large or when sliding modes exist. In such cases, classical analog simulators may behave poorly, or even fail.
In this paper, it is shown on two examples that the nonsmooth dynamical systems (NSDS) approach, which consists of: 1) a specific modeling of the piecewise-linear electronic devices (ideal diodes, Zener diodes, transistors); 2) Moreau's time-stepping scheme; and 3) specific iterative one-step solvers, supersedes simulators of the simulation program with integrated circuit emphasis (SPICE) family as well as hybrid simulators. An academic example constructed in [Maffezzoni, IEEE Trans. CAD, vol. 25, no. 11, Nov. 2006] so that the Newton-Raphson scheme does not converge, together with a buck converter, is used to make extensive comparisons between the NSDS method, methods of the SPICE family, and a hybrid-like method. The NSDS method, implemented in the siconos platform developed at INRIA, proves on these two examples to be much faster and more robust with respect to model parameter variations.

Janakiraman, V.; Bharadwaj, A.; Visvanathan, V., "Voltage and Temperature Aware Statistical Leakage Analysis Framework Using Artificial Neural Networks"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487472&isn...
Abstract: Artificial neural networks (ANNs) have shown great promise in modeling circuit parameters for computer-aided design applications. Leakage currents, which depend on process parameters, supply voltage, and temperature, can be modeled accurately with ANNs. However, the complex nature of the ANN model, with the standard sigmoidal activation functions, does not allow analytical expressions for its mean and variance. We propose a new activation function that allows us to derive an analytical expression for the mean and a semi-analytical expression for the variance of the ANN-based leakage model. To the best of our knowledge, this is the first result in this direction. Our neural network model also includes voltage and temperature as input parameters, thereby enabling voltage and temperature aware statistical leakage analysis (SLA).
All existing SLA frameworks are closely tied to the exponential polynomial leakage model and hence fail to work with more sophisticated ANN models. In this paper, we also set up an SLA framework that works efficiently with these ANN models. Results show that the cumulative distribution function of the leakage current of ISCAS'85 circuits can be predicted accurately, with errors in mean and standard deviation, compared to Monte Carlo-based simulations, of less than 1% and 2%, respectively, across a range of voltage and temperature values.

Physical Design

Jeong, K.; Kahng, A. B.; Park, C.-H.; Yao, H., "Dose Map and Placement Co-Optimization for Improved Timing Yield and Leakage Power"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487474&isn...
Abstract: In sub-100 nm CMOS processes, delay and leakage power reduction continue to be among the most critical design concerns. We propose to exploit the recent availability of fine-grain exposure dose control in the step-and-scan tool to achieve both design-time (placement) and manufacturing-time (yield-aware dose mapping) optimization of timing yield and leakage power. Our placement and dose map co-optimization can improve both the timing yield and the leakage power of a given design. We formulate the placement-aware dose map optimization as quadratic and quadratically constrained programs, which are solved using efficient quadratic program solvers. In this paper, we focus mainly on the placement-aware dose map optimization problem; in the Appendix, we describe a complementary but less impactful dose map-aware placement optimization based on an efficient cell swapping heuristic. Experimental results show noticeable improvements in minimum cycle time without leakage power increase, or in leakage power reduction without degradation of circuit performance.
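The flavor of such a dose-versus-power tradeoff can be illustrated with a deliberately tiny quadratic program: one region, one dose variable, and hypothetical linear delay and leakage models. This is only an illustrative sketch; the paper's actual formulation optimizes a full dose map jointly with placement, and every model and parameter below is an assumption.

```python
# Toy single-variable version of dose optimization as a quadratic program.
# The linear delay/leakage models and all coefficients are hypothetical.

def objective(d, a, b, c, e, w_delay, w_leak, target):
    """Weighted quadratic cost: timing target violation vs. leakage."""
    delay = a - b * d      # assumed: higher dose shortens gates, reducing delay
    leakage = c + e * d    # assumed: higher dose increases leakage
    return w_delay * (delay - target) ** 2 + w_leak * leakage ** 2

def optimal_dose(a, b, c, e, w_delay, w_leak, target):
    """Closed-form minimizer: set dJ/dd = 0 for the quadratic above."""
    return (w_delay * b * (a - target) - w_leak * e * c) / (
        w_delay * b ** 2 + w_leak * e ** 2
    )

if __name__ == "__main__":
    params = dict(a=10.0, b=2.0, c=1.0, e=0.5,
                  w_delay=1.0, w_leak=0.2, target=8.0)
    d_star = optimal_dose(**params)
    # Sanity check: a coarse scan over candidate doses finds the same minimum.
    d_scan = min((i / 1000.0 for i in range(-5000, 5001)),
                 key=lambda d: objective(d, **params))
    print(round(d_star, 3), round(d_scan, 3))  # both print 0.963
```

With many regions and placement-dependent coupling between them, the same idea becomes the constrained quadratic program the authors hand to a QP solver.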
System-Level Design

Beltrame, G.; Fossati, L.; Sciuto, D., "Decision-Theoretic Design Space Exploration of Multiprocessor Platforms"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487473&isn...
Abstract: This paper presents an efficient technique for design space exploration of a multiprocessor platform that minimizes the number of simulations needed to identify a Pareto curve with metrics such as energy and delay. Instead of using semi-random search algorithms (such as simulated annealing, tabu search, and genetic algorithms), we use domain knowledge derived from the platform architecture to set up the exploration as a discrete-space Markov decision process. The system walks the design space, changing its parameters and performing simulations only when the available probabilistic information is insufficient for a decision. A learning algorithm updates the probabilities of decision outcomes as simulations are performed. The proposed technique has been tested with two industrial multimedia applications, namely the "ffmpeg" transcoder and the parallel "pigz" compression algorithm. Results show that the exploration can be performed with 5% of the simulations required by the most commonly used algorithms (Pareto simulated annealing, nondominated sorting genetic algorithm, etc.), increasing exploration speed by more than one order of magnitude.

Brisk, P.; Verma, A. K.; Ienne, P., "An Optimal Linear-Time Algorithm for Interprocedural Register Allocation in High Level Synthesis Using SSA Form"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487467&isn...
Abstract: An optimal linear-time algorithm for interprocedural register allocation in high-level synthesis is presented.
Historically, register allocation has been modeled as a graph coloring problem, which is NP-complete in general. Converting each procedure to static single assignment (SSA) form, however, ensures a chordal interference graph, which can be colored in O(|V|+|E|) time; the interprocedural interference graph (IIG) is not guaranteed to be chordal after this transformation. An extension to SSA form is introduced which ensures that the IIG is chordal and that the conversion process does not increase its chromatic number. The resulting IIG can then be colored in linear time.

Test

Alpaslan, E.; Huang, Y.; Lin, X.; Cheng, W.-T.; Dworak, J., "On Reducing Scan Shift Activity at RTL"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487476&isn...
Abstract: Power dissipation in digital circuits during scan-based test is generally much higher than during functional operation. Unfortunately, this increased test power can create hot spots that may damage the silicon, the bonding wires, and even the package. It can also cause intense erosion of conductors, severely decreasing the reliability of a device. Finally, excessive test power may also result in extra yield loss. To address these issues, this paper first presents a detailed investigation of a benchmark circuit's switching activity during different modes of operation. Specifically, the average number of transitions in the combinational logic of a benchmark circuit during scan shift is found to be approximately 2.5 times the average number of transitions during the circuit's normal functional operation. A design-for-test (DFT) based approach for reducing circuit switching activity during scan shift is then proposed. Instead of inserting additional logic at the gate level, which may introduce additional delay on critical paths, the proposed method modifies the design at the register transfer level (RTL) and uses the synthesis tools to automatically handle timing analysis and optimization.
Our experiments show that significant power reduction can be achieved with very low overhead.

Short Papers
============

Tannir, D.; Khazaka, R., "Computation of Intermodulation Distortion in RF Circuits Using Single-Tone Moments Analysis"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487468&isn...
Abstract: Obtaining the value of the third-order intercept point using traditional simulation techniques typically requires a nonlinear steady-state analysis with multitone inputs, which is very computationally expensive. In this paper, a new method is presented for the computation of the third-order intercept point. Using the proposed approach, the necessary Volterra kernels are computed directly from the harmonic balance equations, and the only computational cost is that of solving a set of sparse linear equations. Furthermore, only one input tone is required, which greatly reduces the size of the equations and thus the computational cost.

Tille, D.; Eggersglus, S.; Drechsler, R., "Incremental Solving Techniques for SAT-based ATPG"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487475&isn...
Abstract: Automatic test pattern generation (ATPG) based on the Boolean satisfiability (SAT) problem has recently been proven to be a beneficial complement to traditional methods, as efficient SAT techniques yield a robust fault classification. In this paper, we present methodologies to improve the efficiency of SAT-based ATPG. First, we give a detailed run time analysis of a state-of-the-art SAT-based ATPG tool. Then, by taking only circuit partitions into account and applying incremental SAT solving, both SAT instance generation and SAT instance solving can be accelerated, and the robustness of the ATPG process is increased. Besides the significant run time reduction of SAT-based ATPG, the methodology can additionally be used to improve test set quality. The proposed techniques are applied to both the stuck-at and the transition fault model.
A set of large industrial designs is used to show the efficiency of the approach.

Yang, M.-H.; Cho, H.; Kang, W.; Kang, S., "EOF: Efficient Built-In Redundancy Analysis Methodology With Optimal Repair Rate"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487466&isn...
Abstract: Repairing faulty cells with redundancy can improve memory yield. In particular, built-in redundancy analysis (BIRA) is widely used to enhance the yield of embedded memories. We propose an efficient BIRA algorithm that achieves the optimal repair rate with very short analysis time and low hardware cost. The proposed algorithm significantly reduces the number of backtracks in the exhaustive search: it uses early termination based on the number of orthogonal faulty cells, together with fault classification during fault collection. Experimental results show that the proposed BIRA methodology achieves the optimal repair rate with low hardware overhead and short analysis time compared to previous BIRA methods.

Pomeranz, I.; Reddy, S. M., "On Clustering of Undetectable Single Stuck-At Faults and Test Quality in Full-Scan Circuits"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487471&isn...
Abstract: We demonstrate that undetectable single stuck-at faults in full-scan benchmark circuits tend to cluster in certain areas. This implies that those areas may remain uncovered by a test set for single stuck-at faults. We describe an extension to the set of target faults aimed at providing better coverage of the circuit in the presence of undetectable single stuck-at faults. The extended set of target faults consists of double stuck-at faults that include an undetectable fault as one component; the other component is a detectable fault adjacent to the undetectable fault. We present experimental results of fault simulation and test generation for the extended set of target faults.

Xia, L.; Bell, I. M.; Wilkinson, A. J., "Automated Model Generation Algorithm for High-Level Fault Modeling"
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5487470&isn...
Abstract: High-level modeling of operational amplifiers (opamps) has previously been carried out successfully using models generated by published automated model generation approaches. Furthermore, high-level fault modeling (HLFM) has been shown to work reasonably well using manually designed fault models. However, there is no evidence that published automated model generation approaches for opamps have been used in HLFM. This paper describes HLFM for analog circuits using an adaptive self-tuning algorithm called the multiple model generation system using delta. The generation algorithms and the simulation models were written in MATLAB and in the hardware description language VHDL-AMS, respectively. The properties of these self-tuning algorithms were investigated by modeling complementary metal-oxide-semiconductor opamps and comparing HLFM simulations against transient-analysis simulations of the original simulation program with integrated circuit emphasis (SPICE) circuit. Results show that the models can handle both linear and nonlinear fault situations with better accuracy than previously published high-level fault models.
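To make the idea of high-level fault modeling concrete, a behavioral model of a circuit block can be simulated twice, once fault-free and once with a fault injected into a model parameter, and the outputs compared. The sketch below is a minimal Python illustration (the paper's flow uses MATLAB and VHDL-AMS); the single-parameter opamp model, the degraded-gain fault, and the detection threshold are all illustrative assumptions, not the paper's generated models.

```python
# High-level fault modeling sketch: inject a fault as a degraded
# open-loop gain in a behavioral opamp, then compare outputs.
# All models and numbers here are hypothetical illustrations.

V_SAT = 5.0  # assumed supply-limited output swing

def buffer_output(v_in, open_loop_gain):
    """Unity-gain buffer built from the behavioral opamp.
    Solving v_out = A * (v_in - v_out) gives v_out = A*v_in/(1+A),
    then the output is clipped to the supply rails."""
    v_out = open_loop_gain * v_in / (1.0 + open_loop_gain)
    return max(-V_SAT, min(V_SAT, v_out))

def is_fault_detected(v_in, faulty_gain, threshold=0.05):
    """Compare the faulty model against the fault-free one (A = 1e5)."""
    good = buffer_output(v_in, 1e5)
    bad = buffer_output(v_in, faulty_gain)
    return abs(good - bad) > threshold

if __name__ == "__main__":
    # A severe gain fault (A = 10) is visible at the buffer output,
    # while a mild one (A = 1e4) falls below this detection threshold.
    print(is_fault_detected(1.0, 10))   # True
    print(is_fault_detected(1.0, 1e4))  # False
```

The modeling challenge the paper addresses is generating and tuning such behavioral models automatically, so that faulty behavior tracks transistor-level simulation instead of a hand-written equation like the one above.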