ABSTRACT We propose MATEX, a distributed framework for transient simulation of power distribution networks (PDNs). MATEX uses a matrix exponential kernel with Krylov subspace approximations to solve the differential equations of linear circuits. First, the whole simulation task is divided into subtasks based on decompositions of the current sources, in order to reduce computational overhead. These subtasks are then distributed to different computing nodes and processed in parallel. Within each node, after the matrix factorization at the beginning of the simulation, the adaptive time-stepping solver runs without any further matrix re-factorizations. MATEX overcomes the stiffness limitation of previous matrix exponential-based circuit simulators by using a rational Krylov subspace method, which allows larger step sizes with smaller Krylov subspace bases and substantially accelerates the whole computation. MATEX outperforms both traditional fixed and adaptive time-stepping methods, e.g., achieving around 13X speedup over the trapezoidal framework with fixed time step on the IBM power grid benchmarks.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2015
ABSTRACT We propose an electrostatics-based placement algorithm for large-scale mixed-size circuits (ePlace-MS). ePlace-MS is generalized, flat, analytic and nonlinear. The density modeling method eDensity is extended to handle mixed-size placement. We conduct a detailed analysis of the correctness of the gradient formulation and the numerical solution, as well as the rationale for dc removal and the advantages over prior density functions. Nesterov's method is used as the nonlinear solver and shows high yet stable performance on mixed-size circuits. The steplength is set to the inverse of the Lipschitz constant of the gradient function, and we develop a backtracking method to prevent overestimation. An approximated nonlinear preconditioner is developed to minimize the topological and physical differences between large macros and standard cells. In addition, we devise a simulated annealer to legalize the layout of macros and use a second-phase global placement to reoptimize the standard cell layout. All of these innovations are integrated into our mixed-size placement prototype ePlace-MS, which outperforms all the related works in the literature in both quality and efficiency. Compared to the leading-edge mixed-size placer NTUplace3, ePlace-MS produces up to 22.98% and on average 8.22% shorter wirelength over all 16 modern mixed-size benchmark circuits with the same runtime.
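The steplength rule mentioned in the abstract above (step set to the inverse of a Lipschitz constant, with backtracking against overestimation) can be illustrated with a minimal sketch. This is not the ePlace-MS implementation: the function name `nesterov_lipschitz`, the parameters `step0`, `shrink_tol` and `max_backtracks`, and the use of the gradient difference between successive reference points as a local Lipschitz estimate are all assumptions made for illustration.

```python
import math


def _norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def nesterov_lipschitz(grad, x0, iters=100, step0=1e-3,
                       shrink_tol=0.95, max_backtracks=20):
    """Nesterov's method with step = 1 / (estimated local Lipschitz constant).

    grad: callable mapping a list of floats to its gradient (list of floats).
    All parameter names here are hypothetical, chosen for this sketch.
    """
    u_prev = list(x0)          # main iterate
    v_prev = list(x0)          # reference (look-ahead) iterate
    g_prev = grad(v_prev)
    a_prev = 1.0
    step = step0
    for _ in range(iters):
        a = 0.5 * (1.0 + math.sqrt(4.0 * a_prev * a_prev + 1.0))
        momentum = (a_prev - 1.0) / a
        est = step
        for _ in range(max_backtracks):
            # Gradient step from the reference point, then momentum extrapolation.
            u = [v - step * g for v, g in zip(v_prev, g_prev)]
            v = [ui + momentum * (ui - upi) for ui, upi in zip(u, u_prev)]
            g = grad(v)
            dv = _norm([p - q for p, q in zip(v, v_prev)])
            dg = _norm([p - q for p, q in zip(g, g_prev)])
            if dg == 0.0:
                est = step     # gradient unchanged: keep the current step
                break
            est = dv / dg      # inverse of the local Lipschitz estimate
            if est >= shrink_tol * step:
                break          # the step used was not an overestimate
            step = est         # overestimated: shrink and redo the move
        u_prev, v_prev, g_prev, a_prev = u, v, g, a
        step = est
    return u_prev
```

For example, `nesterov_lipschitz(lambda x: [2.0 * xi for xi in x], [3.0, -4.0])` minimizes the quadratic with gradient 2x. The inner loop only shrinks the step when the estimate obtained at the candidate point is noticeably smaller than the step that produced it, which is one simple way to read "prevent overestimation".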
Journal of Computational and Theoretical Nanoscience, 2011
Abstract: A physics-based yet computationally efficient core model for cylindrical undoped surrounding-gate (SRG) MOSFET current-voltage and capacitance-voltage prediction is presented in this paper. This model is based on the exact surface potential solution of ...
2013 International Conference on Communications, Circuits and Systems (ICCCAS), 2013
ABSTRACT This paper describes a wearable sensing system that monitors biopotentials via noncontact capacitive sensors suitable for long-term and ambulatory monitoring applications. To overcome the motion-induced measurement artifacts typically encountered in such systems, a motion artifact suppression technique is introduced. Specifically, a sensor consisting of a pair of physically interleaved capacitive channels is designed so that the channels have different amounts of parasitic input capacitance, creating channel-specific outputs that depend on the input coupling capacitance itself. The differences between the channel outputs can then be passed through a digital reconstruction filter to re-create the original biopotential with attenuated motion artifacts. To validate the system concept, a wireless ECG sensing system is designed. Simulation results indicate that motion-induced signal distortion is reduced by over 14X after reconstruction.
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013
ABSTRACT The floating random walk (FRW) algorithm is an important field-solver algorithm for capacitance extraction, with several merits compared with algorithms based on the boundary element method (BEM). In this paper, the FRW algorithm is accelerated with modern graphics processing units (GPUs). We propose an iterative GPU-based FRW algorithm flow and a technique using an inverse cumulative probability array (ICPA) to reduce the divergence among walks and the number of global-memory accesses. A variant FRW scheme is proposed to exploit the benefit of the ICPA, so that it accelerates the extraction of multi-dielectric structures. A technique for extracting multiple nets concurrently is also discussed. Numerical results show that our GPU-based FRW brings over 20X speedup over the CPU counterpart for various test cases with a 0.5% convergence criterion. For the extraction of multiple nets, our GPU-based FRW outperforms the CPU counterpart by up to 59X.
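The idea behind an inverse cumulative probability array can be sketched as follows. The abstract does not give the construction, so this is one plausible reading, not the paper's code: a discrete distribution is pre-tabulated so that drawing a sample becomes a single array lookup instead of a search through the cumulative distribution, which keeps every walk on the same code path. The names `build_icpa` and `sample_icpa` and the midpoint rule used to fill the table are assumptions.

```python
from bisect import bisect_right
from itertools import accumulate


def build_icpa(probs, table_size):
    """Tabulate the inverse CDF of a discrete distribution.

    Entry j holds the outcome index whose cumulative-probability interval
    contains the midpoint (j + 0.5) / table_size (hypothetical fill rule).
    """
    total = sum(probs)
    cdf = [c / total for c in accumulate(probs)]
    table = []
    for j in range(table_size):
        q = (j + 0.5) / table_size
        idx = bisect_right(cdf, q)               # first index with cdf > q
        table.append(min(idx, len(probs) - 1))   # guard against rounding at 1.0
    return table


def sample_icpa(table, u):
    """Map a uniform random number u in [0, 1) to an outcome with one lookup."""
    return table[int(u * len(table))]
```

With `build_icpa([0.5, 0.25, 0.25], 8)` the table has four entries for outcome 0 and two each for outcomes 1 and 2, so `sample_icpa` reproduces the distribution exactly. In general the table size trades memory for how finely the probabilities are resolved.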