
CN109791627B - Semiconductor device modeling for training deep neural networks using input preprocessing and transformation targets - Google Patents


Info

Publication number
CN109791627B
Authority
CN
China
Prior art keywords
neural network
input
drain current
transistor
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201880001122.9A
Other languages
Chinese (zh)
Other versions
CN109791627A (en)
Inventor
雷源
霍晓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hong Kong Applied Science and Technology Research Institute ASTRI
Original Assignee
Hong Kong Applied Science and Technology Research Institute ASTRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US16/011,787 (external priority: US11176447B2)
Application filed by Hong Kong Applied Science and Technology Research Institute ASTRI
Publication of CN109791627A
Application granted
Publication of CN109791627B

Landscapes

  • Insulated Gate Type Field-Effect Transistor (AREA)

Abstract

A semiconductor device modeling system and method based on a deep neural network. Training data is collected from measurements of test transistors, including gate and drain voltages, transistor width and length, and the drain current measured under those input conditions. The training data is transformed by an input preprocessing module, which may take the logarithm of the input data or perform principal component analysis (PCA). When training the deep neural network, the measured drain current is not used as the fitting target; instead, the transformed drain current produced by a target transformation module is used, where the transformation may be, for example, the derivative of the drain current with respect to the gate or drain voltage, or the logarithm of that derivative. The output of the deep neural network for the input data is compared with the transformed drain current to obtain a total error, and the network's weights are adjusted during training to steadily reduce that error until training is complete.

Description

Semiconductor Device Modeling Using Input Preprocessing and Transformed Targets for Training a Deep Neural Network

【Technical Field】

The present invention relates to semiconductor device modeling, and in particular to modeling devices using artificial neural networks.

【Background】

A single integrated circuit (IC) may contain a million transistors. Each transistor is typically a metal-oxide-semiconductor field-effect transistor (MOSFET), or a variant thereof, formed on a semiconductor substrate. During the IC design process, a netlist or schematic is created to detail the connections between these transistors and other components such as capacitors and resistors. The netlist can then be simulated with a circuit simulator, which uses a device model to simulate the operation of each transistor.

The device model estimates the electrical characteristics of a transistor, such as the drain current as a function of the gate and drain voltages. More accurate simulations can use more accurate models to estimate other parameters, such as parasitic capacitances, to better estimate delays and circuit timing.

An important simulator is the Simulation Program with Integrated Circuit Emphasis (SPICE), originally developed at the University of California, Berkeley in 1975. SPICE has since been extended and enhanced and has many variants. The Berkeley Short-channel IGFET Model (BSIM) is another model that is particularly well suited to small transistor devices.

Test circuits, such as transistors on an IC with test pads that can be probed manually, allow device engineers to probe and test these devices to measure current as a function of voltage. Using these test results, device engineers can determine device parameters for use in SPICE or BSIM device models and use those parameters to simulate larger-scale ICs. Although such manual measurements have been replaced by automated measurements, extracting device parameters for SPICE or BSIM models remains very time-consuming.

As device sizes shrink, basic first-order device models fail to accurately estimate the currents of smaller devices. Second-order effects caused by short channel lengths, buried layers, and sub-micron geometries require new parameters and more complex device-modeling equations. More test devices of different sizes and shapes must be added and tested to obtain values for these additional parameters. Automated measurement equipment allows device-model parameters to be extracted more quickly.

As device dimensions continue to shrink, devices with gate lengths of 10 nanometers or less present additional modeling challenges, because device dimensions approach the atomic dimensions of the semiconductor substrate. New semiconductor materials such as gallium nitride (GaN), gallium arsenide (GaAs), and silicon carbide (SiC) are being used, and their physical properties differ from those of silicon. Special devices such as fin field-effect transistors (FinFETs) and silicon-on-insulator (SOI) devices have three-dimensional current flow that cannot be modeled accurately with older two-dimensional current models. Measuring the actual currents of test devices of various sizes and shapes is critical to creating useful device models.

More recently, artificial neural networks (ANNs) have been used to generate device models and select model parameters. ANNs are particularly useful for processing large amounts of data in ways too complex to define with traditional computer programs. Instead of being programmed with instructions, the network is trained: training data is fed into the neural network and compared with the expected output, adjustments are made within the network, and the training data is processed and compared again to produce further adjustments. After many such training cycles, the trained neural network can effectively process data similar to the training data and expected outputs. A neural network is an example of machine learning, since the network learns how to generate the expected outputs for the training data. Actual data similar to the training data can then be fed into the neural network to process live data.

Figure 1 shows a prior-art neural network. Input nodes 102, 104, 106, 108 receive input data I1, I2, I3, ..., I4, while output nodes 103, 105, 107, 109 output the results of the network's operation as output data O1, O2, O3, ..., O4. This neural network contains three intermediate layers. First-layer nodes 110, 112, 114, 116, 118 each take inputs from one or more of input nodes 102, 104, 106, 108, perform some operation such as addition, subtraction, multiplication, or a more complex operation, and send their outputs to the second-layer nodes. Second-layer nodes 120, 122, 124, 126, 128, 129 likewise receive multiple inputs, combine them to generate outputs, and send those outputs to third-layer nodes 132, 134, 136, 138, 139, which similarly combine inputs and generate outputs.

The inputs to each layer are typically weighted, so a weighted sum (or other weighted result) is produced at each node. Each input to a node is assigned a weight; the input is multiplied by its weight, and the node then adds, multiplies, or otherwise combines all the weighted inputs to produce its output. These weights may be designated W31, W32, W33, ..., W41, and so on, and their values are adjusted during training. Through trial and error or other training procedures, paths that produce the expected output are eventually given higher weights, while paths that do not are assigned smaller weights. The machine learns which paths produce the desired output and assigns high weights to the inputs of those paths.
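To make the weighted-sum operation concrete, here is a minimal sketch in Python (an illustration, not from the patent; the three-input shape and the sigmoid activation are assumptions):

```python
import numpy as np

def node_output(inputs, weights, bias=0.0):
    """One node: weight each input, sum, and apply an activation."""
    z = np.dot(weights, inputs) + bias  # weighted sum of the node's inputs
    return 1.0 / (1.0 + np.exp(-z))     # sigmoid activation (one common choice)

# A node with three inputs and three trained weights.
x = np.array([0.5, -0.2, 0.8])
w = np.array([0.9, 0.1, -0.4])          # values adjusted during training
print(node_output(x, w))
```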

These weights can be stored in weight memory 100. Since neural networks typically have many nodes, many weights are stored in weight memory 100. Each weight may require multiple binary bits to represent the range of its possible values. Weights typically require 8 to 16 bits.

Figure 2 shows a transistor device model. A gate voltage Vg and a drain voltage Vd are applied to the transistor, while the source voltage Vs is usually grounded. The substrate or body voltage Vb may be grounded or set to another voltage, such as a reverse bias. The device model uses various parameters to predict the drain-to-source current Ids, which is a function of Vg, Vd, Vb, and Vs. Other inputs such as temperature T, gate width W, and gate length L also affect the predicted drain current, especially when L is very small.
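For context only, a textbook first-order square-law model predicts Ids from the same inputs listed above; this is standard device physics rather than the patent's model, and the parameter values (k, Vt) are illustrative assumptions:

```python
def square_law_ids(vgs, vds, w, l, k=2e-4, vt=0.5):
    """First-order NMOS drain current; k is mu*Cox (A/V^2), vt is the threshold (V)."""
    if vgs <= vt:
        return 0.0                               # cutoff (ignores subthreshold current)
    if vds < vgs - vt:                           # triode region
        return k * (w / l) * ((vgs - vt) * vds - vds**2 / 2)
    return 0.5 * k * (w / l) * (vgs - vt)**2     # saturation region

print(square_law_ids(vgs=1.2, vds=1.0, w=1e-6, l=0.1e-6))
```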

Figure 3 shows the overfitting problem when modeling a device. Measurement data 204, 206, 208 are input into a neural network to produce the model parameters that best fit them. Model current 202 is the best-fit model of measurement data 204, 206, 208. Measurement data 206, 208 are two outlier data points, possibly the result of some measurement error. Using a neural network to fit all data points, including outliers 206, 208, causes model current 202 to spike down to outlier 208, rise sharply to outlier 206, and then drop back down to measurement data 204. The outliers 206, 208 give model current 202 a negative conductance around the outlier points. Moreover, model current 202 beyond measurement data 204 may be unreliable and prone to erratic behavior; its scalability is poor.

Figures 4A-4B show poor modeling near the model's minimum. In Figure 4A, the drain current is plotted as a function of drain-source voltage. Training data 214 and test data 212 are measured data points; training data 214 is input to the neural network to generate weight values, while test data 212 is used to test the accuracy of those weights. Curves 218 are models generated for different input gate voltages Vg. Although the accuracy of model 218 appears good at larger currents, as shown in Figure 4A, it is poor at smaller currents, as shown in Figure 4B. Model 218 does not converge at the origin (0,0), but only near it. Model 218 fails for voltages and currents in the subthreshold region.

Figure 5 shows training a neural network to produce a device model using the drain current as the fitting target. Test transistors are measured, and the input voltages, temperature, channel width W, and channel length L are recorded as training data 34, while the measured drain current Ids is recorded as fitting-target data 38 corresponding to each combination of inputs. Neural network 36 receives training data 34 and a current set of weights and operates on the training data to produce a result. The result produced by neural network 36 is compared with fitting-target data 38 by loss function 42, which produces a loss value indicating the error between the produced result and the fitting target. The loss value produced by loss function 42 is used to adjust the weights applied to neural network 36. With loss function 42 applied to training data 34, the weights are iterated many times until a minimum loss value is found, and the final set of weights is used for the transistor model.

A device model is expected to be accurate over a wide range, from the subthreshold to the strong-inversion region. However, using a neural network leads to the overfitting problem of Figure 3 and the subthreshold-accuracy problem of Figure 4B. In addition, some circuit simulators use derivatives or slopes of the model, such as the conductance (Gds) and transconductance (Gm), but model-convergence problems can distort the extracted Gds and Gm values. The first derivative of the model may have poor accuracy, and monotonicity may be poor. To avoid overfitting and poor monotonicity, it may be necessary to limit the size of the hidden layers, which makes deep neural networks difficult to use. Shallow neural networks, however, cannot be applied to more complex models if an accurate model is still desired.

What is desired is a device model for semiconductor integrated circuits (ICs) that accurately models current over a wide range, including the subthreshold region. A device model produced by a neural network, but without the overfitting problem, is desirable, as is a device model that can accurately model conductance (Gds) and transconductance (Gm) values.

【Brief Description of the Drawings】

Figure 1 shows a prior-art neural network.

Figure 2 shows a transistor device model.

Figure 3 shows the overfitting problem when modeling a device.

Figures 4A-4B show models with poor accuracy near the minimum.

Figure 5 shows training a neural network using the drain current as the target to produce a device model.

Figure 6 is a schematic diagram of an artificial neural network that transforms the semiconductor drain current.

Figure 7 shows a deep neural network that operates on preprocessed inputs and adjusts its weights using a loss function with a transformed target.

Figure 8 shows how using the transformed drain current as the fitting target of a deep neural network solves the overfitting problem in device modeling.

Figures 9A-9B show that using the transformed drain current as the fitting target enables a deep neural network to better model subthreshold transistors.

Figure 10 shows a transistor simulator whose model and parameters are obtained from a deep neural network that operates on preprocessed inputs with the transformed drain current as the fitting target.

【Detailed Description】

The present invention relates to improvements in modeling semiconductor devices using artificial neural networks. The following description enables one of ordinary skill in the art to make and use the invention as provided in the context of a particular application and its requirements. Various modifications to the preferred embodiment will be apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Figure 6 is a schematic diagram of an artificial neural network that uses the transformed drain current of a semiconductor device as the fitting target. Although the drain current is approximately linear within certain ranges, it may be nonlinear over larger ranges. The inventors recognized that, unlike the first-order models commonly used for transistors, the drain current is nonlinear. The inventors believe that using the drain current itself as the neural network's fitting target produces subthreshold-accuracy problems, overfitting, poor monotonicity, and convergence problems.

The inventors have found that transforming the target aids model generation and improves the resulting model's accuracy. Instead of using the drain current Ids as the target, the drain current is transformed, and loss function 42 uses the transformed drain current to generate the loss and adjust the weights.

The measured drain current in target data 38 is converted by target transformation module 44 into a transformed drain current X_Ids. Target transformation module 44 transforms the drain current by differentiation: the transformation may be the derivative of the drain current with respect to the gate voltage, the drain voltage, the substrate voltage, or the transistor's dimensions or temperature. Taking the logarithm of such a derivative is another transformation. Loss function 42 computes the total difference between the neural network's output and its expected output, which is the transformed drain current X_Ids. The weights applied to deep neural network 50 are adjusted by an optimization algorithm, such as stochastic gradient descent, so that the total difference computed by loss function 42 becomes smaller and smaller. The difference computed by loss function 42 between the transformed drain current and the network output may be smaller than the difference between the raw drain current and the network output.

Training data 34 includes the drain-to-source voltage Vds, the gate-to-source voltage Vgs, the substrate-to-source voltage Vbs, the temperature, the transistor channel width W and length L, and the drain-to-source current Ids measured under those conditions. These input voltages and conditions are processed by input preprocessing module 40 to generate preprocessed input data, which serves as the input to deep neural network 50.

Input preprocessing module 40 can apply various kinds of preprocessing to training data 34, such as taking the natural logarithms of the input voltages: ln(Vgs), ln(Vds). Input preprocessing module 40 can also perform principal component analysis (PCA) on training data 34 to obtain the principal variables that most affect the transformed drain current. PCA detects which input variables most influence the drain current, and can use the eigenvectors of the covariance matrix to reduce the dimensionality of the variables.
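A minimal sketch of both preprocessing options, assuming the training data is an n-samples-by-d-features NumPy array with strictly positive columns and at least four columns (the column layout and component count are assumptions):

```python
import numpy as np

def preprocess(X, n_components=4):
    """Log-transform the inputs, then reduce dimensionality with PCA."""
    X = np.log(X)                           # e.g. ln(Vgs), ln(Vds); assumes positive inputs
    Xc = X - X.mean(axis=0)                 # center the data before PCA
    cov = np.cov(Xc, rowvar=False)          # covariance matrix of the inputs
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvectors of the covariance matrix
    order = np.argsort(eigvals)[::-1]       # strongest principal components first
    components = eigvecs[:, order[:n_components]]
    return Xc @ components                  # project onto the principal variables
```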

Deep neural network 50 may be a neural network created by an engineer for a specific application, or a general-purpose neural network tuned for that application. For example, the number of intermediate or hidden layers can be adjusted for a particular application, and the types of operations performed in the nodes and the connections between nodes can be tuned for certain applications or problems, such as semiconductor device modeling. Whereas the drain current Ids is often modeled with a shallow neural network, deep neural network 50 contains at least five intermediate layers. A deep neural network 50 with more intermediate layers allows better modeling of second-order effects, such as buried layers in semiconductor devices and three-dimensional current flow in complex devices.
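A network of this shape can be sketched in plain NumPy as below; the five-hidden-layer structure follows the text, while the layer width of 32 and the tanh activation are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_network(n_inputs, hidden=(32, 32, 32, 32, 32), n_outputs=1):
    """Random initial weights in (-1.0, 1.0) for a five-hidden-layer network."""
    sizes = (n_inputs,) + hidden + (n_outputs,)
    return [rng.uniform(-1.0, 1.0, size=(a, b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    """Forward propagation: each layer applies its weights, then tanh."""
    for W in layers[:-1]:
        x = np.tanh(x @ W)
    return x @ layers[-1]                  # linear output layer
```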

The initial weights used in deep neural network 50 can be set to random values within an initial range such as (-1.0 to 1.0). Training data 34 is preprocessed by input preprocessing module 40, the preprocessed data is input to deep neural network 50 during training, and the network's output is then evaluated.

One way to evaluate the quality of deep neural network 50's output is to compute a loss. Loss function 42 computes a loss value that measures how close the current cycle's output is to the expected result. A mean squared error (MSE) is produced by squaring each output difference (the difference between an output and its expected value) and averaging or summing these squares over all outputs.

The goal of training is to find weight values that make the network output (the prediction) equal or close to the fitting target (the data). This process is complex, so the weights cannot be computed mathematically; instead, the computer learns them from the data. After preprocessing and target transformation, the data is split into inputs and fitting targets. The weights are first set to random initial values. When an input vector is presented to the neural network, its values propagate forward through the network layer by layer until they reach the output layer. The network's output is then compared with the fitting target using a loss function. For a single output, the loss is (1/2)|y - y'|^2, where y is the network output and y' is the fitting target. For n input samples, the loss E is the average of the individual losses: E = (1/(2n)) * Σ|y - y'|^2. After the loss over the n inputs is determined, an optimization algorithm adjusts the weights to minimize it. The optimization algorithm repeats a two-phase cycle: forward computation (propagation) and weight update. The forward computation calculates the total loss; in the second phase, an optimization method such as gradient descent updates the weights to try to reduce the loss. These cycles stop when the total loss falls below an acceptable value.
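The two-phase cycle can be shown with a deliberately tiny stand-in model. This sketch uses a one-layer linear "network" and finite-difference gradients instead of backpropagation, purely so the forward-computation and weight-update phases stay visible; the data, learning rate, and stopping threshold are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(w, X, y_target):
    """E = (1/(2n)) * sum |y - y'|^2 for a tiny linear model y = X @ w."""
    return np.sum((X @ w - y_target) ** 2) / (2 * len(X))

# Toy data: 64 samples, 3 preprocessed inputs, one transformed target each.
X = rng.normal(size=(64, 3))
y_target = X @ np.array([0.4, -0.7, 0.2])       # mapping the training should recover

w = rng.uniform(-1.0, 1.0, size=3)              # random initial weights
lr, eps = 0.1, 1e-6
while loss(w, X, y_target) > 1e-8:              # stop at an acceptably low loss
    base = loss(w, X, y_target)                 # phase 1: forward computation
    grad = np.array([(loss(w + eps * np.eye(3)[i], X, y_target) - base) / eps
                     for i in range(3)])
    w -= lr * grad                              # phase 2: weight update (gradient descent)
print(w)                                        # close to [0.4, -0.7, 0.2]
```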

The loss produced by loss function 42 can also include a complexity loss, which includes a weight-decay term (to prevent overfitting as the weight range is adjusted) and a sparsity term (to improve the structure and regularity within deep neural network 50). The complexity loss prevents the model from overfitting, for example to the anomalous measurements 206, 208 (Figure 3), since a model that fits the anomalous measurements 206, 208 is more complex than a smoother model that excludes them.
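A hedged sketch of such a combined loss, with a squared (L2) weight-decay term and an absolute-value (L1) sparsity term; the scaling factors are assumptions:

```python
import numpy as np

def total_loss(y, y_target, weights, decay=1e-4, sparsity=1e-5):
    """Accuracy loss plus a complexity loss (weight decay and sparsity terms)."""
    accuracy = np.mean((y - y_target) ** 2) / 2                    # fit to transformed target
    weight_decay = decay * sum(np.sum(W ** 2) for W in weights)    # keeps weights from growing
    sparse = sparsity * sum(np.sum(np.abs(W)) for W in weights)    # pushes weights toward zero
    return accuracy + weight_decay + sparse
```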

Both the accuracy loss and the complexity loss can be generated as part of loss function 42, which adjusts the weights for the next training cycle. The updated weights are applied to deep neural network 50, the training data 34 preprocessed by input preprocessing module 40 is input to the network again, and the network produces new outputs, for which loss function 42 generates a new set of loss values by comparing the outputs with the transformed drain current. The weights are adjusted and the loss recomputed over many cycles until the desired endpoint is reached. The final weights applied to deep neural network 50 can then be used to build the final device model.

Figure 7 shows a deep neural network that operates on preprocessed inputs and adjusts its weights using a loss function with a transformed fitting target. Training data 34 contains the measured input conditions, including the gate, drain, and body voltages, the temperature, and the transistor dimensions. The preprocessed data generated by input preprocessing module 40 from training data 34 is input to deep neural network 50. These preprocessed inputs can include the logarithm of the drain-source voltage, the logarithm of the gate-source voltage, and other inputs selected or combined by principal component analysis (PCA).

Deep neural network 50 operates on the preprocessed input data from input preprocessing module 40 using a set of weights adjusted by loss function 42 through an optimization algorithm. Deep neural network 50 can operate in the forward direction, in which each set of preprocessed inputs is operated on by each node (with weights between -1 and 1) within the network to produce one or more outputs, which loss function 42 analyzes to generate a new set of weights. The new weights are passed back to deep neural network 50, so that while the network's inputs remain fixed, its output changes as the weights change.

Loss function 42 does not compare the neural network output with the measured drain current. Instead, the drain current measured in target data 38 is converted by target transformation module 44 into a transformed drain current, and loss function 42 compares the transformed drain current with the network output.

As shown in Figure 7, the transformed drain current produced by target transformation module 44 and compared by loss function 42 with the output of deep neural network 50 can be a derivative of the drain current. These transformations can include the derivative of the drain current with respect to the drain-source voltage, d(Ids)/d(Vds); with respect to the gate-source voltage, d(Ids)/d(Vgs); with respect to the substrate-source voltage, d(Ids)/d(Vbs); with respect to temperature, d(Ids)/dT; with respect to the transistor channel length, d(Ids)/dL; and with respect to the transistor channel width, d(Ids)/dW.

The transformation of the drain current by target transformation module 44 can also be the logarithm of any of these derivatives, for example the natural logarithm of the derivative of the drain current with respect to the drain-source voltage, ln[d(Ids)/d(Vds)], or the logarithm of the derivative with respect to the gate-source voltage, ln[d(Ids)/d(Vgs)], and so on.
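One of these transformations, ln[d(Ids)/d(Vgs)], can be computed numerically from a measured sweep roughly as follows; the synthetic subthreshold-like sweep stands in for measured data, and the positive-derivative assumption is noted in the code:

```python
import numpy as np

def transform_target(ids, vgs):
    """Turn measured Ids samples (swept over Vgs) into ln[d(Ids)/d(Vgs)]."""
    d_ids = np.gradient(ids, vgs)   # numerical derivative along the Vgs sweep
    return np.log(d_ids)            # valid only where the derivative is positive

# Synthetic subthreshold-like sweep standing in for measured target data.
vgs = np.linspace(0.1, 0.5, 50)
ids = 1e-9 * np.exp(vgs / 0.06)
x_ids = transform_target(ids, vgs)  # fitting target X_Ids for the network
```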

Using a deep neural network 50 with more layers (a deeper network) allows more accurate modeling of features that may appear in sub-10-nm transistors and three-dimensional transistors such as fin field-effect transistors (FinFETs) and silicon-on-insulator (SOI) devices. Deep neural network 50 provides a framework for modeling more complex future semiconductor processes.

Figure 8 shows how using the transformed drain current as the fitting target of a deep neural network solves the overfitting problem in device modeling. Measurement data 204, 206, 208 are preprocessed by input preprocessing module 40 and input into deep neural network 50; the network's output is compared with the transformed drain current by loss function 42 to produce the model parameters that best fit measurement data 204, 206, 208. Modeled current 302 is the best-fit model of measurement data 204, 206, 208.

Measurement data 206, 208 are two outlier data points, possibly the result of some error or measurement mistake. The drain-current values at data points 206, 208 do not differ significantly from the drain currents of the other measurement data 204. Taking the derivative of the drain current, however, amplifies the error produced by loss function 42: although the current values of the data points do not differ much, the slope of the line connecting them jumps sharply near the poorly fitting points 206, 208 (see line 202 in Figure 3). Using the transformed drain current as the fitting target therefore magnifies the error produced by loss function 42. Regularization may have little effect on the small errors in the raw drain current, but acts strongly on the fitting-target errors magnified by differentiation.

Despite the anomalous measurements 206, 208, a smooth model 302 is obtained. There is no negative conductance around the outlier data points. Moreover, modeled current 302 beyond measurement data 204 is reliable, and its scalability is good.

Figures 9A-9B show that using the transformed drain current as the fitting target enables the deep neural network to better model the subthreshold region.

In Figure 9A, the measured drain current is plotted as a function of drain-source voltage. Training data 214 and test data 212 are measured data points; training data 214 is preprocessed and input to deep neural network 50 to generate the weights, while test data 212 is transformed by target transformation module 44 and used to test the accuracy of the network's weights.

Models 308 are generated for different gate voltages Vg. The accuracy of model 308 is good for the larger currents in Figure 9A, and also for the smaller currents in Figure 9B. Model 308 converges at the origin (0,0) and is valid for voltages and currents in the subthreshold region.

By targeting the transformed drain current rather than the drain current itself, model 308 can be derived to cover a wider range of input voltages. The model is more accurate in the subthreshold region and converges at the origin.

Figure 10 shows a transistor simulator whose model and parameters are obtained from a deep neural network that operates on preprocessed inputs with the transformed drain current as the fitting target. The set of weights obtained when the output of deep neural network 50 best matches the transformed drain current is the final set of weights, which is applied to deep neural network 50. Simulation input data 54 includes the voltages, temperature, and transistor width and length entered by the design engineer during circuit design. Simulation input data 54 is operated on by input preprocessing module 40 to obtain preprocessed inputs X1, X2, X3, X4, .... For example, the logarithms of the gate and drain voltages can be produced by input preprocessor 40 and input to deep neural network 50.

Deep neural network 50 produces the transformed drain current X_Ids from the preprocessed inputs and the final weights. Inverse target transformation module 58 performs the inverse of the transformation performed by target transformation module 44 (Figure 6). For example, when target transformation module 44 produces the derivative of the drain current with respect to the gate voltage, inverse target transformation module 58 integrates the transformed drain current over the gate voltage.

Inverse target transformation module 58 can integrate using Riemann sums, Newton-Cotes formulas, or linear or polynomial interpolation. When target transformation module 44 takes the logarithm of the derivative with respect to the drain voltage, inverse target transformation module 58 first exponentiates the transformed drain current and then integrates. Inverse target transformation module 58 reproduces the drain-current value as it was before transformation by target transformation module 44; this is the simulated drain current predicted by simulator 60.
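A sketch of the inverse transformation for the ln[d(Ids)/d(Vgs)] case, using a running trapezoid rule (one of the Newton-Cotes options mentioned above); the integration constant, the current at the start of the sweep, is assumed known:

```python
import numpy as np

def inverse_transform(x_ids, vgs, ids_at_start=0.0):
    """Undo ln[d(Ids)/d(Vgs)]: exponentiate, then integrate over the Vgs sweep."""
    d_ids = np.exp(x_ids)                                # inverse of the logarithm step
    steps = (d_ids[1:] + d_ids[:-1]) / 2 * np.diff(vgs)  # trapezoid area per interval
    return np.concatenate(([ids_at_start], ids_at_start + np.cumsum(steps)))
```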

During simulation, loss function 42 and target transformation module 44 are not needed, because the final weights are held constant and are not adjusted. Simulator 60 can be built from input preprocessor 40, deep neural network 50, and inverse target transformation module 58. The size of deep neural network 50 can be reduced, for example by deleting nodes whose weights are zero. When used inside simulator 60, deep neural network 50 can be a feed-forward network. In a larger circuit simulator, the SPICE, BSIM, or other device models can be replaced by simulator 60. For each transistor in the simulation, the larger circuit simulator generates the input voltages, looks up the transistor's W and L, reads the specified simulation temperature, and then calls simulator 60 to simulate that transistor's drain current.

By using deep neural network 50 with target transformation module 44 and input preprocessing module 40, the time and manpower needed to develop a device model can be significantly reduced. Measurement data can be obtained from devices on test chips fabricated with a new process and used as training data 34 (Vgs, Vds, ...) and fitting-target data 38 (Ids). Deep neural network 50 can operate forward and backward to adjust the weights until loss function 42 finds an acceptably low loss or a minimum. The final weights can then be applied to deep neural network 50, which, together with input preprocessing module 40 and inverse target transformation module 58, simulates transistors fabricated with the new process.

【Other Embodiments】

Several other embodiments are also contemplated by the inventors. For example, target transformation module 44, input preprocessor 40, and inverse target transformation module 58 may share the same computing hardware, or each may have dedicated hardware. The transformed drain current can be a derivative of the drain current, the logarithm of such a derivative, the conductance Gds, the transconductance Gm, the logarithm of the conductance or transconductance, or another transformation of the drain current. Several of these transformations can be tested to find the one that yields the lowest loss when used as the target. Similarly, input preprocessor 40 can preprocess some or all of the inputs in various ways. The logarithms can be natural logarithms, base-10 logarithms, or logarithms with other bases. Various combinations of transformation or preprocessing functions can also be substituted.

Some embodiments may not use all components, and other components can be added. Loss function 42 can use various error/loss generators, such as a weight-decay term that prevents the weights from growing too large over many training-optimization cycles, or a sparsity penalty that encourages nodes to zero their weights so that only a small fraction of the nodes is effectively used; that remaining small fraction of nodes is the most relevant. Although various loss and cost functions have been described, many alternatives, combinations, and variations are possible, and other kinds of loss or cost terms can be added to loss function 42. The relative scaling factors of the different cost functions can be adjusted to balance their effects.

Floating-point values can be converted to fixed-point or binary values. Although binary-valued weights have been shown, various encodings can be used, such as two's complement, Huffman coding, or truncated binary coding. The number of binary bits required to represent a weight value refers to the number of bits regardless of the encoding method, whether binary, Gray-code, fixed-point, offset, or the like.

Weights can be limited to a range of values. Although a range of -1 to 1 has been described, the range does not necessarily have to include 0, such as a range of 512 to 1. Weight values can be offset to fit a binary range; for example, weights ranging from 10511 to 10000 can be stored as 9-bit binary words, with an offset of 10000 added to the stored word to produce the actual weight value. The range can be adjusted during optimization, and the offset can be stored or hard-wired into the logic of deep neural network 50.
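The offset scheme in this paragraph can be illustrated directly; the encode/decode pair below follows the 10000-to-10511 example, where the 512 possible values fit in a 9-bit word:

```python
def encode_weight(w, offset=10000, bits=9):
    """Store a weight in [10000, 10511] as a 9-bit word by removing the offset."""
    word = w - offset
    assert 0 <= word < 2 ** bits    # 9 bits cover the 512 values in the range
    return word

def decode_weight(word, offset=10000):
    return word + offset            # add the offset back to recover the weight

print(decode_weight(encode_weight(10250)))  # -> 10250
```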

Many variations are possible for the training routines that run on deep neural network 50. Optimization can first determine the number of hidden or intermediate-layer nodes and then proceed to optimize the weights. The arrangement or connectivity of nodes can be determined by zeroing some weights to cut the connections between nodes. A sparsity cost can be used in the initial optimization cycles while the structure is being optimized, but omitted in later cycles when the weight values are being fine-tuned. A sigmoid function can be used to train the hidden layers within deep neural network 50, and lookup tables can be used to implement more complex functions instead of an arithmetic logic unit (ALU), to speed up processing. The activation function may differ from node to node, e.g., sigmoid, tanh, or relu.
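A lookup-table activation might look like the following sketch; the table size and input range are assumptions:

```python
import numpy as np

SIG_X = np.linspace(-8.0, 8.0, 1024)   # precomputed input grid
SIG_Y = 1.0 / (1.0 + np.exp(-SIG_X))   # sigmoid values stored in the table

def sigmoid_lut(z):
    """Approximate the sigmoid by linear interpolation into the table."""
    return np.interp(z, SIG_X, SIG_Y)  # clamps outside [-8, 8], where sigmoid is ~0 or ~1

print(sigmoid_lut(np.array([-2.0, 0.0, 2.0])))
```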

The loss-reduction process may differ for different applications and training data. A wide variety of structures, with different numbers and arrangements of hidden layers, can be used for deep neural network 50. For modeling a particular application and semiconductor process, a suitable neural network may be a specific network, a deep neural network 50 with a specific structure, or a general-purpose neural network used as the starting point for optimization. Deep neural network 50 can have at least seven intermediate layers and at least ten thousand weights.

Autoencoders, automax and softmax classifiers, and other kinds of layers can be inserted into the neural network. The entire optimization process can be repeated several times, for example with different initial conditions, such as different numbers of bits for quantizing floating-point values or other parameters, different precisions, different scaling factors, etc. Endpoints can be set for various combinations of conditions, such as a desired final accuracy, an accuracy-hardware-cost product, a target hardware cost, etc.

Although the actual cost of deep neural network 50 depends on many factors, such as the number of nodes, the weights, the interconnect, control, and interfaces, the inventors approximate the cost as proportional to the aggregate of the weights. The total number of binary bits used to represent all the weights in deep neural network 50 is a measure of the hardware cost, even if only an approximation. The gradient or slope of the hardware-complexity cost can be used, and gradient values can be scaled and altered before or after comparison.

IC semiconductor manufacturing processes can vary in many ways. Photomasks can be made with various specialized machines and processes, including direct writing to burn off a metallization layer rather than a photoresist. Many combinations of diffusion, oxide growth, etching, deposition, ion implantation, and other manufacturing steps can produce the resulting patterns on an IC controlled by the photomasks. Although modeling transistors, and drain currents in particular, has been described, other currents such as diode currents and substrate leakage currents can be modeled, and the method can be used to model other devices such as capacitors and resistors.

Deep neural network 50, loss function 42, target transformation module 44, inverse target transformation module 58, and other components can be implemented with various combinations of software, hardware, firmware, routines, modules, functions, and the like, and in various technologies. The final product, deep neural network 50 with its final weights together with input preprocessing module 40 and inverse target transformation module 58, can be implemented in an application-specific integrated circuit (ASIC) or other hardware to increase processing speed and lower power consumption when simulator 60 is used to simulate large circuits.

The Background section of this application may contain background information about the problem or environment of the invention rather than describe the prior art of others. Accordingly, the inclusion of material in the Background section is not an admission of prior art by the applicant.

Any methods or processes described herein are machine-implemented or computer-implemented and are intended to be performed by a machine, computer, or other device, not solely by a human without such machine assistance. Tangible results generated can include reports or other machine-generated displays on display devices such as computer monitors, projection devices, audio-generating devices, and related media devices, and can include hard-copy printouts that are also machine-generated. Computer control of other machines is another tangible result.

Any advantages and benefits described may not apply to all embodiments of the invention. When the word "means" is recited in a claim element, the applicant intends for the claim element to fall under 35 USC Sect. 112, paragraph 6. The word or words preceding "means" are a label intended to ease referencing of claim elements and are not intended to convey a structural limitation. Such means-plus-function claims are intended to cover not only the structures described herein for performing the function and their structural equivalents, but also equivalent structures. For example, although a nail and a screw have different structures, they are equivalent structures since they both perform the function of fastening. Claims that do not use the word "means" are not intended to fall under 35 USC Sect. 112, paragraph 6. Signals are typically electronic signals but may be optical signals, such as signals carried over a fiber-optic line.

The foregoing description of the embodiments of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto.

Claims (18)

1.一种半导体器件建模系统,包括:1. A semiconductor device modeling system, comprising: 包含多个节点的深度神经网络,每个节点使用权重缩放其输入,产生的节点输出传输到所述多个节点中的其他节点;a deep neural network comprising a plurality of nodes, each node scaling its input using a weight, the resulting node output is transmitted to other nodes in the plurality of nodes; 输入模块,用于接收训练数据,所述训练数据表示晶体管的栅极电压和漏极电压,所述训练数据还包括所述晶体管的宽度和长度;an input module for receiving training data, the training data representing the gate voltage and the drain voltage of the transistor, the training data further including the width and length of the transistor; 输入预处理模块,其从所述输入模块接收所述训练数据,所述输入预处理模块将所述训练数据转换为预处理后的训练数据;an input preprocessing module that receives the training data from the input module, the input preprocessing module converts the training data into preprocessed training data; 其中所述预处理后的训练数据被应用于所述深度神经网络的输入;wherein the preprocessed training data is applied to the input of the deep neural network; 目标输入模块,其接收目标数据,所述目标数据表示,当所述训练数据表示的栅极电压和漏极电压被施加到所述晶体管时、测量的所述晶体管的漏极电流,而所述晶体管具有所述训练数据表示的晶体管宽度和晶体管长度;a target input module that receives target data representing the measured drain current of the transistor when the gate and drain voltages represented by the training data are applied to the transistor, and the a transistor having a transistor width and a transistor length represented by the training data; 目标转换模块,其从所述目标输入模块接收所述训练数据,所述目标转换模块将表示漏极电流的目标数据转换为一个转换漏极电流值;a target conversion module that receives the training data from the target input module, the target conversion module converts target data representing drain current into a converted drain current value; 损失函数产生模块,其根据所述预处理后的训练数据并使用所述权重,将所述目标转换模块产生的经过转换的漏极电流与所述深度神经网络产生的输出进行比较,其中所述损失函数产生模块通过调整所述神经网络的权重,将所述转换漏极电流值与所述深度神经网络产生的输出之间的损失函数值最小化;a loss function generation module that compares the transformed drain current generated by the target transformation module with the output generated by the deep neural network based on the preprocessed training data and using the weights, wherein the The loss function generation module minimizes the loss function value between the converted drain current value and the output generated by the deep neural network by adjusting the weight of the neural network; 其中在所述深度神经网络的训练期间输入并处理多个所述训练数据和多个所述目标数据,以产生多个权重和损失函数值;wherein a plurality of the training data and a plurality of the target data are input and processed during training of the deep neural network to generate a plurality of weights and loss function values; 其中在完成训练之后,选择最后一组权重,所述最后一组权重产生最小损失函数值;Wherein, after the training is completed, the last set of weights is selected, and the last set of weights generates the minimum loss function value; 其中所述最后一组权重界定了所述晶体管的器件模型,所述晶体管使用所述深度神经网络进行仿真;wherein the last set of weights defines a device model of the transistor that is simulated using the deep neural network; 其中所述目标转换模块将表示漏极电流的训练数据转换为所述漏极电流的导数;wherein the target conversion module converts the training data representing the drain current into a derivative of the drain current; 其中所述漏极电流的导数是所述转换漏极电流值,也是损失函数产生模块评估的所述深度神经网络的训练目标。The derivative of the drain current is the transformed drain current value, and is also the training target of the deep neural network evaluated by the loss function generation module. 2.根据权利要求1所述的半导体器件建模系统,其中所述漏极电流的导数是有关栅极电压的导数。2. The semiconductor device modeling system of claim 1, wherein the derivative of the drain current is a derivative with respect to the gate voltage. 3.根据权利要求1所述的半导体器件建模系统,其中所述漏极电流的导数是有关漏极电压的导数。3. 
The semiconductor device modeling system of claim 1, wherein the derivative of the drain current is a derivative with respect to the drain voltage. 4.根据权利要求1所述的半导体器件建模系统,其中所述漏极电流的导数是有关晶体管尺寸的导数。4. The semiconductor device modeling system of claim 1, wherein the derivative of the drain current is a derivative with respect to transistor size. 5.根据权利要求1所述的半导体器件建模系统,其中所述训练数据还包括温度:5. The semiconductor device modeling system of claim 1, wherein the training data further comprises temperature: 其中所述漏极电流的导数是有关所述温度的导数。wherein the derivative of the drain current is a derivative with respect to the temperature. 6.根据权利要求1所述的半导体器件建模系统,其中所述目标转换模块将表示漏极电流的训练数据转换为所述漏极电流的导数的对数;6. The semiconductor device modeling system of claim 1, wherein the target conversion module converts training data representing drain current into a logarithm of a derivative of the drain current; 其中所述漏极电流的导数的对数是所述转换漏极电流值,也是所述损失函数产生模块评估的所述深度神经网络的训练目标。wherein the logarithm of the derivative of the drain current is the transformed drain current value and is also the training target of the deep neural network evaluated by the loss function generation module. 7.根据权利要求1所述的半导体器件建模系统,其中在仿真过程中,将所述最后一组权重应用于所述深度神经网络,表示仿真电压的仿真训练数据由所述输入预处理模块进行预处理、继而由所述深度神经网络处理,通过使用最终的权重值产生仿真输出结果;7. The semiconductor device modeling system of claim 1 , wherein during simulation, the last set of weights is applied to the deep neural network, and simulation training data representing simulated voltages is provided by the input preprocessing module performing preprocessing and then processing by the deep neural network to generate a simulation output by using the final weight values; 反向目标转换模块,其在仿真期间从所述深度神经网络收到所述仿真输出,并产生漏极电流的仿真值,所述反向目标转换模块的操作是所述目标转换模块的逆转换操作;an inverse target conversion module that receives the simulation output from the deep neural network during simulation and generates a simulated value of drain current, the operation of the inverse target conversion module is the inverse conversion of the target conversion module operate; 由此,使用所述最后一组权重从所述深度神经网络产生漏极电流的仿真值。Thus, simulated values of drain current are generated from the deep neural network using the last set of weights. 8.根据权利要求7所述的半导体器件建模系统,其中所述反向目标转换模块包括积分模块,所述积分模块对所述深度神经网络的仿真输出在一定电压范围上进行积分,以生成所述漏极电流的仿真值。8. The semiconductor device modeling system of claim 7, wherein the inverse target conversion module includes an integration module that integrates the simulation output of the deep neural network over a range of voltages to generate The simulated value of the drain current. 9.根据权利要求1所述的半导体器件建模系统,其中所述输入预处理模块产生栅极电压的对数或漏极电压的对数,作为预处理训练数据作为所述深度神经网络的输入,9. The semiconductor device modeling system of claim 1, wherein the input preprocessing module generates a logarithm of gate voltage or a logarithm of drain voltage as preprocessing training data as input to the deep neural network , 由此,将对数电压输入应用到所述深度神经网络。Thus, a logarithmic voltage input is applied to the deep neural network. 10.根据权利要求1所述的半导体器件建模系统,其中所述输入预处理模块对所述训练数据执行主成分分析PCA,以选择主数据,其中所述主数据是应用于深度神经网络输入的预处理训练数据,10. The semiconductor device modeling system of claim 1, wherein the input preprocessing module performs a principal component analysis (PCA) on the training data to select master data, wherein the master data is applied to a deep neural network input The preprocessed training data, 由此,执行PCA后再应用于所述深度神经网络。Thus, PCA is performed and then applied to the deep neural network. 11.一种用于仿真模拟晶体管的计算机实施方法,包括:11. 
11. A computer-implemented method for simulating a modeled transistor, comprising:
receiving input data representing a gate voltage and a drain voltage applied to the modeled transistor by a circuit simulator, the input data further comprising a transistor width and a transistor length of the modeled transistor;
pre-processing the input data to generate pre-processed input data by generating a logarithm of the gate voltage or a logarithm of the drain voltage;
applying the pre-processed input data as input to a deep neural network, the deep neural network containing a set of weights, and operating the deep neural network on the pre-processed input data to produce a neural network output;
integrating the neural network output over the gate voltage or the drain voltage to generate a drain-current value for the modeled transistor;
outputting the drain-current value to the circuit simulator, the circuit simulator using the drain-current value to simulate operation of the transistor in a circuit;
wherein the deep neural network receives logarithms of voltages and produces a derivative of the drain-current value during simulation.
12. The computer-implemented method of claim 11, wherein the deep neural network has at least seven intermediate layers and at least ten thousand weights.
13. The computer-implemented method of claim 11, wherein the neural network output is a conductance value.
14. The computer-implemented method of claim 11, wherein the neural network output is a transconductance value.
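To make the recovery step of claims 8 and 11 concrete, the sketch below integrates the network output (a transconductance-like quantity, per claim 14) over the gate-voltage sweep to obtain the drain current. The `net` callable, the bias-point layout, and the zero integration constant at the first bias point are assumptions.

```python
import numpy as np

def simulate_drain_current(net, vg_sweep, vd, width, length, eps=1e-12):
    """Integrate the neural network output over Vg to recover Id(Vg)."""
    vg_sweep = np.asarray(vg_sweep, dtype=float)
    # Pre-process each bias point as during training (logarithm of
    # the voltages, per claim 11).
    X = np.column_stack([
        np.log(vg_sweep + eps),
        np.full_like(vg_sweep, np.log(vd + eps)),
        np.full_like(vg_sweep, width),
        np.full_like(vg_sweep, length),
    ])
    gm = np.asarray(net(X)).ravel()    # network output ~ dId/dVg
    # Claims 8 and 11: integrate over the voltage range (cumulative
    # trapezoid); Id at the first bias point is assumed to be 0 A.
    steps = 0.5 * (gm[1:] + gm[:-1]) * np.diff(vg_sweep)
    return np.concatenate([[0.0], np.cumsum(steps)])
```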
15. A non-transitory computer-readable medium storing computer-executable instructions that, when executed by a processor, implement a method comprising:
applying initial weights to connections between nodes in a neural network, the weights specifying strengths of the connections between the nodes in the neural network;
receiving input data representing a gate voltage and a drain voltage applied to a test transistor, the input data further comprising a transistor width and a transistor length of the test transistor;
pre-processing the input data to generate pre-processed input data;
executing a training routine that applies the pre-processed input data to the neural network;
receiving target data representing a drain current measured on the test transistor when the gate voltage and the drain voltage were applied to the test transistor during testing;
converting the target data into transformed target data;
generating a loss function from a comparison of the transformed target data with the neural network output when the pre-processed input data is applied to the inputs of the neural network by the training routine;
adjusting the initial weights using the loss function to generate updated weights;
when a target endpoint has not yet been reached, applying the updated weights to the neural network, executing another iteration, and further updating the updated weights;
when the target endpoint has been reached, outputting the neural network and the updated weights as a model of the test transistor;
wherein pre-processing the input data to generate the pre-processed input data further comprises generating a logarithm of the gate voltage or a logarithm of the drain voltage;
wherein the logarithms of the voltages are applied as inputs to the neural network.
16. The non-transitory computer-readable medium of claim 15, wherein converting the target data into transformed target data further comprises:
generating a derivative of the drain current as the transformed target data.
17. A non-transitory computer-readable medium storing computer-executable instructions that, when executed by a processor, implement a method comprising:
applying initial weights to connections between nodes in a neural network, the weights specifying strengths of the connections between the nodes in the neural network;
receiving input data representing a gate voltage and a drain voltage applied to a test transistor, the input data further comprising a transistor width and a transistor length of the test transistor;
pre-processing the input data to generate pre-processed input data;
executing a training routine that applies the pre-processed input data to the neural network;
receiving target data representing a drain current measured on the test transistor when the gate voltage and the drain voltage were applied to the test transistor during testing;
converting the target data into transformed target data;
generating a loss function from a comparison of the transformed target data with the neural network output when the pre-processed input data is applied to the inputs of the neural network by the training routine;
adjusting the initial weights using the loss function to generate updated weights;
when a target endpoint has not yet been reached, applying the updated weights to the neural network, executing another iteration, and further updating the updated weights;
when the target endpoint has been reached, outputting the neural network and the updated weights as a model of the test transistor;
wherein converting the target data into transformed target data further comprises:
generating a logarithm of a derivative of the drain current as the transformed target data.
18. A non-transitory computer-readable medium storing computer-executable instructions that, when executed by a processor, implement a method comprising:
applying initial weights to connections between nodes in a neural network, the weights specifying strengths of the connections between the nodes in the neural network;
receiving input data representing a gate voltage and a drain voltage applied to a test transistor, the input data further comprising a transistor width and a transistor length of the test transistor;
pre-processing the input data to generate pre-processed input data;
executing a training routine that applies the pre-processed input data to the neural network;
receiving target data representing a drain current measured on the test transistor when the gate voltage and the drain voltage were applied to the test transistor during testing;
converting the target data into transformed target data;
generating a loss function from a comparison of the transformed target data with the neural network output when the pre-processed input data is applied to the inputs of the neural network by the training routine;
adjusting the initial weights using the loss function to generate updated weights;
when a target endpoint has not yet been reached, applying the updated weights to the neural network, executing another iteration, and further updating the updated weights;
when the target endpoint has been reached, outputting the neural network and the updated weights as a model of the test transistor;
wherein the method further comprises, after the target endpoint has been reached:
applying the updated weights to the neural network;
receiving input data representing a gate voltage and a drain voltage applied to a modeled transistor, the input data further comprising a transistor width and a transistor length of the modeled transistor;
pre-processing the input data to generate pre-processed input data;
applying the pre-processed input data to the neural network, and operating on the pre-processed input data using the neural network and the updated weights to produce a neural network output;
integrating the neural network output to generate a simulated drain current for the modeled transistor.
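The training routine recited in claims 15 through 18 amounts to the familiar loop of forward pass, loss evaluation, weight update, and endpoint test. The sketch below mimics that loop with a small fully connected network; the architecture, the mean-squared-error loss, the stopping rule, and the last-layer-only gradient step (standing in for full back-propagation) are all simplifying assumptions and not the patent's own implementation.

```python
# Illustrative training loop for claims 15-18; per claim 12 the real
# deep network may have at least seven intermediate layers and at
# least ten thousand weights. Everything here is an assumption.
import numpy as np

rng = np.random.default_rng(0)

def init_weights(sizes):
    """Initial weights for the connections between nodes."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(weights, X):
    h = X
    for W, b in weights[:-1]:
        h = np.tanh(h @ W + b)      # each node scales inputs by weights
    W, b = weights[-1]
    return (h @ W + b).ravel()      # network output (transformed Id)

def train(X_pre, target_tr, sizes, lr=1e-3, max_iter=5000, tol=1e-6):
    weights = init_weights(sizes)
    loss = np.inf
    for _ in range(max_iter):       # iterate until the target endpoint
        out = forward(weights, X_pre)
        new_loss = np.mean((out - target_tr) ** 2)  # loss function
        if abs(loss - new_loss) < tol:              # endpoint reached
            break
        loss = new_loss
        # Weight update on the last layer only, an assumed stand-in
        # for full back-propagation to keep the sketch short.
        h = X_pre
        for Wi, bi in weights[:-1]:
            h = np.tanh(h @ Wi + bi)
        err = (out - target_tr)[:, None]
        W, b = weights[-1]
        weights[-1] = (W - lr * h.T @ err / len(X_pre),
                       b - lr * err.mean(axis=0))
    return weights, loss            # final weights define the model
```

Under these assumptions, `train(preprocess_inputs(X), transform_target(vg, id_meas), [3, 32, 32, 1])` would return the final set of weights that defines the device model.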
CN201880001122.9A 2018-06-19 2018-06-20 Semiconductor device modeling for training deep neural networks using input preprocessing and transformation targets Active CN109791627B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16/011,787 US11176447B2 (en) 2018-06-19 2018-06-19 Semiconductor device modeling using input pre-processing and transformed targets for training a deep neural network
US16/011,787 2018-06-19
PCT/CN2018/092038 WO2019241937A1 (en) 2018-06-19 2018-06-20 Semiconductor device modeling using input pre-processing and transformed targets for training a deep neural network

Publications (2)

Publication Number Publication Date
CN109791627A CN109791627A (en) 2019-05-21
CN109791627B true CN109791627B (en) 2022-10-21

Family

ID=66499472

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880001122.9A Active CN109791627B (en) 2018-06-19 2018-06-20 Semiconductor device modeling for training deep neural networks using input preprocessing and transformation targets

Country Status (1)

Country Link
CN (1) CN109791627B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112710907B (en) * 2019-10-24 2023-07-07 苏州华太电子技术股份有限公司 Test method, test system and computer readable storage medium for power amplifier
US20220091568A1 (en) * 2020-09-21 2022-03-24 Shenzhen Keya Medical Technology Corporation Methods and devices for predicting physical parameter based on input physical information
CN112580288B (en) * 2020-12-03 2022-04-12 复旦大学 Semiconductor device characteristic modeling method and system based on multi-gradient neural network
CN116151174A (en) * 2023-04-14 2023-05-23 四川省华盾防务科技股份有限公司 General device model optimization method and system
CN119227618B (en) * 2024-11-28 2025-03-14 珠海格力电器股份有限公司 Model generation method, simulation device, electronic equipment and storage medium

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
US6844582B2 (en) * 2002-05-10 2005-01-18 Matsushita Electric Industrial Co., Ltd. Semiconductor device and learning method thereof
US8935146B2 (en) * 2007-03-05 2015-01-13 Fujitsu Semiconductor Limited Computer aided design apparatus, computer aided design program, computer aided design method for a semiconductor device and method of manufacturing a semiconductor circuit based on characteristic value and simulation parameter
FR2983664B1 (en) * 2011-12-05 2013-12-20 Commissariat Energie Atomique ANALOG-DIGITAL CONVERTER AND NEUROMORPHIC CIRCUIT USING SUCH CONVERTER

Patent Citations (7)

Publication number Priority date Publication date Assignee Title
CN102663495A (en) * 2012-02-22 2012-09-12 天津大学 Neural net data generation method for nonlinear device modeling
CN103310285A (en) * 2013-06-17 2013-09-18 同济大学 Performance prediction method applicable to dynamic scheduling for semiconductor production line
CN103745273A (en) * 2014-01-06 2014-04-23 北京化工大学 Semiconductor fabrication process multi-performance prediction method
CN105138741A (en) * 2015-08-03 2015-12-09 重庆大学 Insulated gate bipolar transistor (IGBT) model parameter calibration system and method based on neural network
CN106446310A (en) * 2015-08-06 2017-02-22 新加坡国立大学 Transistor and system modeling methods based on artificial neural network
CN106777620A * 2016-12-05 2017-05-31 天津工业大学 A neural-network space-mapping modeling method for power transistors
CN107748809A * 2017-09-20 2018-03-02 苏州芯智瑞微电子有限公司 A semiconductor device modeling method based on neural network techniques

Non-Patent Citations (2)

Title
"Artificial Neural Network Compact Model for TFTs";Quan CHEN等;《2016 7th International Conference on computer Aided-Design for Thin-Film Transitor Technologies》;20161028;第11页 *
"Artificial neural network design for compact modeling of generic transistors";Lining Zhang等;《J Comput Electron》;20170409;第1-5页 *

Also Published As

Publication number Publication date
CN109791627A (en) 2019-05-21

Similar Documents

Publication Publication Date Title
US11176447B2 (en) Semiconductor device modeling using input pre-processing and transformed targets for training a deep neural network
CN109791627B (en) Semiconductor device modeling for training deep neural networks using input preprocessing and transformation targets
US11537841B2 (en) System and method for compact neural network modeling of transistors
US20190138897A1 (en) System and method for circuit simulation based on recurrent neural networks
US7409651B2 (en) Automated migration of analog and mixed-signal VLSI design
US12223246B2 (en) Systems, methods, and computer program products for transistor compact modeling using artificial neural networks
US9898566B2 (en) Method for automated assistance to design nonlinear analog circuit with transient solver
JP2008523516A (en) Stochastic analysis process optimization for integrated circuit design and manufacturing
Zhao et al. Efficient performance modeling for automated CMOS analog circuit synthesis
US7730433B2 (en) Analog design retargeting
US7979261B2 (en) Circuit simulation model generation apparatus, circuit simulation model generation method and circuit simulation apparatus
Chavez et al. Deep learning-based IV global parameter extraction for BSIM-CMG
Yang et al. Graph-based compact model (GCM) for efficient transistor parameter extraction: A machine learning approach on 12 nm FinFETs
US20080177517A1 (en) Techniques for calculating circuit block delay and transition times including transistor gate capacitance loading effects
WO2012081158A1 (en) Circuit simulation method and semiconductor integrated circuit
Viraraghavan et al. Statistical compact model extraction: A neural network approach
Husain et al. Gaussian process regression for small-signal modelling of GaN HEMTs
JP2007310873A (en) Parameter extraction method and computer-readable storage medium having program for executing parameter extraction method
Doronina et al. Autonomic closure for turbulent flows using Approximate Bayesian Computation
Papageorgiou et al. Mosfet model parameter extraction using reinforcement learning
JPWO2006107025A1 (en) Parameter adjustment device
Zhu et al. An enhanced Neuro-Space mapping method for nonlinear microwave device modeling
Dhabak et al. Adaptive sampling algorithm for ANN-based performance modeling of nano-scale CMOS inverter
JP2023082459A (en) Model creation method and program
Avci Neural network-based design approach for submicron MOS integrated circuits

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant