CN1457471A - Two architectures for integrated realization of sensing and processing in a single device - Google Patents

Info

Publication number: CN1457471A
Authority: CN (China)
Prior art keywords: array, sensing, output, input, logic
Legal status: Pending
Application number: CN00807117A
Other languages: Chinese (zh)
Inventors: G·埃尔滕, F·M·萨拉姆
Current assignee: Clarity LLC
Original assignee: Clarity LLC

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Neurology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Amplifiers (AREA)
  • Transforming Light Signals Into Electric Signals (AREA)
  • Semiconductor Integrated Circuits (AREA)

Abstract

An integrated sensing device comprising an array of sensor processor cells capable of being arranged into a detection array. Each sensor processor cell comprises a sensing medium; at least one transconductance amplifier configured for feedforward template multiplication; at least one transconductance amplifier configured for feedback template weights; a plurality of local dynamic memory cells; a data bus for data transfer; and a local logic unit. The array of sensor processor cells, by responding to data control signals, is capable of transforming, reshaping, and modulating the original sensed image into varied representations that include (and extend) traditional spatial and temporal processing transformations.

Description

Two architectures for integrated realization of sensing and processing in a single device

Field of the invention

The present invention relates to integrated sensing and processing devices. More specifically, it relates to architectures that integrate sensing and processing in a single device.

Background

Cellular neural networks (CNNs) currently establish a paradigm based on a set of standard differential equations (Equations 1-3 below) that explicitly express a set of nonlinear dynamic interactions among the cells of a (usually two-dimensional) grid. There are several degrees of freedom in setting these interactions: the feedback and feedforward weights (the values A and B in Equation 1), the initial state conditions (x(t=0) in Equation 1), and the bias current (I in Equation 1). In effect, a CNN is programmed by controlling these values. The nonlinearity comes from the nonlinear nature of the function ("f" in Equation 2), which can take different shapes and properties. Programming a specific function therefore amounts to finding the combination of A, B, x(t=0), and I that produces the desired output (y) for a given input pattern (u). This type of programming is quite flexible.

A CNN is a hybrid of a cellular automaton and a neural network (hence the name cellular neural network) that combines the best features of both. Like a neural network, its continuous-time nature enables real-time signal processing; like a cellular automaton, its local interconnectivity makes practical implementation in VLSI feasible. Its grid-like structure is suited to solving large systems of first-order nonlinear differential equations online and in real time. In short, a CNN can be regarded as an array of analog nonlinear dynamic processors. The basic unit of a CNN is called a cell. Each cell receives input from its nearest neighbors (and from itself, via feedback) as well as from external sources (for example, points of a sensor array and/or previous layers).

The standard CNN equations are summarized by the following relations:

$$\tau_{ij}\,\dot{x}_{ij}(t) = -x_{ij}(t) + \sum_{kl \in N_r(ij)} A_{ij;kl}\big(y_{kl}(t),\, y_{ij}(t)\big) + \sum_{kl \in N_r(ij)} B_{ij;kl}\big(u_{kl}(t),\, u_{ij}(t)\big) + I_{ij}$$

      Equation 1

$$y_{ij} = f(x_{ij})$$

      Equation 2

and

$$I_{ij} = I$$

      Equation 3

where u denotes the input and x the state; y denotes a nonlinear function of the state associated with a cell (or neuron); and A and B denote the cloning templates.

In a typical CNN, the local connections between neighbors (the feedback weights, that is, the entries of matrix A in Equation 1), together with the connections from the sensor array (the input weights, that is, the entries of matrix B in Equation 1), form a programmable cloning template. Cloning templates have been developed to perform a variety of vision processing tasks. Each template has a specific application, for example cloning templates for plane detection or for binocular stereomicroscopy. Cellular neural networks are attractive for image processing because of this programmability: one only needs to change the template to perform a different image task.
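To make the role of the templates concrete, the following is a minimal sketch of one forward-Euler step of Equation 1. It assumes the common special case in which the template terms are linear (A times y and B times u), a piecewise-linear saturation for f, a 3×3 neighborhood, and zero padding at the array edges; none of these choices is specified by the text above, and the function names are illustrative only.

```python
def f(x):
    """Piecewise-linear saturation, one common choice for f in Equation 2."""
    return max(-1.0, min(1.0, x))


def cnn_step(x, u, A, B, I, dt=0.05, tau=1.0):
    """One forward-Euler step of Equation 1 with linear 3x3 templates.

    x and u are 2-D lists (state and input); A and B are 3x3 lists.
    Neighbors outside the grid contribute nothing (assumed zero padding).
    """
    rows, cols = len(x), len(x[0])
    y = [[f(v) for v in row] for row in x]
    new_x = [row[:] for row in x]
    for i in range(rows):
        for j in range(cols):
            total = -x[i][j] + I
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    k, l = i + di, j + dj
                    if 0 <= k < rows and 0 <= l < cols:
                        total += A[di + 1][dj + 1] * y[k][l]  # feedback template
                        total += B[di + 1][dj + 1] * u[k][l]  # feedforward template
            new_x[i][j] = x[i][j] + dt * total / tau
    return new_x
```

With A set to zero and B equal to 1 at its center only, repeated calls drive each state toward the corresponding input value, which is the simplest possible "program" for the array.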

Despite this flexibility, serious problems arise when one tries to implement a CNN in circuitry. In an integrated-circuit implementation of a CNN model, the summation is a current-based computation, as assumed by the circuit model of Figure 1. By Kirchhoff's current law, all currents entering the node that determines the state of a cell (x) must sum to zero. Because the intrinsic resistance (R) is very large and the capacitance is very small, very little charge is needed to hold a given voltage. This also means that the current needed to change the state voltage is relatively small, so noise currents are large enough to make a noticeable difference. Given also that transistor characteristics on the same chip substrate can typically vary by as much as 20%, it becomes apparent that the (x) node is very likely to be charged up or down to the supply rails. One remedy is to add sufficient capacitance to each node, but this is undesirable because it consumes valuable VLSI resources and further increases the response time of the cell. Existing VLSI implementations of the CNN model have not addressed this problem, and the CNN model described by Equations 1-3 is therefore not suited to direct VLSI implementation.
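A rough back-of-the-envelope calculation illustrates why a small error current matters at a low-capacitance node. The sketch below uses the ideal capacitor relation dV/dt = I/C and ignores the leak through R; the component values are illustrative assumptions, not figures from the patent.

```python
def time_to_rail(c_farads, i_error_amps, headroom_volts):
    """Time for a constant net current to slew an ideal capacitor by headroom_volts."""
    return c_farads * headroom_volts / i_error_amps


# Assumed values: 10 fF of parasitic capacitance, 1 nA of net error
# current (noise plus mismatch), and 2.5 V between the state and the rail.
t = time_to_rail(10e-15, 1e-9, 2.5)
print(f"{t * 1e6:.1f} us")
```

Under these assumptions the node reaches the rail in about 25 microseconds, which shows how quickly an uncompensated state node can saturate.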

Summary of the invention

An integrated sensing device comprises an array of sensor processor cells that can be arranged into a detection array. Each sensor processor cell comprises a sensing medium; at least one transconductance amplifier for feedforward template multiplication; at least one transconductance amplifier for feedback template weights; a plurality of local dynamic memory cells; a data bus for data transfer; and a local logic unit. By responding to data control signals, the array of sensor processor cells can transform, reshape, and modulate the originally sensed image into varied representations that include (and extend) traditional spatial and temporal processing transformations.

Brief description of the drawings

The invention is illustrated by way of example in the following drawings, in which like reference numerals denote like elements. The drawings disclose various embodiments of the invention for purposes of illustration only and are not intended to limit the scope of the invention.

Figure 1 (prior art) shows an existing implementation of an architecture that physically realizes the standard cellular neural network equations.

Figure 2 shows an embodiment that adapts the dynamic cellular computing model to VLSI layout, according to one embodiment of the invention.

Figure 3 shows an alternative embodiment that adapts the dynamic cellular computing model to VLSI layout, according to one embodiment of the invention.

Figure 4 shows an alternative embodiment that adapts the dynamic cellular computing model to a VLSI layout using analog multipliers, according to one embodiment of the invention.

Figure 5 shows, in flowchart form, an embodiment of the operation of the dynamic cellular architecture, according to one embodiment of the invention.

Figure 6 shows an embodiment of a cellular architecture element, that is, a cell, according to one embodiment of the invention.

Figure 7 shows an embodiment of the layout of a CMOS chip containing several types of cells, according to one embodiment of the invention.

Figure 8 shows an embodiment of an initialized 1-input, 1-output cell, according to one embodiment of the invention.

Figure 9 shows an embodiment of a programmable-type feedback cell, according to one embodiment of the invention.

Figure 10 shows an embodiment of a state-initializable cell with hard-wired positive and negative feedback, according to one embodiment of the invention.

Figure 11(a) shows an embodiment of a floating-state cell with two (+/-) hard-wired feedbacks, according to one embodiment of the invention.

Figure 11(b) shows an embodiment of a cell with only one feedforward, according to one embodiment of the invention.

Figure 12 shows an embodiment of digital input devices cascaded via parallel transistors, according to one embodiment of the invention.

Figure 13 shows an embodiment of a feedforward cell, according to one embodiment of the invention.

Figure 14 shows an embodiment derived from a 3-input, 1-output cell without feedback connections, according to one embodiment of the invention.

Figure 15 shows an embodiment of a dual distributed cellular architecture, according to one embodiment of the invention.

Figure 16 shows an embodiment of the layout of a signed sensor, according to one embodiment of the invention.

Figure 17 shows an embodiment of a dual-output light-sensing pixel, according to one embodiment of the invention.

Figure 18 shows an embodiment of the cell schematic of a programmable convolution array (PCA), according to one embodiment of the invention.

Figure 19 shows an embodiment of the layout of a single cell of a programmable convolution array (PCA), according to one embodiment of the invention.

Figure 20 shows an embodiment of the output of a programmable convolution array (PCA) element, according to one embodiment of the invention.

Figure 21 shows an embodiment of a 5×5 programmable convolution array (PCA) with I/O pads, according to one embodiment of the invention.

Figure 22 shows, in flowchart form, an embodiment of the operation of the programmable convolution array (PCA) of the dual distributed architecture, according to one embodiment of the invention.

Figure 23 shows an embodiment of an element of a programmable cellular logic array (PCLA), according to one embodiment of the invention.

Figure 24 shows, in flowchart form, an embodiment of the operation of the programmable cellular logic array (PCLA) of the dual distributed architecture, according to one embodiment of the invention.

Figure 25 shows, in flowchart form, an embodiment of the operation of the programmable convolution array (PCA) used together with the programmable cellular logic array (PCLA) of the dual distributed architecture, according to one embodiment of the invention.

Figure 26 shows an embodiment of a computing environment in which the invention can be implemented, according to one embodiment of the invention.

Figure 27 shows an embodiment of a network environment in which the invention can be implemented, according to one embodiment of the invention.

Detailed description

The following detailed description sets forth numerous specific details to provide a thorough understanding of the invention. Those of ordinary skill in the art will understand, however, that the invention can be practiced without these specific details. In other instances, well-known methods, procedures, protocols, components, algorithms, and circuits are not described, so as not to obscure the invention.

In one embodiment, the steps or processes of the invention are embodied in machine-executable instructions, such as computer instructions. The instructions can be used to cause a general-purpose or special-purpose processor programmed with them to perform the steps of the invention. Alternatively, the steps can be performed by dedicated hardware components that contain hard-wired logic for performing them, or by any combination of programmed computer components and dedicated hardware components.

The general object of the invention is a new cellular network architecture that can be implemented successfully using circuits and integrated microelectronic chips, together with a new distributed architecture comprising two separate arrays, namely a programmable convolution array (PCA) and a programmable logic array (PLA), for processing one-, two-, or three-dimensional arrays of (measured) sensor data.

Dynamic cellular architecture

One embodiment of the invention introduces a self-normalizing feedback structure for computing with the inputs and the feedforward template. The invention likewise introduces the same structure into the feedback portion of the architecture. These modifications yield a new set of equations and, correspondingly, new dynamics, and thus represent a new structure, form, and architecture. The feedback in the feedforward input-weight multiplication, and the feedback in the feedback-weight multiplication, keep the state node (x) from rapidly accumulating or losing charge and saturating to the supply rails.

Figure 2 shows one embodiment of the invention implemented in circuitry. Figure 2(a) shows an embodiment of the self-adjusting mechanism: instead of multiplying the input by B (as in the prior art), each node forms a function of the input minus the state and multiplies that quantity by B. A stable, self-scaling feedback element around the amplifier thus keeps the amplifier output from saturating. The amplifier gain is controlled so as to yield a factor that is a scaled (normalized) version of B. In this way the output is always a normalized version of the input multiplied by B, and multiple (aggregate) inputs from neighboring cells will not cause the aggregate state or output of a cell to saturate or to be overdriven by slight fluctuations in the inputs. Figure 2(b) shows an embodiment that adapts the dynamic cellular computing model to VLSI layout using the concept of Figure 2(a). In one embodiment, all input weights (the entries of B) are assumed to be non-negative and all feedback weights (the entries of A) are assumed to be non-positive.

As the embodiment of Figure 2 shows, in the circuit implementation of Figure 2(b): (1) both u (unsigned) and x (signed) are represented as analog voltages; (2) the template weights can be stored digitally and programmed; and (3) a transfer function f() such as the one in Equation 2 can in practice be made a programmable three-level saturation function f'(), which produces zero output over a local region of the input (rather than at a single point only).
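A three-level saturation function of the kind described in item (3) can be sketched as follows. The two thresholds stand in for the programmable parameters; their names and default values are hypothetical and chosen only for illustration.

```python
def three_level(x, low=-0.5, high=0.5):
    """Return -1, 0, or +1; the output is 0 over the whole band [low, high]."""
    if x < low:
        return -1
    if x > high:
        return 1
    return 0
```

Unlike tanh, which crosses zero at a single point, this function maps an entire interval of inputs to zero, so small state values near the reference produce no output.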

The specific equations that capture the characteristics of the circuit structure of Figure 2 are as follows.

$$C\,\dot{x}_{ij}(t) = -\frac{x_{ij}(t)}{R} + \sum_{kl \in N_r(ij)} A_{kl}\, y_{kl}(t) + \sum_{kl \in N_r(ij)} B_{kl} \tanh\!\left[\frac{qK}{2kT}\big(u_{kl}(t) - x_{ij}(t)\big)\right] + I_{ij}$$

    Equation 4

$$y = f(x) = \tanh\!\left(\frac{qK}{2kT}\,(x - V_{REF})\right)$$

    Equation 5

and

$$I_{ij} = I$$

    Equation 6

where the hyperbolic tangent (tanh) describes a basic transconductance circuit. The differential equation must also be initialized with the cell initial condition x_ij(0).
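The stabilizing effect of the (u - x) term in Equation 4 can be checked numerically. The sketch below integrates a single cell with one input and A = 0, in normalized units; the gain g stands in for qK/(2kT), and all numeric values are assumptions chosen for illustration. For comparison, `open_loop_steady_state` gives the resting point of the same node if the feedforward term were simply B times u.

```python
import math


def settle(u, B=1.0, R=1e3, C=1.0, g=20.0, I=0.0, dt=1e-3, steps=20000, x0=0.0):
    """Forward-Euler integration of a one-cell, one-input form of Equation 4 (A = 0)."""
    x = x0
    for _ in range(steps):
        dx = (-x / R + B * math.tanh(g * (u - x)) + I) / C
        x += dt * dx
    return x


def open_loop_steady_state(u, B=1.0, R=1e3):
    """Resting state of C dx/dt = -x/R + B*u, i.e. without the (u - x) feedback."""
    return R * B * u
```

With u = 0.5 the self-normalizing cell settles very close to 0.5, whereas the open-loop form would settle at R·B·u = 500 in the same units, far beyond any realistic supply range.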

Note that the capacitor C and the resistor R may arise from parasitics of the microelectronic implementation process. Where a capacitance larger than the parasitic capacitance is desired in order to slow down or tailor the template processing, a capacitance on the order of picofarads (pF) can be implemented on chip. Any larger capacitance can be implemented off chip, but with the attendant cost of possibly limiting the size of the cellular array.

In one embodiment, the input (feedforward) weights (B) are implemented in a negative feedback structure. In this implementation, the feedback structure stabilizes the state (x) node, which would otherwise tend to be charged up or down to the supply rails, and the structure is highly resilient to problems such as transistor mismatch. The embodiment of Figure 2 shows the modified dynamic cellular architecture model, in which polarities of the inputs and weights have been assumed, for example that B is non-negative and A is non-positive.

Figure 3 shows an alternative embodiment illustrating modifications that extend the range of implementation and also strengthen the contribution of the state. The following specific equations represent a circuit architecture that can be implemented in an integrated microelectronic chip:

$$C\,\dot{x}_{ij}(t) = -\frac{x_{ij}(t)}{R} + A_{ij}\, y_{ij}(t) + \sum_{kl \in N_{ru}(ij)} A_{ij}\, y_{kl}(t) + \sum_{kl \in N_r(ij)} B_{ij} \tanh\!\left[\frac{qK}{2kT}\big(u_{kl}(t) - x_{ij}(t)\big)\right] + I_{ij}$$

    Equation 7

$$y_{ij} = f(x_{ij}) = \tanh\!\left(\frac{qK}{2kT}\,(V_{REF} - x_{ij})\right)$$

    Equation 8

$$y_{kl} = f(x_{kl}) = \tanh\!\left(\frac{qK}{2kT}\,(x_{kl} - x_{ij})\right)$$

and

$$I_{ij} = \tanh\!\left(\frac{qK}{2kT}\,(V_{ext} - V_{REF})\right)$$

    Equation 9

where the hyperbolic tangent (tanh) represents a basic (or broadband) transconductance amplifier circuit.

In one embodiment, the index kl in the feedback term A ranges over the neighborhood of cell position ij, and the index kl in the feedforward term associated with the B parameters likewise ranges over the neighborhood of position ij. As before, the input (feedforward) weights (B) are implemented in a negative state-feedback structure. In this implementation, the feedback structure stabilizes the state (x) node, which would otherwise tend to be charged up or down to the supply rails, and the structure is highly resilient to problems such as transistor mismatch. Figure 3 thus shows this modified dynamic cellular architecture model, in which polarities of the inputs and weights have been assumed, for example that each B and A element is non-negative.
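The neighborhood over which the index kl ranges can be enumerated as in the sketch below. It assumes a square neighborhood of radius r that is clipped at the array edges; the clipping rule and the `include_self` switch (useful because Equation 7 treats the cell's own feedback term separately) are illustrative choices, not details given in the text.

```python
def neighborhood(i, j, rows, cols, r=1, include_self=True):
    """List the grid positions (k, l) within radius r of cell (i, j), clipped at the edges."""
    cells = []
    for k in range(max(0, i - r), min(rows, i + r + 1)):
        for l in range(max(0, j - r), min(cols, j + r + 1)):
            if include_self or (k, l) != (i, j):
                cells.append((k, l))
    return cells
```

For a 5×5 array with r = 1, an interior cell has nine positions in its neighborhood (eight if the cell itself is excluded), while a corner cell has only four.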

Figure 4 shows another embodiment that uses analog multipliers (rather than amplifiers with gain). As Figure 4(b) shows, the feedback-connected amplifier of the Figure 3 embodiment (shown in Figure 4(a)) can be replaced by an analog multiplier to realize signed weights. A digital gain can also be established by modifying the analog multiplier gain with digital input devices via the parallel transistor cascade shown in Figure 9. Adding multipliers that can represent sign removes the polarity restrictions on the template weights.
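One way to picture a digitally programmed, signed weight is sketched below. It assumes that the parallel branches are binary weighted and that the sign is carried by a separate bit; the text above does not state either detail, so both are labeled here as assumptions, and the function names are hypothetical.

```python
def digital_gain(bits, unit_gain=1.0):
    """Gain set by parallel branches; bits[0] is the least significant bit.

    Assumes binary weighting: branch n contributes unit_gain * 2**n when enabled.
    """
    return unit_gain * sum(b << n for n, b in enumerate(bits))


def signed_weight(sign_bit, bits, unit_gain=1.0):
    """Signed template weight: an assumed sign bit plus a digital magnitude."""
    magnitude = digital_gain(bits, unit_gain)
    return -magnitude if sign_bit else magnitude
```

For example, `digital_gain([1, 0, 1])` enables the first and third branches for a gain of 5 unit steps, and `signed_weight(1, [1, 1])` yields a weight of -3.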

Operation of the dynamic cellular architecture

Figure 5 shows one embodiment of the operation of the dynamic cellular architecture described above. In one embodiment, the A, B, x(0), and bias values required for all operations to be performed are determined before the process starts. A program thus consists of the individual operations needed to accomplish a task, with each program step specifying one operation and each operation defined by the A, B, x(0), and bias values to be applied to the cellular architecture. In one embodiment, these values may simply be derived from the results of the preceding operation. In one embodiment, the program can be stored externally or on the same chip that implements the architecture. In an alternative embodiment, each cell of the architecture can contain its own program memory.

Referring again to Figure 5, one embodiment of the operation of the dynamic cellular architecture proceeds as follows:

1. Start.

2. If the previous operation is complete, or if new data is needed, obtain new input. These inputs (the values of the nodes labeled u) can come from data sampled continuously or discretely from sensors built into the architecture or, alternatively, can be scanned into the input nodes (u) from an external data source using a suitable scanning device.

3. Apply the A, B, and bias values to the correct nodes, either externally or from internally stored program memory.

4. Apply the initial state (x(t=0)) to the state nodes.

5. Allow enough time for the computation to complete, that is, enough time for the circuit to reach the point where the change in the state values is zero or to respond sufficiently to the input stimulus. For active sensors that do not produce a static output, there may be a particular optimal, or desired, instant at which to sample the output. Alternatively, the output itself may be a time-varying waveform.

6. Read and store the state (x) or output (y) nodes. The state or output of each node can be stored externally or internally.

7. If the program has terminated, go to step 9 (End).

8. Go to step 2.

9. End.

The steps above thus illustrate one embodiment of the operation of the dynamic cellular architecture.
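The step sequence above can be sketched as a simple control loop. In this sketch a program is a list of dictionaries holding the A, B, bias, and x0 values of each operation, and `hw` is a hypothetical object that wraps the array hardware; its method names (`load_input`, `apply_templates`, `set_state`, `settle`, `read`) and the optional `new_input` flag are assumptions made for illustration, not an interface defined by the patent.

```python
def run_program(program, hw):
    """Run each operation of a program in order and collect the stored results.

    Comments refer to the numbered steps of the procedure described above.
    """
    results = []
    for op in program:                                     # steps 7-8: repeat until done
        if op.get("new_input", True):
            hw.load_input()                                # step 2: obtain new input u
        hw.apply_templates(op["A"], op["B"], op["bias"])   # step 3: apply A, B, bias
        hw.set_state(op["x0"])                             # step 4: initialize x(t=0)
        hw.settle()                                        # step 5: wait for the computation
        results.append(hw.read())                          # step 6: read and store x or y
    return results                                         # step 9: end
```

Setting `new_input` to False in an operation lets a later step reuse the input already held in the array, which matches the case where one operation builds on the result of the previous one.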

Dynamic cellular architecture: sample implementation

Figure 6 shows one embodiment of a practical implementation of a simple 5×5 array of the dynamic cellular architecture: an integrated light sensing and processing architecture realized in a cellular chip according to the dynamic cellular architecture of the invention. It will be understood that the inventive concepts can be applied to a wide variety of technologies and realized in a wide variety of practical architectures, and that the invention is not limited to an integrated light sensing and processing architecture. In one embodiment, the integrated light sensing and processing architecture consists of an array of identical sensor processor cells, each containing: (1) a photodiode and active light-sensing circuitry; (2) a transconductance amplifier for feedforward template weight multiplication; (3) a broadband transconductance amplifier for feedback template weights; (4) analog and single-bit digital local dynamic memory cells; (5) a data bus for data transfer; (6) local programmable logic; and (7) read/write data control.

In the embodiment of the structure shown in Figure 6, the operations of the cellular paradigm can be performed with a pair of 3×3 cloning templates, for example as defined by A and B in Equation 4 or Equation 7. In general, each operation completes in a short time and takes place simultaneously in every element across the whole image. This represents an unprecedented improvement in processing rate over sequential processors. In one embodiment, the operations include linear transformations of the input (convolution operations), connected component detection, and other types of data manipulation made possible by the feedback weights. Exemplary operations include planar position; morphological operators such as dilation, thinning, and erosion; light adaptation; scratch removal; and texture, color, and shape analysis. In addition, in one embodiment, by initializing the state with previously obtained results, two template arrays can also be used to perform temporal operations such as motion analysis, for example local velocity detection and motion direction detection.
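As a software reference for the linear (convolution) operations mentioned above, the sketch below applies a 3×3 template to every pixel of an image. It computes sequentially what the array would compute in all cells at once, treats pixels outside the image as zero, and uses a Laplacian-style template purely as an assumed example; the template values are not taken from the patent.

```python
def convolve3x3(image, template):
    """Apply a 3x3 template at every pixel, treating out-of-range pixels as 0.

    The template is applied without flipping, which is identical to
    convolution for symmetric templates such as the one below.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    k, l = i + di, j + dj
                    if 0 <= k < rows and 0 <= l < cols:
                        out[i][j] += template[di + 1][dj + 1] * image[k][l]
    return out


# Assumed example template: responds to local intensity changes.
LAPLACIAN = [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]
```

On a uniform image the interior response is zero, and only pixels at the image border, where the zero padding creates an artificial edge, produce a nonzero output.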

In the embodiment of Figure 6, the entire cellular chip is depicted as a system. In this embodiment, an external (or inter-core) microcontroller generates the command signals required by the integrated sensor processor, such as sensor timing, row selects, and program codes. In an alternative embodiment, a processor or digital signal processor can be used in place of the microcontroller, depending on the computational needs of the particular application. Likewise, in one embodiment, the program memory can be internal, with denser address-level codes expanded into bit-level microinstructions that determine the template values and the data transfers in the architecture.

In the embodiment of Figure 6, the contents of a cell are depicted in more detail. For example, the feedforward and feedback weights, which also incorporate the transfer function, are shown in schematic form, as are the local logic and memory functions. In one embodiment, there is a main data bus over which data transfers take place. Two-way connections can then be made between the data bus and the 2 analog memory cells, the 4 digital memory cells, the state (x), and the input (u). Connections can also be made from the logic outputs and the reference voltage to the data bus. Furthermore, the logic can be implemented by programming in a manner similar to a programmable logic array; a reconfigurable logic array is another option or embodiment.

As noted above, the inventive concepts can be applied to a variety of technologies and implemented in a variety of practical architectures; the invention is not limited to integrated light sensing and processing architectures. For example, the same architecture can be applied to acoustic, i.e. sound-based, systems in which multiple sensors are arranged in a one-, two-, or three-dimensional grid. Each sensor on the grid may respond to the same physical quantity (for example, frequencies in the audible spectrum). Alternatively, each sensor may sense a different property of the same signal (for example, each sensor tuned to a different frequency of the audible spectrum), which makes it possible to measure the frequency pattern of sound incident on the array. A hybrid of the two is also conceivable, for example for measuring color patterns.

Implementation of Several Cellular Model Circuits

Figure 7 shows one embodiment of the layout of a CMOS chip according to the inventive concepts, containing several kinds of cellular cells (for example, a VLSI prototype of cells of the cellular dynamic architecture). In one embodiment, the chip is designed for the MOSIS 2-micron ORBIT ANALOG process. Likewise, in one embodiment, the die is a 2.3 mm × 2.3 mm "TINYCHIP" bonded in a 40-pin DIP package.

As shown in the embodiment of Figure 7, in one embodiment the CMOS chip contains 11 cells of the following types: 4 initializable 1-feedforward, 1-feedback cells; 2 cells with programmable (positive or negative) feedback type; 2 cells with hardwired feedback and initializable state; 2 (+/-) hardwired-feedback, floating-state cells; and 1 feedforward-only cell. In one embodiment, all available outputs are brought out through wideband output amplifiers, and all programmable cell parameters, namely the feedforward and feedback weights and the bias, have 3-bit resolution (plus a sign bit). External inputs available through pins set these weights.

Figures 8-11 each show these cells schematically. Figure 8 shows a schematic embodiment of the 4 initializable 1-feedforward, 1-feedback cells in compact form. These four simple 1-input, 1-output cells sit at the top of the chip in Figure 7. The four types arise from the four ways of permuting positive and negative feedback: four combinations of the +/- terminals are possible, and all are implemented. All four cells also include state initialization means for defining x at the initial instant (t = 0), so the state can be initialized at any voltage the particular microelectronic device can sustain. This is equivalent to defining the initial point $x|_{t=0}$ in equations (4) and (7). In this particular implementation, both the feedforward and the feedback amplifiers have 3-bit programmable gain.

Figure 9 shows a schematic embodiment of the two cells with programmable (positive or negative) feedback type. Two 3-input, 1-output cells with selectable (+ or -) feedback type can be implemented. One of them can also initialize its state in exactly the way described for Figure 8; Figure 9 shows the cell without initialization. In one cell embodiment, as shown, the state feedback path carries the added delay of the window transistors. Depending on the feedback sign select bit, the two signals (the state node x and Vref) are routed to the terminals of the feedback weight amplifier. In one embodiment, all amplifiers have 3-bit programmable gain.

Figure 10 shows a schematic embodiment of the two cells with hardwired feedback and initializable state. These are the 2 initializable 3-input, 1-output cells. They differ only slightly from the cells of Figure 9: in one embodiment, the feedback type is implemented directly rather than being programmable. Two combinations of the +/- terminals are possible, and both are implemented. In one embodiment, both amplifiers have 3-bit programmable gain.

Figure 11(a) shows a schematic embodiment of the two (+/-) hardwired-feedback, floating-state cells. In one embodiment, these cells are identical to those of Figure 10 except that they have no state initialization circuit, so the initial value of the state is undefined.

Figure 11(b) shows a schematic embodiment of the feedforward-only cell, a 3-input, 1-output cell with programmable bias. In one embodiment, the feedback components are removed, so this cell departs from the cellular neural network paradigm and can be used only for feedforward operations.

Figure 13 shows one embodiment of the feedforward cell. In one embodiment, the input u is always non-negative but has a signed representation: a positive entry of the B template selects u+, and a negative entry selects u-.

Feedforward Weights and Image Convolution: A Sample Operation

Convolution is a very common image-processing step used in many imaging tasks. It is often used to produce a secondary (different) image from an original, in which desired features are enhanced and/or undesired features (for example, noise) are suppressed. In one embodiment, convolution can be described as a continuous spatial or temporal operation, but when applied to a sampled image it is discrete and involves a convolution influence function with discrete values.

The discrete convolution, denoted $\otimes$, between an image I and an influence function k can be written as:

$$I \otimes k \,\big|_{x,y} = \sum_{i=-N}^{N} \sum_{j=-N}^{N} I_{x+i,\,y+j}\, k_{i,j} \tag{10}$$

where I is the input image, x and y are the two-dimensional image coordinates, and k is the (2N+1) × (2N+1) square influence function.
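The sum in equation (10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image and influence function are plain nested lists indexed as `image[x][y]` and `kernel[i + N][j + N]`, pixels outside the image are treated as 0 (the text does not specify border handling), and the name `convolve_at` is hypothetical.

```python
def convolve_at(image, kernel, x, y):
    """Evaluate the double sum of equation (10) at pixel (x, y).

    kernel is a (2N+1) x (2N+1) nested list; kernel[i + N][j + N] holds k_{i,j}.
    Pixels that fall outside the image are counted as 0 (an assumption).
    """
    n = len(kernel) // 2
    total = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            xi, yj = x + i, y + j
            if 0 <= xi < len(image) and 0 <= yj < len(image[0]):
                total += image[xi][yj] * kernel[i + n][j + n]
    return total


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
ones = [[1, 1, 1] for _ in range(3)]
center_sum = convolve_at(image, ones, 1, 1)  # sums all nine pixels
```

On the chip the same sum is formed by analog current summation in parallel at every cell; the loop above only mirrors the arithmetic.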

The similarity between the sum of feedforward-weight and input products in equation (1), $\sum_{kl \in N_r(ij)} B_{kl}\big(u_{kl}(t),\, u_{ij}(t)\big)$, and the sum in equation (10) is readily apparent.

Thus, setting I = 0 and A = 0, the steady-state solution of equation (1) is

$$x_{ij}(t) = \sum_{kl \in N_r(ij)} B_{ij;kl}\big(u_{kl}(t),\, u_{ij}(t)\big)$$

This is in effect equivalent to $I \otimes k\,|_{x,y}$, where the second operand, $k_{i,j}$, is the normalized influence function.

The model used by the VLSI described here performs a normalizing operation. It uses the term from equation (4),

$$\sum_{kl \in N_r(ij)} B_{kl} \tanh\!\left[\frac{qK}{2kT}\big(u_{kl}(t) - x_{ij}(t)\big)\right]$$

in place of the standard model's sum of feedforward weights times input values in equation (1),

$$\sum_{kl \in N_r(ij)} B_{kl}\big(u_{kl}(t)\big)$$

The normalized convolution, denoted $\otimes_n$, is very similar to the ordinary convolution operation except that the output is normalized by the sum of the influence function entries. Because the influence function is the same over the whole image, the result is in effect divided by a normalization factor common to the whole image. The advantage is that the resulting image has essentially the same dynamic range as the input image.

$$I \otimes_n k \,\big|_{x,y} = \frac{\sum_{i=-N}^{N} \sum_{j=-N}^{N} I_{x+i,\,y+j}\, k_{i,j}}{\sum_{i=-N}^{N} \sum_{j=-N}^{N} k_{i,j}} \tag{11}$$
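The dynamic-range property of equation (11) can be checked numerically with a small Python sketch. The function name `normalized_convolve_at` is hypothetical, the data layout (nested lists, `image[x][y]`) is an assumption, and only interior pixels are handled. An influence function whose entries sum to zero leaves equation (11) undefined as written, so the sketch rejects that case.

```python
def normalized_convolve_at(image, kernel, x, y):
    """Evaluate equation (11) at an interior pixel (x, y)."""
    n = len(kernel) // 2
    if not (n <= x < len(image) - n and n <= y < len(image[0]) - n):
        raise ValueError("only interior pixels are handled in this sketch")
    weight_sum = sum(sum(row) for row in kernel)  # denominator of (11)
    if weight_sum == 0:
        raise ValueError("influence function entries sum to zero")
    acc = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            acc += image[x + i][y + j] * kernel[i + n][j + n]
    return acc / weight_sum


flat_image = [[2.0, 2.0, 2.0] for _ in range(3)]
smoothing = [[1, 2, 1],
             [2, 4, 2],
             [1, 2, 1]]
value = normalized_convolve_at(flat_image, smoothing, 1, 1)
```

For a uniform image the normalized output equals the input level, which is the sense in which the output keeps the dynamic range of the input.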

To implement the normalized convolution above, the input voltages are loaded with the values corresponding to the image, and the gains with the influence function entries. Because a conductance (or gain value) must be positive, implementing negative influence function entries in this way requires the ability to define negative input voltages and to select the negative polarity of the transconductance amplifier input wherever a negative entry is wanted. This halves the dynamic range of the image, although the range of the influence function entries is unaffected. The cell arrangement of Figure 11(b) can be viewed in this way if the bias amplifier is ignored.

Sample Operation

For example, in a sample operation, the 4-bit plus sign representation shown in Figure 12 (a digital input stage of parallel cascaded transistors) can be used for the entries of the B template in equations (1) and (4), which take values from -3.75 to 3.75 in increments of 0.25. In one embodiment, provided the amplifier gains are set by the B template, this cell is equivalent to the feedforward configuration shown in Figure 11(a). With more area and more bits, higher resolution and dynamic range are possible.

In the sample operation, the other parameters can be as follows:

0 < u < 1

u+ = Vref + u

u- = Vref - u

-3.75 < b < 3.75

umin < x < umax

In this architecture, Vref can be the representation of 0. One possible value for Vref is the midpoint between ground and the supply of the microelectronic device.

The sign is implemented by selecting u+ or u-. Note that this also makes it possible to implement negative u; in other words, values of u < 0 can also be accommodated in this architecture. In that case, the parameters can be as follows:

-1 < u < 1

u+ = Vref + u

u- = Vref - u
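The selection between u+ and u- can be illustrated with a short Python sketch. It assumes Vref = 2.5 V (the midpoint of a 5 V supply) and models the amplifier as an ideal non-negative gain |b| acting on the deviation from Vref; the function names are hypothetical.

```python
def signed_input(u, b, vref=2.5):
    """Return the voltage routed to the amplifier: u+ if b >= 0, else u-."""
    return vref + u if b >= 0 else vref - u


def effective_product(u, b, vref=2.5):
    """Ideal non-negative gain |b| applied to the selected input, measured from Vref."""
    return abs(b) * (signed_input(u, b, vref) - vref)


positive_case = effective_product(0.5, 2.0)   # u+ selected
negative_case = effective_product(0.5, -2.0)  # u- selected
```

With this routing, a gain stage that can only be non-negative still yields the signed product b·u.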

A concrete implementation of the range -3.75 to 3.75 for the different values of template B in the architecture gives the following:

  b        b_sign    b3 (MSB)   b2    b1    b0 (LSB)
  -3.75    0V        1V         1V    1V    1V
  -2.00    0V        1V         0V    0V    0V
  -1.00    0V        0V         1V    0V    0V
  -0.50    0V        0V         0V    1V    0V
  -0.25    0V        0V         0V    0V    1V
   0.25    5V        0V         0V    0V    1V
   0.50    5V        0V         0V    1V    0V
   1.00    5V        0V         1V    0V    0V
   2.00    5V        1V         0V    0V    0V
   3.75    5V        1V         1V    1V    1V

Here the supply is 5 V and logic 1 is 5 V. 1 V can be chosen as the bias voltage so that the bias transistors operate near threshold. In one embodiment, the state node (x) is referenced to Vref and needs no signed representation. A positive entry of the feedback template A routes x to the + terminal of the wideband differential amplifier, while a negative entry routes it to the - terminal. In the 4-bit plus sign representation, the entries of A can likewise take values from -3.75 to 3.75 in increments of 0.25. With more area and more bits, higher resolution and dynamic range are possible. The other parameters are: -3.75 < a_ij < 3.75 and 0 < x < Vdd.
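The mapping from a template value to the voltages in the table can be sketched in Python. The sketch assumes binary weights of 2, 1, 0.5, and 0.25 for b3 to b0, a 5 V sign line for positive values and 0 V for negative values, and 1 V on each active magnitude bit, as the table suggests. Treating b = 0 as positive is an assumption, since the table has no row for 0. The name `encode_weight` is hypothetical.

```python
def encode_weight(b, step=0.25, n_bits=4, v_pos=5.0, v_neg=0.0, v_bit_on=1.0):
    """Return (sign voltage, [b3, b2, b1, b0] voltages) for template value b."""
    code = round(abs(b) / step)
    if code > 2 ** n_bits - 1:
        raise ValueError("value outside the representable range")
    if abs(code * step - abs(b)) > 1e-9:
        raise ValueError("value is not a multiple of the step")
    # Most significant bit first: b3, b2, b1, b0.
    bits = [(code >> k) & 1 for k in range(n_bits - 1, -1, -1)]
    sign_voltage = v_pos if b >= 0 else v_neg  # b == 0 treated as positive
    return sign_voltage, [v_bit_on * bit for bit in bits]


row_min = encode_weight(-3.75)
row_half = encode_weight(0.5)
```

The two example calls correspond to the first row and the 0.50 row of the table.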

In this way, the feedback configuration of the feedforward weights provides some protection against the tendency of active current-computing cells to be pulled toward the supply rails.

Two real images were captured with a CCD camera and supplied to the chip three inputs at a time. The three input weights were set to represent the one-dimensional vertical planar influence function [1, 0, -1]. In this sample operation, the pixels of the whole image were supplied, three at a time along the horizontal direction, to the model circuit of Figure 11(b) with the bias amplifier switched off. The results are shown in Figure 14.

Dual Distribution Architecture

As shown in Figure 15, one embodiment of the invention introduces a dual distribution architecture, illustrated with a 5×5 array. In one embodiment, the dual distribution architecture comprises two distinct structures: (i) the programmable convolution array (PCA), a sensing grid with convolution capability that may also include some short-term memory, and (ii) the programmable cellular logic array (PCLA), a programmable logic array and memory region in which transformations, i.e. different representations, of the sensed data can be produced. In one embodiment, the two regions compute in parallel and communicate with each other in serial random-access fashion as needed. The PCLA also serves as the main interface to the outside, which may include a conventional digital signal processor, microprocessor, or microcontroller.

Considering how the speed of the device medium (most commonly silicon microelectronic circuits) affects the requirements of a large-area cellular processor, reconsider the elements of a cell in the specific implementation of the dynamic cellular architecture listed above, namely: (1) a photodiode and active light-sensing circuit; (2) transconductance amplifiers for multiplication by the feedforward template weights; (3) wideband transconductance amplifiers for the feedback template weights; (4) analog/single-bit digital local dynamic memory cells; (5) a data bus for transfers; (6) local programmable logic; and (7) read/write data control.

Practical interest in, and implementation of, an integrated cellular sensor-processor system usually involves the following considerations: (1) the sensor can be fabricated and embedded in the processing system; (2) the resolution of the sensing part of the system approaches that of commercially available sensors; (3) the fill factor (the ratio of sensor area to total area of each cell) is reasonable for the application; (4) sensor performance is maintained as the process feature size shrinks; and (5) a number of operators draw data from large neighborhoods. These are merely factors to consider when implementing an integrated cellular sensor-processor system; they need not be strictly followed and are not intended to limit the scope of the invention.

Even so, when a systematic approach is taken to meeting the goals above, the following points usually arise: (1) to improve resolution, the cell must shrink; (2) to improve the fill factor, the photosensitive area must grow; (3) to keep the sensor working, the sensor (or the whole chip) may have to be built in a larger-feature process, which in turn means the cell must be made smaller; and (4) to enlarge the neighborhood, larger cells with connections to more neighbors may have to be built. Again, these are merely factors to consider when implementing an integrated cellular sensor-processor system; they need not be strictly followed and are not intended to limit the scope of the invention.

In this way, a neighborhood of arbitrary size (in feedforward connections), or influence function as it is commonly called in image processing, can also be realized, including an influence function as large as the entire sensor array itself, though only one such influence function at a time. This lets the speed of the device medium offset the area requirements of the processor.

Compared with existing architectures as currently implemented, the dual distribution architecture generally offers three distinct advantages.

First, the computational concept of the architecture is much simpler and more convenient than the prior-art cellular neural network concept, which relies on nonlinear circuit dynamics. Implementing engineers also need far less training to realize it. Programming operations in the dual distribution architecture is much more direct than in existing architectures, where the correct parameters A, B, x(0), and I must be determined or found.

Second, because the role of nonlinear dynamics is minimized, circuit non-idealities such as mismatch or noise inherent in the substrate on which the architecture is built have much less effect on the computed output. This makes the device much easier and more cost-effective to manufacture.

Third, the dual distribution architecture, as its name implies, has two parts, each with a separate function. Each part works on its own and can be implemented independently. The first part is the programmable convolution array (PCA), which in one embodiment comprises a sensing grid with convolution capability and can also include some short-term memory. The PCA performs arbitrary linear transformations of all or part of the sensed data, and the desired weight of each element can be programmed directly. The second part is the programmable cellular logic array (PCLA). In one embodiment, the PCLA is cellular, meaning that each element of the array is connected to a set of its neighbors. The PCLA is not a neural network, however, and in one embodiment relies on conventional logic to perform its operations. In one embodiment, the logic can be implemented with conventional digital gates such as, but not limited to, AND, OR, and XOR, or with unconventional analog logic gates. In addition, in one embodiment, each element can contain a state machine (that is, a small computer). For stand-alone operation, the elements of the PCLA can be equipped with sensors, or data can be scanned in from an external source.

Signed Output Sensor (SOS)

For certain implementations of the convolution function, for example those using non-negative-gain amplifier circuits, each sensor may have to produce dual outputs.

The way these dual outputs are produced may differ with the nature of the sensor. For a sensor that measures non-negative values relative to 0 (for example, light), (+) and (-) pixel outputs may be needed. The positive (+) output is fed to the amplifier where the influence function entry is positive; correspondingly, the negative (-) pixel output is used where the influence function entry is negative. A one-dimensional example follows.

Suppose the one-dimensional planar influence function [1 0 -1] is to be implemented on a photosensitive grid. The equivalent arithmetic operation is the sum pixel[i-1] + (-pixel[i+1]). Because current summation can be used, the current of the positive representation of pixel[i-1] is added to the current of the negative representation of pixel[i+1].
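The one-dimensional example can be written as a short Python sketch. It models each signed output sensor as returning an idealized (+) and (-) representation of the pixel value relative to the "0" reference, and models current summation as ordinary addition; both are simplifying assumptions, and the function names are hypothetical.

```python
def signed_outputs(pixel):
    """Return the (+) and (-) representations of a non-negative pixel value."""
    return pixel, -pixel


def response_1d(pixels, i):
    """Apply the influence function [1 0 -1] centered on pixel i."""
    plus_left, _ = signed_outputs(pixels[i - 1])    # entry +1 uses the (+) output
    _, minus_right = signed_outputs(pixels[i + 1])  # entry -1 uses the (-) output
    return plus_left + minus_right                  # summed on a shared node


step_response = response_1d([3, 5, 1], 1)
flat_response = response_1d([4, 4, 4], 1)
```

A uniform region gives zero response, while a change in intensity across the center pixel gives a nonzero one.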

Figure 16 shows an embodiment of an exemplary layout of such a signed-output active sensor. In the embodiment of Figure 16, the active region is a 100λ × 100λ square, and the whole cell covers 195λ × 130λ. The (grounded) METAL2 layer that covers the cell regions which should not be exposed to light is not shown.

Likewise, Figure 17 shows the corresponding schematic of the signed sensor, with the positive and negative pixel representations labeled. In one embodiment, an additional follower is used because the photocurrent is very small and the light-sensing node should not be disturbed.

Programmable Convolution Array (PCA)

Figure 18 shows an embodiment of the element design of a light-sensing PCA. In one embodiment, each pixel cell is addressed by a combination of row and column select signals, so that weight bits can be written to the selected pixel. The pixel midpoint sets the "0" value of the signed pixel representation, that is, no light; resetting the pixel brings the pixel output to this value. In one embodiment, no dedicated ground signal needs to be routed, because the metal layer covering the circuitry (with a window over the photosensitive region) can be grounded. In the embodiment of Figure 18, each cell contains: (1) a signed output sensor (SOS); (2) 4 D flip-flops storing the 3-bit weight and the sign; (3) 4 multipliers; (4) a wideband transconductance amplifier connected as a unity-gain follower; and (5) a 3-input AND gate.

In one embodiment, the cell can be made significantly smaller, since the layout of a single cell in Figure 19 shows some empty space. Smaller feature sizes and additional metal layers would make the cell more compact still. Significant area savings could come from using dynamic memory cells, since each influence function may be applied only for a short time. Several data and control signals are routed to each cell.

In one embodiment, the control signals common to all cells are: (1) Clock Weight, (2) Reset Photodiode (or sensor), and (3) Sign. The addressing signals are Row and Column. In one embodiment, the data signals common to all cells are: (1) Pixel Reference, (2) Inverter Reference, (3) Weight High, (4) Weight Low, (5) Weight MSB, (6) Weight MID, and (7) Weight LSB. In addition, in one embodiment, as shown in Figure 20, the elements of each row share a row select signal and the elements of each column share a column select signal. The outputs of the cells can therefore be summed on a common line. Alternatively, outputs can be combined along columns, rows, or blocks, or in other arrangements.

Figure 21 shows an exemplary PCA layout of 5×5 pixels. In the embodiment of Figure 21, each of the 25 elements uses 4 weight bits (3 magnitude, 1 sign) stored in D flip-flops. The weight bits are clocked in only when both the row and the column are selected and the Clock Weight signal is active. Each weight output switches between Weight High and Weight Low. In one embodiment, the wideband transconductance amplifier in the output stage has 3 bias transistors with W/L ratios of 1/4, 1/2, and 1, one per magnitude bit. If all the weight bits are 0, the pixel contributes no current to the output; if any weight bit is nonzero, the pixel contributes a positive or negative current to the total readout according to the sign. The data signal Weight High also determines the gain of the light sensor. In one embodiment, dynamic memory can replace the D flip-flops to save silicon area. In one embodiment, the design is implemented in a 2-micron CMOS process, with pads, to fit a 2.3 mm × 2.3 mm area and is housed in a 40-pin DIP package. In this example implementation, all 25 pixel outputs are summed on a common output line. Different signs and weights can be chosen for each cell, so arbitrary convolution operations can be performed: in this implementation, any 4-bit influence function up to 5×5 can be programmed.
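The weighted summation of the 5×5 array can be mimicked with a behavioral Python sketch. It assumes that the MSB, MID, and LSB magnitude bits enable bias transistors with relative strengths 1, 1/2, and 1/4 (the text gives the ratios but not which bit drives which transistor), that a sign bit of 1 means a positive contribution, and that each cell's current is proportional to its pixel value. All names are hypothetical.

```python
BIT_GAINS = (1.0, 0.5, 0.25)  # assumed strengths for the MSB, MID, LSB bits


def cell_current(pixel, sign_bit, msb, mid, lsb):
    """Current one cell contributes to the common output line."""
    gain = sum(g * bit for g, bit in zip(BIT_GAINS, (msb, mid, lsb)))
    polarity = 1 if sign_bit else -1
    return polarity * gain * pixel


def pca_output(pixels, weights):
    """Sum the contributions of all cells, as on a shared output line."""
    return sum(
        cell_current(p, *w)
        for pixel_row, weight_row in zip(pixels, weights)
        for p, w in zip(pixel_row, weight_row)
    )


pixels = [[1.0] * 5 for _ in range(5)]
weights = [[(1, 0, 0, 0)] * 5 for _ in range(5)]  # every magnitude bit off
idle = pca_output(pixels, weights)
weights[2][1] = (1, 1, 1, 1)  # +1.75
weights[2][3] = (0, 1, 0, 0)  # -1.00
active = pca_output(pixels, weights)
```

With all magnitude bits cleared no cell contributes, matching the statement that a zero weight adds no current to the output.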

PCA Process

Figure 22 shows one embodiment of the operating process of the programmable convolution array (PCA) described above. In one embodiment, before the process begins, the weight patterns to be applied to the array are determined for all operations to be performed. A program thus consists of the operations needed to accomplish a task: each step of the program specifies one operation, and each operation is defined by the weight pattern to be applied to the array. In one embodiment, a weight pattern can be derived simply from the result of a preceding operation. In one embodiment, the program can be stored externally or on the same chip as the array. In an alternative embodiment, each cell of the array has its own program memory.

Referring back to Figure 22, the operating process of the PCA described above can be carried out as follows:

1.开始1. start

2.若完成前项操作或若需要新的数据,则取得新的输入。这些输入可以来自从建立在阵列中的传感器连续地或分立地取样的数据,或作为变通,可以利用适当的扫描装置从外部数据源扫描到输入节点中。2. If the previous operation is completed or if new data is required, new input is obtained. These inputs may come from continuously or discretely sampled data from sensors built into the array, or alternatively may be scanned into input nodes from external data sources using suitable scanning means.

3.从外部或从内部程序存储器中,施加指令的权重形式。3. Apply the weighted form of the instruction either externally or from internal program memory.

4.使时间足以完成计算,亦即时间足以使电路达到阵列的输出数值的改变等于0的状态。对于不产生静态输出的有源传感器,可能存在一个特定的最佳瞬时来对其输出进行取样。另一种选择是输出本身是一种时间变量波形。4. Make the time sufficient to complete the calculation, ie the time sufficient for the circuit to reach a state where the change in the output value of the array is equal to zero. For active sensors that do not produce a static output, there may be a particular optimal instant at which to sample their output. Another option is for the output itself to be a time-varying waveform.

5.读取并存储阵列的输出。此输出可以被外部或内部存储。若PCA被用于具有PCLA的级联中,则输出也可以被馈送到PCLA。5. Read and store the output of the array. This output can be stored externally or internally. If PCA is used in cascade with PCLA, the output can also be fed to PCLA.

6.等待下一个操作6. Wait for the next operation

7.若程序终止,则到结束。7. If the program terminates, go to the end.

8.到(2)。8. To (2).

9.结束。9. End.
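
The steps above can be sketched in software. The following is a minimal model, not the analog circuit described here: the function names (`convolve`, `settle`, `run_pca_program`) are hypothetical, the weighted sum stands in for the array's analog computation, and step 4 is approximated by a first-order relaxation that stops once the output no longer changes by more than an assumed tolerance.

```python
# Illustrative software model of the PCA operating loop; all names are hypothetical.

def convolve(image, weights):
    """Weighted sum over each pixel's neighbourhood, with zero padding at the edges.

    weights[dr + k][dc + k] multiplies the pixel at offset (dr, dc); this is a
    correlation, so flip the pattern if a strict convolution is wanted.
    """
    rows, cols = len(image), len(image[0])
    k = len(weights) // 2  # radius of the square (2k+1) x (2k+1) weight pattern
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[r][c] += weights[dr + k][dc + k] * image[rr][cc]
    return out


def settle(target, rate=0.5, tol=1e-9, max_steps=1000):
    """Step 4: relax toward the target until the largest output change is below tol."""
    rows, cols = len(target), len(target[0])
    state = [[0.0] * cols for _ in range(rows)]
    for _ in range(max_steps):
        change = 0.0
        for r in range(rows):
            for c in range(cols):
                delta = rate * (target[r][c] - state[r][c])
                state[r][c] += delta
                change = max(change, abs(delta))
        if change < tol:
            break
    return state


def run_pca_program(get_input, program):
    """Steps 2 to 8: one weight pattern per program step, outputs stored in order."""
    stored = []
    for weights in program:                 # steps 7 and 8: loop until the program ends
        image = get_input()                 # step 2: acquire new input
        target = convolve(image, weights)   # step 3: apply the weight pattern
        stored.append(settle(target))       # steps 4 and 5: settle, then read and store
    return stored
```

Calling `run_pca_program(lambda: image, [pattern_a, pattern_b])` returns one settled output per weight pattern; in a cascade, each stored output would be handed to the PCLA instead of, or in addition to, being stored.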

Programmable Cellular Logic Array (PCLA)

The output of the convolution operation performed by the PCA is usually an analog current or voltage value. Although this rapid convolution with an influence function of arbitrary size is extremely useful, many preliminary vision operations require further processing of the results of many convolution operations. The programmable cellular logic array (PCLA) is one way to implement this stage.

In one embodiment, the PCLA is a processing device that handles binary, that is digital, data using a set of fixed or programmable logic elements arranged on a cellular grid. When a PCLA is used together with a PCA, the two arrays may be the same size or different sizes. In one embodiment, the PCLA grid can serve as a "scratchpad" for a set of image operations such as shape detection, contour following, and pattern matching, all of which operate on the output of the PCA or on an external output, possibly at a resolution different from that of the received data.

Moreover, the PCLA elements can be accessed like a memory array, to which inputs (for example, the output of the PCA) can be applied or stored and from which results can be retrieved. Additional functionality may come from the ability to transfer bits between neighbors (similar to a shift operation) or to transfer bits to an arbitrary location on the cellular logic grid (random-access transfer).

The embodiment of Figure 23 shows a simple arrangement of programmable cellular logic array (PCLA) elements. In Figure 23, only the connections of the center cell are shown to avoid clutter, and the cell outputs have been staged to prevent race conditions. In this embodiment, all operations are strictly local. This is not a limitation of the invention. In one embodiment, global operations such as matching against a global pattern, reset, and set can also be implemented in the PCLA. Functionality can also be extended in two ways: (1) increased connectivity, in which each cell receives more inputs, for example from additional cells in its neighborhood or, through memory-like access, from any cell of the array; and (2) increased cell memory, in which the results of preceding operations can be stored in the cell. In one embodiment, the logic in the cell should be programmable. Indeed, the program may be changed many times during operation. Example logic functions include AND, OR, INVERT, and XOR of a set of available inputs.
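
As a rough illustration of strictly local operation with staged outputs, the sketch below updates every cell from the previous grid and writes into a separate buffer, so no cell ever reads a neighbor's half-updated value. The four-neighbor connectivity, the zero value assumed beyond the grid edge, and all function names are assumptions made for the example, not details taken from Figure 23.

```python
# Hypothetical model of one synchronous PCLA update; names are illustrative only.

def logic_and(bits):
    return int(all(bits))


def logic_or(bits):
    return int(any(bits))


def logic_xor(bits):
    return sum(bits) % 2


def pcla_step(grid, func):
    """Apply `func` to each cell's own bit and its four nearest neighbours.

    The result is written to a new grid (the "staged" output), so every cell
    sees only values from the previous step. Cells beyond the edge read as 0.
    """
    rows, cols = len(grid), len(grid[0])
    staged = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            bits = [grid[r][c]]
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                bits.append(grid[rr][cc] if inside else 0)
            staged[r][c] = func(bits)
    return staged
```

Passing a different `func` on each call corresponds to reprogramming the cell logic between operations; OR grows a shape by one cell and AND shrinks it.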

In one embodiment, the PCLA may benefit from the emerging use of reconfigurable field-programmable gate arrays (FPGAs). This is a relatively new field, however, and researchers are still working out how to exploit the inherent flexibility of these devices so that better computing paradigms can be built easily.

One specific concept in this area, called the plastic cell architecture, has led to a new type of circuit laid out as an array of identical computational elements, or cells, that can dynamically reconfigure themselves for a specific problem. This computing paradigm adds a feature beyond the ordinary reconfigurable FPGA concept, in which a circuit can only be reconfigured by software downloaded to one or more FPGAs, after which the chip directly executes the prescribed hard-wired circuit function. The added feature is the ability of one circuit to dynamically construct another. The resulting processing array can mimic the ability to generate specialized cells, which in turn allows a cellular array such as the PCLA to configure itself according to the outputs of its neighbors or its own output. This degree of data-driven behavior makes it possible to realize very complex functions from very simple rules.

PCLA Process

Figure 24 shows one embodiment of the operating procedure of the programmable cellular logic array (PCLA) described above. In one embodiment, the required local and global logic functions are determined before the procedure begins. These logic functions should either already be implemented in the array elements that must execute them, or it should be possible to program the logic contained in each array element to perform them. In general, an operation is defined in terms of the operations carried out by the individual elements of the array.

Thus, a program consists of the sequence of operations required to accomplish a task, with each program step specifying one or more functions to be performed by the elements of the array. In one embodiment, at any given time all elements may perform the same function or different functions. Likewise, in one embodiment, the next operation may be derived from one or more preceding operations. In one embodiment, the program can be stored externally or on the same chip that implements the array. In an alternative embodiment, the program memory may be local, that is, each cell of the array may have its own program memory.

Referring back to Figure 24, one embodiment of the PCLA operating procedure can be carried out through the following steps:

1. Start.

2. If the preceding operation is complete, or if new data is needed, acquire new input. The input may be data sampled continuously or discretely from sensors built into the array, or may be scanned into the input nodes, using a suitable scanning device, from an external data source including but not limited to a PCA.

3. Perform the specified operation. The opcode specifying the operation may be applied from an external source or from internal program memory.

4. Allow enough time for the computation to complete, that is, enough time for the circuit to reach a state in which the change in the array's output values is zero. The output of active sensors should first be digitized.

5. Read and store the output of the array. The output may be stored externally or internally. If the PCLA is cascaded with a PCA, the output may also be fed to the PCA.

6. Wait for the next operation.

7. If the program has terminated, go to End.

8. Go to (2).

9. End.
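
Step 4 of this procedure, waiting until the array output stops changing, can be illustrated with an operation that needs several local updates to finish. The sketch below is an assumed example, not an operation named in the text: each cell ORs itself with its four neighbors and then ANDs the result with a fixed mask, so a seed bit spreads through the connected shape that contains it. The function name `propagate` and the iteration limit are hypothetical.

```python
# Hypothetical example of running a local PCLA operation to a steady state.

def propagate(seed, mask, max_steps=10_000):
    """Spread the seed bits through connected mask cells until nothing changes."""
    rows, cols = len(mask), len(mask[0])
    state = [row[:] for row in seed]
    for _ in range(max_steps):
        staged = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                near = state[r][c]
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        near |= state[rr][cc]
                staged[r][c] = near & mask[r][c]  # stay inside the shape
        if staged == state:   # change in the output equals zero: computation done
            return staged
        state = staged
    raise RuntimeError("array did not settle within max_steps")
```

The number of updates needed depends on the data, which is why the procedure waits for a settled output rather than for a fixed delay.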

Communication between PCA and PCLA

Clearly, in addition to the power and ground signals, many other data, control, and address signals usually need to be distributed across the chip. As with many photosensor chips, the PCA portion of the chip should be covered by a metal layer that has windows only over the photosensitive regions, so that the package can be exposed to light. A metal layer covering the whole chip can usually be grounded. A similar layer can also be used over the PCLA portion of the dual architecture to route the output of the PCA to all points of the logic array.

The row-select and column-select signals of the PCA are reminiscent of a memory addressing scheme. A particular pixel is selected only when both signals are logic high. These signals can be used to load weights into the cells.

The elements of the PCLA can also be addressed like memory cells. A row of cells can be thought of as a long word. A set of write and shift operations can replace the need to route multiple address signals across the PCLA.
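
The two addressing ideas above can be sketched as follows. This is an assumed software analogy with hypothetical names: `select` models a cell being chosen only when both its row line and its column line are high, `load_weight` writes a value into the selected cells, and `shift_in` loads a row serially, one bit per shift, so no per-cell address lines are needed.

```python
# Hypothetical illustration of row/column selection and shift-based loading.

def select(row_sel, col_sel):
    """Return a mask with 1 where both the row line and the column line are high."""
    return [[r & c for c in col_sel] for r in row_sel]


def load_weight(weights, row_sel, col_sel, value):
    """Write `value` into every selected cell of `weights` (modified in place)."""
    for r, row_high in enumerate(row_sel):
        for c, col_high in enumerate(col_sel):
            if row_high and col_high:
                weights[r][c] = value
    return weights


def shift_in(word, bits):
    """Treat a row of cells as a long word and load it by repeated shifts.

    Each new bit enters at the left end and pushes earlier bits to the right,
    so the first bit supplied ends up furthest to the right.
    """
    word = list(word)
    for bit in bits:
        word = [bit] + word[:-1]
    return word
```

With a four-cell row, shifting in the bits 1, 0, 1, 1 leaves the row holding them in reverse order of arrival, which is the usual behavior of a serial shift register.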

PCA and PCLA Cascade Process

Figure 25 shows one embodiment of an operating procedure that uses a programmable convolution array (PCA) and a programmable cellular logic array (PCLA) in cascade. In one embodiment, the weight patterns that must be applied to the PCA are determined, for all operations to be performed, before the procedure begins. Likewise, the local and global logic functions required by the PCLA are determined. In one embodiment, an operation is defined in terms of the weight pattern of the PCA and the logic functions carried out by the individual elements of the PCLA.

Thus, in one embodiment, a program consists of the sequence of operations required to accomplish a task, with each program step specifying one or more operations to be performed by the PCA, the PCLA, or both. In one embodiment, the program can be stored externally, or on the same chip or multi-chip module that implements the two arrays. In an alternative embodiment, the program memory may be central or local, that is, each cell of the array may have its own program memory.

Referring back to Figure 25, one embodiment of the operating procedure for the cascaded PCA and PCLA can be carried out through the following steps:

1. Start.

2. If the preceding operation is complete, or if new data is needed, acquire new input. The input may be data sampled continuously or discretely from sensors built into one or more of the arrays, or may be scanned into the input nodes, using a suitable scanning device, from an external data source, including but not limited to transfer from one array to the other.

3. Perform the specified operation. The opcode specifying the operation may be applied from an external source or from internal program memory.

4. Allow enough time for the computation to complete, that is, enough time for the circuit to reach a state in which the change in the array's output values is zero, or to satisfy some other criterion by which the operation is considered complete.

5. Read and store the output of the array. The output may be stored externally or internally. The output may also be fed from one array to the other.

6. Wait for the next operation.

7. If the program has terminated, go to End.

8. Go to (2).

9. End.
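
A compact way to picture the cascade is a three-stage pipeline: an analog weighted sum (the PCA), a digitizing threshold, and a local logic operation (the PCLA). The sketch below is only an assumed example. The 3x3 weight pattern, the threshold, and the particular logic rule (keep a set bit only if at least one of its four neighbors is also set, which removes isolated pixels) are illustrative choices, and all function names are hypothetical.

```python
# Hypothetical PCA -> digitize -> PCLA pipeline; names and rules are illustrative.

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def pca_stage(image, weights):
    """Analog stage: 3x3 weighted sum around each pixel, zero padding at the edges."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[r][c] += weights[dr + 1][dc + 1] * image[rr][cc]
    return out


def digitize(analog, threshold):
    """Convert the analog PCA output to the binary form the PCLA works on."""
    return [[int(v > threshold) for v in row] for row in analog]


def pcla_stage(bits):
    """Logic stage: keep a set bit only if some four-neighbour is also set."""
    rows, cols = len(bits), len(bits[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            has_neighbour = any(
                bits[r + dr][c + dc]
                for dr, dc in NEIGHBOURS
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            )
            out[r][c] = int(bits[r][c] and has_neighbour)
    return out


def cascade(image, weights, threshold):
    """Feed the PCA output, once digitized, into the PCLA."""
    return pcla_stage(digitize(pca_stage(image, weights), threshold))
```

Each stage consumes only the previous stage's output, which mirrors step 5 of the procedure, where the output of one array is fed to the other.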

Application Example: Single-Chip Fingerprint Lock Control

The concepts of the present invention can be applied to fingerprint recognition systems. Current fingerprint systems require a sensor, memory, a processor, and loaded software, which limits their practical use. There are, however, countless applications for a single chip that can accomplish the same task without an external processor, memory, or program. Examples include door locks, drawer locks, and power-button locks keyed to a fingerprint. The lock stays closed until an "authorized" fingerprint is pressed against it.

In this example, because the fingerprint image is binary, it is practical to fit the PCLA with binary sensors and use it on its own. The logic program that manipulates the image and determines whether the fingerprint is the correct one can be built directly into the device. Alternatively, if a linear transformation of the fingerprint image alone is sufficient, a stand-alone PCA can be built; the program is then a set of weights, which can be stored on the device. As a further option, using both a PCLA and a PCA can enhance features and improve resolution, thereby providing better security.
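
The text does not say how the stored logic decides that a fingerprint is the correct one. Purely as an assumed illustration, the sketch below compares a binary sensed image with a stored binary template using a cell-by-cell XOR, an operation the PCLA cells are described as supporting, and unlocks when the number of mismatching cells is within a tolerance. The function names and the tolerance parameter are hypothetical, and a practical matcher would also need alignment and feature extraction, which are not modeled here.

```python
# Hypothetical binary template match; not the matching method of the patent.

def mismatch_count(image, template):
    """Count cells where the sensed bit differs from the stored bit (XOR per cell)."""
    return sum(
        a ^ b
        for image_row, template_row in zip(image, template)
        for a, b in zip(image_row, template_row)
    )


def unlock(image, template, max_mismatches):
    """Release the lock only if the images differ in at most max_mismatches cells."""
    return mismatch_count(image, template) <= max_mismatches
```

Here the stored template plays the role of the on-chip program data, so no external processor or memory is involved in the decision.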

Computer Environment

Figure 26 and the accompanying description are intended to provide a general description of a suitable computing environment in which the invention may be implemented. Although not required, one embodiment of the invention can be implemented as a generally organized set of computer-executable instructions, such as program modules, executed by a computer such as a client workstation or a server. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types.

Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including handheld devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, communication devices, network PCs, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments in which tasks are performed by remote processing devices linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

As shown in Figure 26, an exemplary general-purpose computing system may include a conventional personal computer 20 or the like, which includes a processor 21, a system memory 22, and a system bus 23 that couples various system components, including the system memory 22, to the processor 21. The system bus 23 may be any of several types of bus structure, including a memory bus or memory controller, a peripheral bus, and a local bus, using any of a variety of bus architectures. The system memory 22 may include read-only memory (ROM) 24 and random-access memory (RAM) 25. A basic input/output system 26 (BIOS), containing the basic routines that help transfer information between elements within the personal computer 20, for example during start-up, may be stored in ROM 24. The personal computer 20 may further include a hard disk drive 27 for reading from and writing to a hard disk (not shown), a magnetic disk drive 28 for reading from or writing to a removable magnetic disk 29, and an optical disk drive 30 for reading from or writing to a removable optical disk 31 such as a CD-ROM or other optical media. The hard disk drive 27, magnetic disk drive 28, and optical disk drive 30 may be connected to the system bus 23 by a hard disk drive interface 32, a magnetic disk drive interface 33, and an optical drive interface 34, respectively. The drives and their associated computer-readable media provide non-volatile storage of computer-readable instructions, data structures, program modules, and other data for the personal computer 20.

Although the exemplary embodiment described here may use a hard disk, a removable magnetic disk 29, and a removable optical disk 31, or a combination of them, those skilled in the art should appreciate that other types of computer-readable media capable of storing computer-accessible data may also be used in the exemplary operating environment, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, random-access memories (RAMs), read-only memories (ROMs), and the like.

A number of program modules, including an operating system 35, one or more application programs 36, other program modules 37, and program data 38, may be stored on the hard disk, magnetic disk 29, optical disk 31, ROM 24, or RAM 25. A user may enter commands and information into the personal computer 20 through input devices such as a keyboard 40 and a pointing device 42. Other input devices (not shown) may include one or more microphones, a joystick, a game pad, a satellite dish, a scanner, or the like. These and other input devices are often connected to the processor 21 through a serial port interface 46 that is coupled to the system bus 23, but may be connected by other interfaces, such as a parallel port, a game port, or a universal serial bus (USB). A monitor 47 or other type of display device may also be connected to the system bus 23 via an interface such as a video adapter 48. In addition to the monitor 47, a personal computer typically includes other peripheral output devices (not shown), such as speakers and printers.

The personal computer 20 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 49. The remote computer 49 may be another personal computer, a server, a router, a network PC, a peer device, or another common network node, and typically includes many or all of the elements described above relative to the personal computer 20, although only a memory storage device 50 is shown in Figure 26. The logical connections shown in Figure 26 include a local area network (LAN) 51 and a wide area network (WAN) 52. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.

When used in a LAN networking environment, the personal computer 20 is connected to the LAN 51 through a network interface or adapter 53. When used in a WAN networking environment, the personal computer 20 typically includes a modem 54 or other means for establishing communications over the wide area network 52, such as the Internet. The modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46. In a networked environment, the program modules described relative to the personal computer 20, or portions of them, may be stored in the remote memory storage device. It will be appreciated that the network connections shown are exemplary and that other means of establishing a communications link between the computers may be used.

It is also to be understood that various elements or components may be included in or excluded from the general computing environment, or combined, to implement the concepts of the invention as defined by the appended claims.

Network Environment

As noted, the general-purpose computer described above can be deployed as part of a computer network. In general, the above description applies to both server computers and client computers deployed in a network environment. Figure 27 shows one such exemplary network environment in which the present invention may be employed. As shown in Figure 27, a number of servers 10a, 10b, etc. are interconnected via a communications network 160 (which may be a LAN, WAN, intranet, or the Internet) with a number of client computers 20a, 20b, 20c, etc. In a network environment in which the communications network 160 is, for example, the Internet, the servers 10 can be Web servers with which the client computers 20 communicate via any of a number of known protocols, such as the Hypertext Transfer Protocol (HTTP). Each client computer 20 can be equipped with a browser 180 and client application software 185 for accessing the servers 10. As shown in the embodiment of Figure 27, server 10a includes or is coupled to a dynamic database 12.

As shown, the database 12 may include database fields 12a, which contain information about the items stored in the database 12. The database fields 12a can be structured in the database in a variety of ways; for example, they could be organized as linked lists, multidimensional data arrays, hash tables, or the like. This is generally a design choice based on ease of implementation, the amount of free memory, the characteristics of the data to be stored, whether the database is likely to be written to frequently or mostly read from, and so on. A generic field 12a is shown on the left. As shown, a field generally has subfields that contain various kinds of information about the field, such as an ID or title subfield, an item-type subfield, subfields containing characteristics, and so on. These database fields 12a are shown for illustrative purposes only and, as mentioned, the particular implementation of data storage in a database can vary widely according to preference.

Thus, the present invention can be used in a computer network environment having client computers for accessing and interacting with the network and a server computer for interacting with the client computers and communicating with a database that holds stored fields. Likewise, the invention can be implemented with a variety of network-based architectures and should therefore not be limited to the example shown.

From the above description and drawings, those of ordinary skill in the art will understand that the particular embodiments shown and described are for purposes of illustration only and are not intended to limit the scope of the invention. Those of ordinary skill in the art will recognize that the invention may be embodied in other specific forms without departing from its spirit or essential characteristics. References to details of particular embodiments are not intended to limit the scope of the claims.

Claims (3)

1. An integrated sensor device, comprising:
an array of sensing processor cells capable of being arranged into a sensing array, each sensing processor cell comprising:
a sensing medium;
at least one transconductance amplifier configured for feedforward template multiplication;
at least one transconductance amplifier configured for feedback template weights;
a plurality of local dynamic memory cells;
a data bus for data transfer; and
a local logic unit;
wherein the array of sensing processor cells, in response to data control signals, is capable of transforming, reshaping, and modulating the originally sensed image into various representations that include (and extend) conventional spatial and temporal processing transformations.
2. An integrated sensor processing cell device, comprising:
a sensing medium configured to produce a signed pixel output;
at least one memory device configured to store weight bits, wherein at least one memory device is configured to store the signed pixel output;
a plurality of multipliers operatively associated with the at least one memory device;
at least one transconductance amplifier operatively associated with at least one of the plurality of multipliers, the at least one transconductance amplifier being used to operate in a unity follower configuration having variable gain; and
a multi-input logic gate associated with the at least one memory device used to store the signed pixel output;
wherein the integrated sensor processing cell, in response to data control signals, is capable of transforming, reshaping, and modulating the originally sensed image into various representations that include (and extend) conventional spatial and temporal processing transformations.
3. An integrated sensing and imaging device, comprising:
an array of sensing processor cells capable of being arranged into a sensing array, each said sensing processor cell comprising:
a sensing medium for producing an electrically representable signal output;
means for quantizing said sensor output;
means for storing said sensor output; and
a plurality of programmable logic elements arranged on a cellular grid, wherein each logic element can receive inputs from the outputs of neighboring cells and from its own output;
wherein the integrated imaging device, in response to internal or external data control signals, is capable of transforming, reshaping, and modulating the originally sensed signal into various representations that include (and extend) conventional spatial and temporal processing transformations.
CN00807117A 1999-03-05 2000-03-06 Two architectures for integrated realization of sensing and processing in a single device Pending CN1457471A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12317799P 1999-03-05 1999-03-05
US60/123,177 1999-03-05

Publications (1)

Publication Number Publication Date
CN1457471A true CN1457471A (en) 2003-11-19

Family

ID=22407147

Family Applications (1)

Application Number Title Priority Date Filing Date
CN00807117A Pending CN1457471A (en) 1999-03-05 2000-03-06 Two architectures for integrated realization of sensing and processing in a single device

Country Status (7)

Country Link
EP (1) EP1175659A2 (en)
JP (1) JP2002538557A (en)
CN (1) CN1457471A (en)
AU (1) AU3616700A (en)
CA (1) CA2364182A1 (en)
HK (1) HK1039992A1 (en)
WO (1) WO2000052639A2 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6278377B1 (en) 1999-08-25 2001-08-21 Donnelly Corporation Indicator for vehicle accessory
US7447320B2 (en) 2001-02-14 2008-11-04 Gentex Corporation Vehicle accessory microphone
CA2387125C (en) 1999-11-19 2011-10-18 Gentex Corporation Vehicle accessory microphone
US7120261B1 (en) 1999-11-19 2006-10-10 Gentex Corporation Vehicle accessory microphone
AU2002250080A1 (en) 2001-02-14 2002-08-28 Gentex Corporation Vehicle accessory microphone
ES2209642B1 (en) * 2002-11-04 2005-10-01 Innovaciones Microelectronicas, S.L. MIXED SIGNAL PROGRAMMED INTEGRATED CIRCUIT ARCHITECTURE FOR THE PERFORMANCE OF AUTONOMOUS VISION SYSTEMS OF A SINGLE CHIP AND / OR PRE-PROCESSING OF IMAGES IN HIGHER LEVEL SYSTEMS.
CN111368253B (en) * 2018-12-26 2023-09-26 兆易创新科技集团股份有限公司 A convolution operation method and device based on non-volatile memory

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5140670A (en) * 1989-10-05 1992-08-18 Regents Of The University Of California Cellular neural network
US5355528A (en) * 1992-10-13 1994-10-11 The Regents Of The University Of California Reprogrammable CNN and supercomputer

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112116099A (en) * 2017-04-09 2020-12-22 英特尔公司 Machine learning sparse computation mechanism
US12141891B2 (en) 2017-04-09 2024-11-12 Intel Corporation Machine learning sparse computation mechanism
CN111539178A (en) * 2020-04-26 2020-08-14 成都市深思创芯科技有限公司 Chip layout design method and system based on neural network and manufacturing method
CN111983629A (en) * 2020-08-14 2020-11-24 西安应用光学研究所 Linear array signal target extraction device and extraction method
CN111983629B (en) * 2020-08-14 2024-03-26 西安应用光学研究所 Linear array signal target extraction device and extraction method

Also Published As

Publication number Publication date
JP2002538557A (en) 2002-11-12
HK1039992A1 (en) 2002-05-17
WO2000052639A2 (en) 2000-09-08
AU3616700A (en) 2000-09-21
CA2364182A1 (en) 2000-09-08
WO2000052639A3 (en) 2001-02-15
EP1175659A2 (en) 2002-01-30

CN1313918C (en) Method and device for base transfer in finite extent
CN1581143A (en) Information processing apparatus and method, program storage medium and program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication