
CN105844330B - Data processing method of neural network processor and neural network processor - Google Patents


Info

Publication number
CN105844330B
CN105844330B (application number CN201610165618.2A)
Authority
CN
China
Prior art keywords
input data
data
neural network
nonlinear mapping
absolute value
Prior art date
Legal status
Active
Application number
CN201610165618.2A
Other languages
Chinese (zh)
Other versions
CN105844330A (en)
Inventor
费旭东
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201610165618.2A priority Critical patent/CN105844330B/en
Publication of CN105844330A publication Critical patent/CN105844330A/en
Application granted granted Critical
Publication of CN105844330B publication Critical patent/CN105844330B/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/06 — Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 — Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means


Abstract

Embodiments of the present invention provide a data processing method for a neural network processor, and a neural network processor. The method comprises: adding input data to the corresponding weight absolute values by an adder, where the input data are the data output by the previous stage and the input data and weight absolute values are n-element vectors; sequentially performing a first nonlinear mapping n times on the n items of data obtained by the addition; performing n accumulation operations on the mapped results by an accumulator, the accumulation operations comprising addition and subtraction controlled by the weight sign bit; and performing a second nonlinear mapping on the accumulated result to obtain the processing result and output the data, the second nonlinear mapping being formulated according to the rule of the neural network's nonlinear mapping and the inverse of the first nonlinear mapping. Quantization efficiency is thereby improved, and the storage and bandwidth requirements of the data are reduced.

Description

Data processing method of neural network processor and neural network processor
Technical Field
The embodiment of the invention relates to the technical field of electronic chips, in particular to a data processing method of a neural network processor and the neural network processor.
Background
Neural networks and deep learning algorithms have been applied with great success and are developing rapidly, and it is widely expected in the industry that these new computing methods will help realize more general and complex intelligent applications. In recent years, neural networks and deep learning algorithms have achieved outstanding results in the field of image recognition, drawing industry attention to the optimization and efficient implementation of such algorithms; companies such as Facebook, Qualcomm, Baidu and Google have all invested in research on neural network optimization algorithms. Qualcomm has announced plans to integrate neural network processing modules into next-generation chips. Improving the processing efficiency of neural network algorithms, the related algorithms themselves, and the efficiency of their chip implementation is therefore a core issue of attention and research.
FIG. 1 is a schematic diagram of an n-level (layer) neural network computational model. The neural network computes one neuron in the form y = f(x1×w1 + x2×w2 + … + xn×wn + b); the calculation proceeds in stages, and the output of each stage is the input of the next. Fig. 2 is a flow chart of the conventional calculation method: the preceding-stage outputs are taken as the data inputs (x1, x2, … xn); x1, x2, … xn are multiplied by the corresponding weight parameters; an accumulator completes the accumulation x1×w1 + x2×w2 + … + xn×wn + b; the nonlinear mapping y = f(accumulated result) then yields the calculation result; and finally the data output is completed.
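As a point of reference, the conventional multiply-accumulate flow of fig. 2 can be sketched as follows (a minimal illustration; the function and variable names are ours, not the patent's):

```python
def neuron_conventional(x, w, b, f):
    """Conventional neuron: multiply each input by its weight, accumulate, apply f."""
    acc = b
    for xi, wi in zip(x, w):
        acc += xi * wi  # one multiplication per input term
    return f(acc)

# Example with a ReLU nonlinearity: 1*0.5 + 2*(-0.25) + 0.1 = 0.1
relu = lambda t: max(0.0, t)
y = neuron_conventional([1.0, 2.0], [0.5, -0.25], 0.1, relu)
```

Each of the n terms costs a full multiplication, which is what the method below avoids.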
It can be seen that, in the above data processing method, the multiplications involved have relatively high computational complexity; under a given accuracy requirement, the corresponding data storage and bandwidth requirements are therefore also high, and the calculation efficiency is low.
Disclosure of Invention
The embodiment of the invention provides a data processing method of a neural network processor and the neural network processor, and aims to solve the problems of high data storage requirement and bandwidth requirement and low calculation efficiency in the existing processing method.
In a first aspect, an embodiment of the present invention provides a data processing method for a neural network processor, where the method includes: firstly, adding input data and corresponding weight absolute values by an adder, wherein the input data is output data of a previous stage, and the input data and the weight absolute values are n-element vectors. And then sequentially carrying out n times of first nonlinear mapping on the input data and the n items of data obtained by adding the corresponding weight absolute values. And then the result after the first nonlinear mapping is subjected to n times of accumulation operations through an accumulator, wherein the accumulation operations comprise addition operations and subtraction operations controlled by the sign bit of the weight. And finally, performing second nonlinear mapping on the result obtained after the n times of accumulation operation to obtain a processing result and outputting data, wherein the second nonlinear mapping is formulated according to the rule of the nonlinear mapping of the neural network and the inverse mapping of the first nonlinear mapping. Therefore, the complex multiplication calculation is converted into the addition calculation, the quantization efficiency is improved, and the storage capacity and the bandwidth can be compressed, so that the storage requirement and the bandwidth requirement of data are reduced, and the calculation efficiency is improved. And the input data is not limited to 0/1 binary quantization, so that the calculation precision meets the requirement of an actual application network, and the method can be applied to a wider range of application targets besides neural network calculation.
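The four steps of the first aspect can be sketched in software as follows (an illustrative model only, with hypothetical names; the patent implements these steps as discrete circuits, and here g(x) = 2^x is assumed as the first mapping and log2 of f as the second):

```python
import math

def neuron_log_domain(v, c, s, b, f):
    """Multiplication-free neuron, following the method's four steps (sketch).
    v[i] = log2(x[i]) and c[i] = log2(|w[i]|): data arrive already in log form.
    s[i] = +1/-1 is the weight sign bit; b is the bias; f is the network's nonlinearity.
    """
    acc = b
    for vi, ci, si in zip(v, c, s):
        t = vi + ci            # step 1: the adder replaces the multiplication
        m = 2.0 ** t           # step 2: first nonlinear mapping g(x) = 2^x
        acc += si * m          # step 3: sign-controlled add/subtract accumulation
    return math.log2(f(acc))   # step 4: second mapping ff = log2(f(.)), so the
                               # output is again in log form for the next stage

# Check against the ordinary computation y = f(x1*w1 + x2*w2 + b):
x, w, b = [4.0, 2.0], [0.5, -0.25], 1.0
v = [math.log2(xi) for xi in x]
c = [math.log2(abs(wi)) for wi in w]
s = [1 if wi >= 0 else -1 for wi in w]
f = lambda t: t  # identity nonlinearity for the check
u = neuron_log_domain(v, c, s, b, f)
# 2**u should equal 4*0.5 + 2*(-0.25) + 1 = 2.5
```

The inner loop contains only an addition, a power-of-2 mapping, and a signed accumulate, which matches the adder / first-mapping / accumulator pipeline described above.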
In one possible design, the first nonlinear mapping is a power-of-2 transformation 2^M, where M is each item of the n items of data obtained by adding the input data to the corresponding weight absolute values. The power-of-2 transformation has a simple mapping relation, so its hardware implementation cost is low.
In one possible design, when the first nonlinear mapping is a power transformation m^M with base m not equal to 2, the power-of-m transformation is converted into a power-of-2 transformation to simplify the circuit implementation. Before the adder adds the input data and the corresponding weight absolute values, the method further includes: multiplying the input data by a scaling factor K1, and/or multiplying the weight absolute values by a scaling factor K2, where K1 and K2 may be equal or unequal; or, after the adder adds the input data and the corresponding weight absolute values, the method further includes: multiplying each of the n items of added data by a scaling factor K3. Here K1, K2 and K3 are all nonzero.
In one possible design, K1, K2 and K3 are of the form 1+1/2^N or 1-1/2^N.
In one possible design, when the input data or the weight absolute value equals 0, the accumulation operation leaves the current accumulation term unchanged, i.e. the accumulator holds its state; since a large number of zeros occur in practical neural network calculations, both in the input data and in the weights, this simplifies the processing and reduces power consumption. When the weight sign bit is negative, the accumulation operation is a subtraction.
In one possible design, the first or second non-linear mapping or accumulation operation is implemented by analog circuitry. The analog nonlinear conversion and the addition can be realized instantaneously without depending on the speed of a digital clock.
In a second aspect, an embodiment of the present invention provides a neural network processor, including: an addition circuit for adding input data and the corresponding weight absolute values, where the input data are the data output by the previous stage and the input data and weight absolute values are n-element vectors; a first nonlinear mapping circuit for sequentially performing a first nonlinear mapping n times on the n items of data obtained by the addition; an accumulation circuit for performing n accumulation operations on the mapped result, the accumulation operations including addition and subtraction controlled by the weight sign bit; and a second nonlinear mapping circuit for performing a second nonlinear mapping on the result of the n accumulation operations to obtain the processing result and output the data, the second nonlinear mapping being formulated according to the rule of the neural network's nonlinear mapping and the inverse of the first nonlinear mapping.
In one possible design, the first nonlinear mapping circuit is a power-of-2 transformation circuit computing 2^M, where M is each item of the n items of data obtained by adding the input data and the corresponding weight absolute values.
In one possible design, the processor further includes: a first multiplier circuit for multiplying the input data by a scaling factor K1 before the addition circuit adds the input data and the corresponding weight absolute values; and/or a second multiplier circuit for multiplying the weight absolute values by a scaling factor K2 before the addition circuit adds the input data and the corresponding weight absolute values, where K1 and K2 may be equal or unequal; or, a third multiplier circuit for multiplying each of the n items of added data by a scaling factor K3 after the addition circuit adds the input data and the corresponding weight absolute values. Here K1, K2 and K3 are all nonzero.
In one possible design, K1, K2 and K3 are of the form 1+1/2^N or 1-1/2^N.
In one possible design, when the input data or the weight absolute value equals 0, the accumulation operation leaves the current accumulation term unchanged; when the weight sign bit is negative, the accumulation operation is a subtraction.
The beneficial effects of the neural network processor provided in the second aspect and in each possible design of the second aspect may refer to the beneficial effects brought by the first aspect and each possible design of the first aspect, and are not described herein again.
The data processing method of the neural network processor and the neural network processor provided by the embodiment of the invention add input data and corresponding weight absolute values through an adder, the input data is data output by a previous stage, then n items of data obtained by adding the input data and the corresponding weight absolute values are sequentially subjected to n times of first nonlinear mapping, the result obtained after the first nonlinear mapping is subjected to n times of accumulation operation through an accumulator, the accumulation operation comprises addition operation and subtraction operation controlled by weight sign bits, finally the result obtained after the n times of accumulation operation is subjected to second nonlinear mapping to obtain a processing result and output the data, and the second nonlinear mapping is formulated according to the rule of the neural network nonlinear mapping and the inverse mapping of the first nonlinear mapping. Therefore, the complex multiplication calculation is converted into the addition calculation, the quantization efficiency is improved, and the storage capacity and the bandwidth can be compressed, so that the storage requirement and the bandwidth requirement of data are reduced, and the calculation efficiency is improved. And the input data is not limited to 0/1 binary quantization, so that the calculation precision meets the requirement of an actual application network, and the method can be applied to a wider range of application targets besides neural network calculation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic diagram of an n-level (layer) neural network computational model;
FIG. 2 is a flow chart of a conventional computing method;
FIG. 3 is a flowchart of a first embodiment of a data processing method of a neural network processor according to the present invention;
FIG. 4 is a block diagram of a second embodiment of a data processing method of a neural network processor according to the present invention;
FIG. 5 is a block diagram of a data processing method of a neural network processor according to a third embodiment of the present invention;
FIG. 6 is a schematic diagram of the power-of-2 transformation in the third embodiment of the data processing method of the neural network processor according to the present invention;
FIG. 7 is a diagram illustrating a first embodiment of a neural network processor;
FIG. 8 is a diagram illustrating a second embodiment of a neural network processor;
fig. 9 is a schematic structural diagram of a third embodiment of the neural network processor of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without any creative efforts shall fall within the protection scope of the embodiments of the present invention.
The embodiment of the invention provides a data processing method of a neural network processor and the neural network processor, which can be applied to scenes that data such as image data, voice data, video data and the like need to be subjected to neural network calculation, and received data is used as input data to be subjected to neural network calculation (single-stage neural network calculation or multi-stage neural network calculation).
The neural network processor provided by the embodiment of the invention may take a physical entity form. For example, in a cloud server application it may be an independent processing chip, and in a terminal (e.g., a mobile phone) application it may be a module in a terminal processor chip. The information input comes from the various sources that need intelligent processing, such as voice, images and natural language, and the data to undergo neural network operation is formed through necessary preprocessing (such as sampling, analog-to-digital conversion and feature extraction). The information output is sent to subsequent processing modules or software, for example as graphics or other understandable representations. In the cloud application form, the processing units at the stages before and after the neural network processor may be assumed by other server operation units; in the terminal application environment, they may be completed by other parts of the terminal's software and hardware (including sensors, interface circuits, and the like).
Fig. 3 is a flowchart of a first embodiment of a data processing method of a neural network processor. As shown in fig. 3, the method of this embodiment may include:
and S101, adding the input data and the corresponding weight absolute value through an adder, wherein the input data is the data output by the previous stage, and the input data and the weight absolute value are n-element vectors.
For a multi-stage neural network, the output of a previous stage serves as the input of a subsequent stage. The weights include weight absolute values and weight sign bits.
S102, sequentially carrying out n times of first nonlinear mapping on the input data and n items of data obtained by adding corresponding weight absolute values.
The first nonlinear mapping may be a power transformation m^M with an arbitrary base m, and may also take the form y = a × m^B, i.e. y is a nonlinear transformation of B. Preferably, the first nonlinear mapping is the power-of-2 transformation 2^M, where M is each item of the input data plus the corresponding weight absolute value. The power-of-2 transformation has a simple mapping relation, so its hardware implementation cost is low.
And S103, accumulating the result after the first nonlinear mapping for n times through an accumulator.
The accumulation operation comprises an addition operation and a subtraction operation controlled by the weight sign bit: if the weight sign bit is negative the accumulator subtracts, and if it is positive the accumulator adds.
And S104, performing second nonlinear mapping on the result obtained after the n times of accumulation operation to obtain a processing result and outputting data, wherein the second nonlinear mapping is formulated according to the rule of the nonlinear mapping of the neural network and the inverse mapping of the first nonlinear mapping.
Here, when the first nonlinear mapping is a power transformation m^M with base m not equal to 2, the power-of-m transformation is converted into a power-of-2 transformation to keep the circuit implementation simple. Specifically, before the input data and the corresponding weight absolute values are added by the adder in S101, the method further includes: multiplying the input data by a scaling factor K1, and/or multiplying the weight absolute values by a scaling factor K2, where K1 and K2 may be equal or unequal; or, after the adder adds the input data and the corresponding weight absolute values, multiplying each of the n items of added data by a scaling factor K3. Here K1, K2 and K3 are nonzero; optionally, K1, K2 and K3 may be of the form 1+1/2^N or 1-1/2^N.
Through this operation, the power-of-m transformation can be implemented in circuit as a power-of-2 transformation, whose simple mapping relation keeps the hardware implementation cost low.
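The conversion rests on the identity m^t = 2^(t·log2(m)): multiplying the exponent by the fixed coefficient k = log2(m) is all that is needed, as this sketch (with illustrative names) shows:

```python
import math

def power_m_via_power_2(t, m):
    """Evaluate m**t using only a power-of-2 unit, by scaling with k = log2(m)."""
    k = math.log2(m)       # fixed scaling coefficient, known at design time
    return 2.0 ** (t * k)  # m**t == 2**(t * log2(m))

# e.g. base m = 10: power_m_via_power_2(3, 10) is 10**3 = 1000 (up to rounding)
```

In hardware the multiplication by k is absorbed into the scaling factors K1/K2/K3 applied before or after the adder, so no general multiplier is needed.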
Further, the input data or the weight absolute value may equal 0, and the weight may also be negative. When the input data or the weight absolute value equals 0, the accumulation operation leaves the current accumulation term unchanged: the accumulator holds its state. Since a large number of zeros occur in practical neural network calculations, both in the input data and in the weights, this simplifies the processing and reduces power consumption.
When the weight sign bit is negative, the accumulation operation is a subtraction, i.e. the accumulator subtracts; in hardware, this subtraction can be implemented as addition of the two's complement (inverting the bits of the operand and adding 1).
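A sketch of sign-bit-controlled accumulation with subtraction realized as two's-complement addition (the bit width and names are illustrative, not from the patent):

```python
def accumulate(acc, term, sign_negative, bits=16):
    """Sign-bit-controlled accumulate: subtraction done as two's-complement addition."""
    mask = (1 << bits) - 1
    if sign_negative:
        term = (~term + 1) & mask  # two's complement: invert the bits and add 1
    return (acc + term) & mask     # plain adder either way

# accumulate(100, 30, True) gives 100 - 30 in 16-bit arithmetic
```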
Alternatively, the first or second non-linear mapping or accumulation operation may be implemented by analog circuitry. The analog nonlinear conversion and the addition can be realized instantaneously without depending on the speed of a digital clock.
It should be noted that the adder, the accumulator and the multiplier mentioned above are discrete and physical circuits, and are not software modules implemented by a general-purpose CPU in the embodiment of the present invention.
The data processing method of the neural network processor provided in this embodiment is to add input data and a corresponding weight absolute value by an adder, where the input data is data output from a previous stage, then sequentially perform n times of first nonlinear mapping on n items of data obtained by adding the input data and the corresponding weight absolute value, perform n times of accumulation operation on a result obtained after the first nonlinear mapping by an accumulator, where the accumulation operation includes addition operation and subtraction operation controlled by a weight sign bit, and finally perform second nonlinear mapping on the result obtained after the n times of accumulation operation to obtain a processing result and output the processing result, where the second nonlinear mapping is formulated according to a rule of nonlinear mapping of the neural network and inverse mapping of the first nonlinear mapping. Therefore, the complex multiplication calculation is converted into the addition calculation, the quantization efficiency is improved, and the storage capacity and the bandwidth can be compressed, so that the storage requirement and the bandwidth requirement of data are reduced, and the calculation efficiency is improved. And the input data is not limited to 0/1 binary quantization, so that the calculation precision meets the requirement of an actual application network, and the method can be applied to a wider range of application targets besides neural network calculation.
The following describes in detail the processing procedure of the data processing method of the neural network processor according to the embodiment of the present invention, with reference to the formula derivation procedure and a specific embodiment.
Specifically, M-th power transformation is adopted to convert complex multiplication calculation into addition calculation, and the detailed calculation formula derivation process is as follows:
Taking the calculation mentioned in the background, of the form y = f(x1×w1 + x2×w2 + … + xn×wn + b), as an example,
let x > 0, and let s(w) be the sign bit of w.
y = f(s(w1)·e^(ln(x1)+ln(|w1|)) + s(w2)·e^(ln(x2)+ln(|w2|)) + … + s(wn)·e^(ln(xn)+ln(|wn|)) + b)
ln(y) = ln(f(s(w1)·e^(ln(x1)+ln(|w1|)) + s(w2)·e^(ln(x2)+ln(|w2|)) + … + s(wn)·e^(ln(xn)+ln(|wn|)) + b))
Assuming u = ln(y), v = ln(x) and c = ln(|w|), this equation is reformulated as:
u = ln(f(s(w1)·e^(v1+c1) + s(w2)·e^(v2+c2) + … + s(wn)·e^(vn+cn) + b))
if the exponential relationship is an exponent of 2 and the logarithmic relationship is a base-2 logarithm, the calculation formula is written as:
y = f(s(w1)·2^(log2(x1)+log2(|w1|)) + s(w2)·2^(log2(x2)+log2(|w2|)) + … + s(wn)·2^(log2(xn)+log2(|wn|)) + b)
log2(y) = log2(f(s(w1)·2^(log2(x1)+log2(|w1|)) + s(w2)·2^(log2(x2)+log2(|w2|)) + … + s(wn)·2^(log2(xn)+log2(|wn|)) + b))
Let u = log2(y), v = log2(x) and c = log2(|w|); this formula is re-expressed as:
u = log2(f(s(w1)·2^(v1+c1) + s(w2)·2^(v2+c2) + … + s(wn)·2^(vn+cn) + b))
Let g(x) = 2^x and ff(x) = log2(f(x)); the calculation formula is abbreviated as:
u = ff(s(w1)·g(v1+c1) + s(w2)·g(v2+c2) + … + s(wn)·g(vn+cn) + b)
Here g(x) corresponds to the first nonlinear mapping and ff(x) to the second nonlinear mapping. Fig. 4 is a calculation block diagram of a second embodiment of the data processing method of the neural network processor of the present invention. As shown in fig. 4, the preceding stage outputs the input data; the input data and the corresponding weight absolute values are n-element vectors. The adder adds them to obtain the n items of data v1+c1, v2+c2, …, vn+cn, on which the first nonlinear mapping (g(x) = 2^x) is performed n times to obtain the mapped results 2^(v1+c1), 2^(v2+c2), …, 2^(vn+cn). The n accumulation operations then yield 2^(v1+c1) + 2^(v2+c2) + … + 2^(vn+cn) + b; if the input data or weight is 0, the current accumulation term remains unchanged, and if the weight sign bit is negative, the accumulator subtracts. Finally, the second nonlinear mapping (ff(x)) is performed on the accumulated result to obtain the processing result, and the data is output.
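A quick numeric check of this rewriting, taking f as a sigmoid for illustration: the log-domain path u = ff(Σ s(wi)·g(vi+ci) + b) and the direct path y = f(Σ xi·wi + b) should describe the same neuron, i.e. u = log2(y):

```python
import math

sigmoid = lambda t: 1.0 / (1.0 + math.exp(-t))
g = lambda t: 2.0 ** t                 # first nonlinear mapping g(x) = 2^x
ff = lambda t: math.log2(sigmoid(t))   # second mapping: log2 of the network's f

x, w, b = [3.0, 1.5], [0.5, -2.0], 0.25
v = [math.log2(xi) for xi in x]
c = [math.log2(abs(wi)) for wi in w]
s = [1 if wi >= 0 else -1 for wi in w]

u = ff(sum(si * g(vi + ci) for si, vi, ci in zip(s, v, c)) + b)
y = sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + b)
# u equals log2(y): both paths compute the same neuron
```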
If the exponential relationship is an exponent with a base m and the logarithmic relationship is a logarithm with a base m, the calculation formula is written as:
y = f(s(w1)·m^(logm(x1)+logm(|w1|)) + s(w2)·m^(logm(x2)+logm(|w2|)) + … + s(wn)·m^(logm(xn)+logm(|wn|)) + b)
logm(y) = logm(f(s(w1)·m^(logm(x1)+logm(|w1|)) + s(w2)·m^(logm(x2)+logm(|w2|)) + … + s(wn)·m^(logm(xn)+logm(|wn|)) + b))
Let u = logm(y), v = logm(x) and c = logm(|w|); this formula is re-expressed as:
u = logm(f(s(w1)·m^(v1+c1) + s(w2)·m^(v2+c2) + … + s(wn)·m^(vn+cn) + b))
u = logm(f(s(w1)·2^((v1+c1)·log2(m)) + s(w2)·2^((v2+c2)·log2(m)) + … + s(wn)·2^((vn+cn)·log2(m)) + b))
Let g(x) = 2^x, ff(x) = logm(f(x)) and k = log2(m); the calculation formula is abbreviated as:
u = ff(s(w1)·g(c1·k+v1·k) + s(w2)·g(c2·k+v2·k) + … + s(wn)·g(cn·k+vn·k) + b)
Here g(x) corresponds to the first nonlinear mapping and ff(x) to the second nonlinear mapping. Fig. 5 is a calculation block diagram of a third embodiment of the data processing method of the neural network processor of the present invention. The k in the above formula is a multiplied scaling coefficient: by multiplying the input data and the corresponding weight absolute values by k before the adder adds them (or multiplying the sums by k afterwards), the power-of-m transformation is converted into a power-of-2 transformation, so the mapping relation is simple, the circuit implementation is simple, and the hardware implementation cost is low. With reference to fig. 5, the method of the present embodiment includes:
s201, data output from the n-1 level data output unit (namely, output of a previous level) serves as input data of n levels.
S202, the adder of level n adds the input data of this level and the corresponding weight absolute values.
S203, the input paths to the adder (input data and weight absolute value), or the result of their addition, are multiplied by the scaling coefficient k, which can be composed of common factors of the form 1+1/2^N or 1-1/2^N.
S204, sequentially carrying out n times of first nonlinear mapping on the input data and n items of data obtained by adding corresponding weight absolute values.
The first nonlinear mapping is, for example, a power-of-2 transformation, which can be designed simply. Fig. 6 is a schematic diagram of the power-of-2 transformation in the third embodiment of the data processing method of the neural network processor of the present invention; as shown in fig. 6, it specifically includes:
s2041, in the added data, the floating point portion of the data is converted into a binary number reflecting the actual size of the floating point weight through a decoder circuit.
S2042, and completing exponential transformation mapping of the data through table lookup or a simplified combination circuit.
S2043, the data after the exponential transformation and the floating point weight are combined through a combination circuit to complete very simple 1-bit to n-bit multiplication to form a final result.
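One way such a power-of-2 unit could look in software, assuming the 4-bit integer / 4-bit fraction split used in the later example (the table contents and scaling are our illustration, not the patent's exact circuit):

```python
# Sketch of the decoder + lookup-table power-of-2 unit (S2041-S2043),
# assuming a 4-bit fractional part Q, as in the Pt/Qt example below.
FRAC_BITS = 4
# 4-bit-in lookup table: entry q approximates 2**(q/16), stored as an 8-bit mantissa
LUT = [round((2.0 ** (q / 16.0)) * 256) for q in range(2 ** FRAC_BITS)]

def pow2_fixed(p, q):
    """Approximate 2**(p + q/16): the LUT supplies the fractional factor,
    and the decoder/shifter contributes the 2**p scaling."""
    return (LUT[q] << p) / 256.0  # shift by the integer part, undo the mantissa scale

# pow2_fixed(3, 8) approximates 2**3.5 (about 11.31)
```

The integer part only selects a shift amount (the 4-line-to-16-line decoder), and the fractional part only indexes a tiny table, so no multiplier is needed beyond the trivial 1-bit by n-bit combination.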
S205, the accumulator performs n accumulation operations on the terms output by the first nonlinear mapping. If the input data or weight is 0, the current accumulation term remains unchanged; if the weight sign bit is negative, the accumulator subtracts.
S206, a second nonlinear mapping is performed on the result of the n accumulation operations. In general, the second nonlinear mapping is formulated according to the rule of the neural network's nonlinear mapping and the inverse of the first nonlinear mapping. A typical case is a cascade of two transformations: the first is the original nonlinear transformation of the neural network, such as Sigmoid or ReLU; the second is the logarithmic transformation that adjusts the data distribution.
And S207, outputting the processing result.
For example, for a calculation of the form y = f(x1×w1 + x2×w2 + … + xn×wn + b), the original data precision of x, w and y requires single-precision floating point (32-bit), or at least 16-bit fixed point; the corresponding data bandwidth scales with those 16 bits, and a 16-bit multiplier is required.
First, x, w and y may be logarithmically transformed, each value a being replaced by logm(a). Experiments show that 8-bit quantization of the transformed data can meet practical requirements, and even 6-bit or 4-bit quantization can be achieved.
The transformed data is denoted as P.Q, where P is the portion before the decimal point, Q is the portion after the decimal point, P has P bits, Q has Q bits, and P + Q is the total number of bits of the data.
In this embodiment, the transformed x, w and y data are denoted Px.Qx, Pw.Qw and Py.Qy, respectively.
Therefore, after the logarithmic transformation the multiplication becomes an addition:

Without a carry from the fractional part: Px.Qx + Pw.Qw = (Px + Pw).(Qx + Qw), or

With a carry: Px.Qx + Pw.Qw = (Px + Pw + 1).(Qx + Qw − 1)

The result is recorded as Pt.Qt.
Pt.Qt is then simply multiplied by the factor log2(m); through this multiplication, the exponentiation m^(Pt.Qt) needed subsequently is converted into 2^(Ps.Qs), where Ps.Qs = Pt.Qt · log2(m).
In most cases, taking logarithms and exponents with base 2 satisfies the application requirements. In special cases, if a suitably adjusted logarithmic relationship better fits the application, it can be adjusted by multiplying by this coefficient; any coefficient of the form 1 + 1/2^N or 1 − 1/2^N can be applied with a single addition. Assuming Pt is 4 bits and Qt is 4 bits, in the subsequent calculation Pt is sent directly to a 4-line-to-16-line decoder, and Qt is sent directly to a 4-bit-to-4-bit look-up table or to a simple combinational circuit reflecting the data rule of the exponential mapping.
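The worked example above can be sketched numerically. The q = 4 fractional bit width and the base m = 2 come from the text; the encode/decode helper names are illustrative:

```python
import math

Q_BITS = 4  # fractional bits of the P.Q code (q = 4 in the example)

def encode(v):
    """Quantize log2(v) into a P.Q fixed-point code held as one integer."""
    return round(math.log2(v) * (1 << Q_BITS))

def decode(code):
    """Exponential mapping back to the linear domain: 2**(code / 2**q)."""
    return 2.0 ** (code / (1 << Q_BITS))

def log_domain_multiply(a, b):
    """x*w becomes one fixed-point addition of the two codes; a carry out
    of the fractional part lands in the integer part automatically,
    because the code is stored as a single integer."""
    return decode(encode(a) + encode(b))
```

The result is exact for powers of two (e.g. 2 × 4 = 8); for other values the 4-bit fractional code gives an approximation within a few percent, consistent with the text's observation that 8-bit and even 4-bit quantization of the transformed data can suffice.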
Fig. 7 is a schematic structural diagram of a first embodiment of a neural network processor of the present invention, and as shown in fig. 7, the neural network processor of the present embodiment may include: an adding circuit 11, a first non-linear mapping circuit 12, an accumulating circuit 13 and a second non-linear mapping circuit 14. The adding circuit 11 is configured to add input data and corresponding weight absolute values, where the input data is data output by a previous stage, and the input data and the weight absolute values are n-ary vectors. The first nonlinear mapping circuit 12 is configured to perform first nonlinear mapping on n items of data obtained by adding the input data and the corresponding weight absolute values n times in sequence. The accumulation circuit 13 is configured to perform n accumulation operations on the first non-linear mapped result, where the accumulation operations include an addition operation and a subtraction operation controlled by the sign bit of the weight. The second nonlinear mapping circuit 14 is configured to perform second nonlinear mapping on the result obtained after the n-time accumulation operations to obtain a processing result and output data, where the second nonlinear mapping is formulated according to a rule of nonlinear mapping of the neural network and an inverse mapping of the first nonlinear mapping.
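Putting the four circuits of fig. 7 together, one neuron of this processor can be sketched as follows. The code representation (P.Q fixed-point with 4 fractional bits, None marking a zero operand) is an assumption for illustration:

```python
import math

def neuron_forward(x_codes, w_codes, w_signs, q_bits=4):
    """Adder -> first nonlinear mapping -> accumulator -> second
    nonlinear mapping (ReLU, then log2 re-encoding), per fig. 7.
    x_codes/w_codes hold P.Q fixed-point codes of |x| and |w|."""
    acc = 0.0
    for xc, wc, sign in zip(x_codes, w_codes, w_signs):
        if xc is None or wc is None:  # zero input or weight: hold
            continue
        term = 2.0 ** ((xc + wc) / (1 << q_bits))  # first mapping
        acc = acc - term if sign < 0 else acc + term
    activated = max(acc, 0.0)         # second mapping, part 1: ReLU
    if activated == 0.0:
        return None                   # zero code for the next stage
    return round(math.log2(activated) * (1 << q_bits))  # part 2: log
```

Note that no multiplier appears anywhere in the loop: each x·w product is produced by one integer addition followed by the exponential mapping.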
The first nonlinear mapping circuit 12 may be an exponential transformation circuit with an arbitrary base M, and may also take a generalized form such as y = a·M^B. Preferably, the first nonlinear mapping circuit 12 is a circuit computing 2 to the power of M, where M is each of the n items of data obtained by adding the input data and the corresponding weight absolute values.
Further, when the first nonlinear mapping circuit 12 is an M-base exponential transformation circuit with M not equal to 2, to keep the circuit implementation simple the M-base exponentiation is converted into a base-2 exponentiation. Specifically, fig. 8 is a schematic structural diagram of a second embodiment of the neural network processor of the present invention. As shown in fig. 8, on the basis of the neural network processor shown in fig. 7, it further comprises: a first multiplying circuit 15 for multiplying the input data by a scaling factor K1 before the adding circuit 11 adds the input data and the corresponding weight absolute value; and/or a second multiplying circuit 16 for multiplying the absolute value of the weight by a scaling factor K2 before the adding circuit 11 adds the input data and the corresponding absolute value of the weight, K1 and K2 being equal or unequal.
Fig. 9 is a schematic structural diagram of a third embodiment of the neural network processor of the present invention. As shown in fig. 9, on the basis of the neural network processor shown in fig. 7, it further comprises: a third multiplying circuit 17 for multiplying each of the n items of added data by a scaling factor K3 after the adding circuit 11 adds the input data and the corresponding weight absolute value.
Wherein K1, K2 and K3 are not equal to 0. Optionally, K1, K2 and K3 may be 1 + 1/2^N or 1 − 1/2^N.
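A coefficient of the form 1 ± 1/2^N needs only a shift and one addition on an integer code, which is why these values are preferred for the scaling factors. This sketch assumes a non-negative integer code:

```python
def scale_shift_add(code, n, plus=True):
    """Multiply an integer code by (1 + 1/2**n) or (1 - 1/2**n)
    using one right shift and one addition or subtraction."""
    delta = code >> n               # code / 2**n, truncated
    return code + delta if plus else code - delta
```

For example, with code 32 and n = 2, the adjusted values are 32 + 8 = 40 and 32 − 8 = 24, i.e. scaling by 1.25 and 0.75 without a multiplier.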
Further, when the input data or the weight absolute value equals 0, the accumulation operation keeps the current accumulation term unchanged (the accumulator is in a hold state); when the sign bit of the weight is negative, the accumulation operation is a subtraction. Since a large number of zeros appear in both the input data and the weights during actual neural network computation, this simplifies processing and reduces power consumption.
The apparatus of this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 3, and the implementation principle thereof is similar, which is not described herein again.
The neural network processor provided in this embodiment adds input data and the corresponding weight absolute values through an addition circuit, the input data being the data output by the previous stage. The first nonlinear mapping circuit then sequentially performs n first nonlinear mappings on the n items of data obtained from the additions; the accumulation circuit performs n accumulation operations on the mapped results, the accumulation operations comprising addition and subtraction controlled by the weight sign bit; finally, the second nonlinear mapping circuit performs a second nonlinear mapping on the accumulated result to obtain the processing result and output data, the second nonlinear mapping being formulated according to the rule of the neural network's nonlinear mapping and the inverse of the first nonlinear mapping. The complex multiplication is thereby converted into addition, the quantization efficiency is improved, and the storage capacity and bandwidth can be compressed, reducing the storage and bandwidth requirements of the data and improving computational efficiency. Moreover, the input data is not limited to 0/1 binary quantization, so the calculation precision meets the requirements of practical application networks, and the method can be applied to a wider range of targets beyond neural network computation.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the embodiments of the present invention, and are not limited thereto; although embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (11)

1. A data processing method of a neural network processor, comprising:
adding input data and corresponding weight absolute values through an adder, wherein the input data are output data of a previous stage, and the input data and the weight absolute values are n-element vectors;
sequentially carrying out n times of first nonlinear mapping on the input data and n items of data obtained by adding corresponding weight absolute values;
performing n times of accumulation operation on the result after the first nonlinear mapping through an accumulator, wherein the accumulation operation comprises addition operation and subtraction operation controlled by weight sign bits;
and performing second nonlinear mapping on the result obtained after the n times of accumulation operation to obtain a processing result and outputting data, wherein the second nonlinear mapping is formulated according to the rule of the nonlinear mapping of the neural network and the inverse mapping of the first nonlinear mapping.
2. The method of claim 1, wherein the first non-linear mapping is a transformation of 2 to the power of M, where M is each of n items of data after adding the input data and the corresponding weight absolute value.
3. The method of claim 1, wherein prior to adding the input data and the corresponding absolute value of the weight by the adder, further comprising:
multiplying the input data by a scaling factor K1; and/or, multiplying the absolute value of the weight by a scaling factor K2, K1 and K2 being equal or unequal; or,
after the input data and the corresponding weight absolute value are added by the adder, the method further comprises the following steps:
multiplying each item in the n items of added data by a scaling factor K3;
wherein K1, K2 and K3 are not equal to 0.
4. The method of claim 3, wherein K1, K2 and K3 are 1 + 1/2^N or 1 − 1/2^N.
5. The method according to any one of claims 1 to 4,
when the input data or the weight absolute value is equal to 0, the accumulation operation is that the current accumulation item is kept unchanged;
when the sign bit of the weight is negative, the accumulation operation is a subtraction operation.
6. The method according to any of claims 1-4, wherein the first non-linear mapping or the second non-linear mapping or the accumulating operation is implemented by analog circuitry.
7. A neural network processor, comprising:
the adding circuit is used for adding input data and corresponding weight absolute values, wherein the input data are output data of a previous stage, and the input data and the weight absolute values are n-element vectors;
the first nonlinear mapping circuit is used for sequentially carrying out n times of first nonlinear mapping on the input data and n items of data obtained by adding corresponding weight absolute values;
the accumulation circuit is used for carrying out accumulation operation on the result after the first nonlinear mapping for n times, and the accumulation operation comprises addition operation and subtraction operation controlled by the sign bit of the weight;
and the second nonlinear mapping circuit is used for carrying out second nonlinear mapping on the result obtained after n times of accumulation operation to obtain a processing result and carrying out data output, and the second nonlinear mapping is formulated according to the rule of nonlinear mapping of the neural network and the inverse mapping of the first nonlinear mapping.
8. The neural network processor of claim 7, wherein the first non-linear mapping circuit is a 2-to-the-Mth-power transformation circuit, where M is each of n items of data after the input data and the corresponding weight absolute value are added.
9. The neural network processor of claim 7, further comprising:
a first multiplying circuit for multiplying the input data by a scaling factor K1 before the adding circuit adds the input data and the corresponding weight absolute value; and/or,
a second multiplying circuit for multiplying the weight absolute value by a scaling factor K2 before the adding circuit adds the input data and the corresponding weight absolute value, K1 and K2 being equal or unequal;
or,
a third multiplying circuit for multiplying each of the n items of added data by a scaling factor K3 after the adding circuit adds the input data and the corresponding weight absolute value;
wherein K1, K2 and K3 are not equal to 0.
10. The neural network processor of claim 9, wherein K1, K2 and K3 are 1 + 1/2^N or 1 − 1/2^N.
11. The neural network processor of any one of claims 7-10, wherein when the input data or the absolute value of the weight is equal to 0, the accumulation operation maintains a current accumulation term unchanged;
when the sign bit of the weight is negative, the accumulation operation is a subtraction operation.
CN201610165618.2A 2016-03-22 2016-03-22 The data processing method and neural network processor of neural network processor Active CN105844330B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610165618.2A CN105844330B (en) 2016-03-22 2016-03-22 The data processing method and neural network processor of neural network processor

Publications (2)

Publication Number Publication Date
CN105844330A CN105844330A (en) 2016-08-10
CN105844330B true CN105844330B (en) 2019-06-28

Family

ID=56587746

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610165618.2A Active CN105844330B (en) 2016-03-22 2016-03-22 The data processing method and neural network processor of neural network processor

Country Status (1)

Country Link
CN (1) CN105844330B (en)

Families Citing this family (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10552732B2 (en) * 2016-08-22 2020-02-04 Kneron Inc. Multi-layer neural network
WO2018113790A1 (en) * 2016-12-23 2018-06-28 北京中科寒武纪科技有限公司 Operation apparatus and method for artificial neural network
WO2018120019A1 (en) * 2016-12-30 2018-07-05 上海寒武纪信息科技有限公司 Compression/decompression apparatus and system for use with neural network data
US11315009B2 (en) * 2017-03-03 2022-04-26 Hewlett Packard Enterprise Development Lp Analog multiplier-accumulators
US11544545B2 (en) 2017-04-04 2023-01-03 Hailo Technologies Ltd. Structured activation based sparsity in an artificial neural network
US11238334B2 (en) 2017-04-04 2022-02-01 Hailo Technologies Ltd. System and method of input alignment for efficient vector operations in an artificial neural network
US10387298B2 (en) 2017-04-04 2019-08-20 Hailo Technologies Ltd Artificial neural network incorporating emphasis and focus techniques
US11551028B2 (en) 2017-04-04 2023-01-10 Hailo Technologies Ltd. Structured weight based sparsity in an artificial neural network
US11615297B2 (en) 2017-04-04 2023-03-28 Hailo Technologies Ltd. Structured weight based sparsity in an artificial neural network compiler
US11551067B2 (en) 2017-04-06 2023-01-10 Shanghai Cambricon Information Technology Co., Ltd Neural network processor and neural network computation method
CN109219821B (en) * 2017-04-06 2023-03-31 上海寒武纪信息科技有限公司 Arithmetic device and method
CN114970827A (en) * 2017-04-21 2022-08-30 上海寒武纪信息科技有限公司 Arithmetic device and method
CN107256424B (en) * 2017-05-08 2020-03-31 中国科学院计算技术研究所 Three-value weight convolution network processing system and method
CN107169563B (en) * 2017-05-08 2018-11-30 中国科学院计算技术研究所 Processing system and method applied to two-value weight convolutional network
EP3637272A4 (en) * 2017-06-26 2020-09-02 Shanghai Cambricon Information Technology Co., Ltd Data sharing system and data sharing method therefor
CN113076981A (en) 2017-06-30 2021-07-06 华为技术有限公司 Data processing method and device
CN109214509B (en) * 2017-07-05 2021-07-06 中国科学院沈阳自动化研究所 A high-speed real-time quantization structure and operation implementation method for deep neural network
CN107992329B (en) 2017-07-20 2021-05-11 上海寒武纪信息科技有限公司 Calculation method and related product
CN107292458B (en) * 2017-08-07 2021-09-10 北京中星微人工智能芯片技术有限公司 Prediction method and prediction device applied to neural network chip
CN109726805B (en) * 2017-10-30 2021-02-09 上海寒武纪信息科技有限公司 Method for designing neural network processor by using black box simulator
CN107832845A (en) * 2017-10-30 2018-03-23 上海寒武纪信息科技有限公司 A kind of information processing method and Related product
TWI657346B (en) * 2018-02-14 2019-04-21 倍加科技股份有限公司 Data reduction and method for establishing data identification model, computer system and computer-readable recording medium
CN111767998B (en) * 2018-02-27 2024-05-14 上海寒武纪信息科技有限公司 Integrated circuit chip device and related products
CN108510065A (en) * 2018-03-30 2018-09-07 中国科学院计算技术研究所 Computing device and computational methods applied to long Memory Neural Networks in short-term
CN110413255B (en) * 2018-04-28 2022-08-19 赛灵思电子科技(北京)有限公司 Artificial neural network adjusting method and device
CN110533174B (en) * 2018-05-24 2023-05-12 华为技术有限公司 Circuit and method for data processing in neural network system
CN109242091B (en) * 2018-09-03 2022-03-22 郑州云海信息技术有限公司 Image recognition method, device, equipment and readable storage medium
CN111105019B (en) * 2018-10-25 2023-11-10 上海登临科技有限公司 Neural network operation device and operation method
CN109376854B (en) * 2018-11-02 2022-08-16 矽魅信息科技(上海)有限公司 Multi-base logarithm quantization device for deep neural network
CN112970037B (en) * 2018-11-06 2024-02-02 创惟科技股份有限公司 Multi-chip system for implementing neural network applications, data processing method suitable for multi-chip system, and non-transitory computer readable medium
US11811421B2 (en) 2020-09-29 2023-11-07 Hailo Technologies Ltd. Weights safety mechanism in an artificial neural network processor
US11263077B1 (en) 2020-09-29 2022-03-01 Hailo Technologies Ltd. Neural network intermediate results safety mechanism in an artificial neural network processor
US11221929B1 (en) 2020-09-29 2022-01-11 Hailo Technologies Ltd. Data stream fault detection mechanism in an artificial neural network processor
US12248367B2 (en) 2020-09-29 2025-03-11 Hailo Technologies Ltd. Software defined redundant allocation safety mechanism in an artificial neural network processor
US11237894B1 (en) 2020-09-29 2022-02-01 Hailo Technologies Ltd. Layer control unit instruction addressing safety mechanism in an artificial neural network processor
CN112153081A (en) * 2020-11-24 2020-12-29 浙江齐安信息科技有限公司 Method for detecting abnormal state of industrial network

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102523453A (en) * 2011-12-29 2012-06-27 西安空间无线电技术研究所 Super large compression method and transmission system for images
CN103019656A (en) * 2012-12-04 2013-04-03 中国科学院半导体研究所 Dynamically reconfigurable multi-stage parallel single instruction multiple data array processing system
CN103501224A (en) * 2013-09-23 2014-01-08 长春理工大学 Asymmetric image encryption and decryption method based on quantum cell neural network system
CN105260773A (en) * 2015-09-18 2016-01-20 华为技术有限公司 Image processing device and image processing method
CN105389592A (en) * 2015-11-13 2016-03-09 华为技术有限公司 Method and apparatus for identifying image


Also Published As

Publication number Publication date
CN105844330A (en) 2016-08-10

Similar Documents

Publication Publication Date Title
CN105844330B (en) The data processing method and neural network processor of neural network processor
US11580719B2 (en) Dynamic quantization for deep neural network inference system and method
CN111758106B (en) Method and system for massively parallel neuro-reasoning computing elements
US10929746B2 (en) Low-power hardware acceleration method and system for convolution neural network computation
CN105260776A (en) Neural network processor and convolutional neural network processor
CN107340993B (en) Arithmetic device and method
CN112508125A (en) Efficient full-integer quantization method of image detection model
WO2022111002A1 (en) Method and apparatus for training neural network, and computer readable storage medium
CN111126557B (en) Neural network quantization, application method, device and computing equipment
WO2021081854A1 (en) Convolution operation circuit and convolution operation method
CN117351299B (en) Image generation and model training method, device, equipment and storage medium
CN112085154A (en) Asymmetric quantization for compression and inference acceleration of neural networks
CN112561050A (en) Neural network model training method and device
CN112558918B (en) Multiply-add operation method and device for neural network
CN111652359A (en) Multiplier Arrays for Matrix Operations and Multiplier Arrays for Convolution Operations
CN113436292B (en) Image processing method, training method, device and equipment of image processing model
CN111492369B (en) Residual quantization of shift weights in artificial neural networks
US10271051B2 (en) Method of coding a real signal into a quantized signal
CN117975211A (en) Image processing method and device based on multi-mode information
CN115760614A (en) Image denoising method and device, electronic equipment and storage medium
CN115952847A (en) Processing method and processing device of neural network model
CN116188875B (en) Image classification method, device, electronic equipment, medium and product
CN112183731A (en) Point cloud-oriented high-efficiency binarization neural network quantization method and device
JP7506276B2 (en) Implementations and methods for processing neural networks in semiconductor hardware - Patents.com
CN110929858A (en) Arithmetic framework system and method for manipulating floating-point to fixed-point arithmetic frameworks

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant