
CN105844330B - Data processing method of neural network processor and neural network processor - Google Patents


Info

Publication number
CN105844330B
CN105844330B (application number CN201610165618.2A)
Authority
CN
China
Prior art keywords
input data
data
neural network
nonlinear mapping
absolute value
Prior art date
Legal status
Active
Application number
CN201610165618.2A
Other languages
Chinese (zh)
Other versions
CN105844330A (en)
Inventor
费旭东
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201610165618.2A priority Critical patent/CN105844330B/en
Publication of CN105844330A publication Critical patent/CN105844330A/en
Application granted granted Critical
Publication of CN105844330B publication Critical patent/CN105844330B/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/06 — Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 — Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means


Abstract

Embodiments of the present invention provide a data processing method for a neural network processor, and a neural network processor. The method comprises: adding input data to the corresponding weight absolute values by an adder, where the input data are the data output by the previous stage and the input data and weight absolute values are n-element vectors; sequentially performing a first nonlinear mapping n times on the n items of data obtained by the addition; performing n accumulation operations on the mapped results by an accumulator, the accumulation operations comprising addition and subtraction controlled by the weight sign bit; and performing a second nonlinear mapping on the accumulated result to obtain the processing result and output the data, the second nonlinear mapping being formulated according to the rule of the neural network's nonlinear mapping and the inverse of the first nonlinear mapping. Quantization efficiency is thereby improved, and the storage and bandwidth requirements of the data are reduced.

Description

Data processing method of neural network processor and neural network processor
Technical Field
The embodiment of the invention relates to the technical field of electronic chips, in particular to a data processing method of a neural network processor and the neural network processor.
Background
Neural networks and deep learning algorithms have been applied with great success and are developing rapidly, and it is widely expected in the industry that these new computing methods will help realize more general and complex intelligent applications. In recent years, neural networks and deep learning algorithms have achieved outstanding results in the field of image recognition, drawing industry attention to the optimization and efficient implementation of such algorithms; companies such as Facebook, Qualcomm, Baidu and Google have all invested in research on neural network optimization algorithms. Qualcomm has announced plans to integrate neural network processing modules into next-generation chips. Improving the processing efficiency of neural network algorithms, the related algorithms themselves, and the efficiency of their chip implementation is therefore a core issue of attention and research.
FIG. 1 is a schematic diagram of an n-level (layer) neural network computational model. The neural network computes one neuron in the form y = f(x1×w1 + x2×w2 + … + xn×wn + b); the calculation proceeds in stages, and the output of each stage is the input of the next. Fig. 2 is a flow chart of the conventional calculation method: the preceding-stage outputs are taken as the data inputs (x1, x2, … xn); x1, x2, … xn are multiplied by the corresponding weight parameters; an accumulator completes the accumulation x1×w1 + x2×w2 + … + xn×wn + b; the nonlinear mapping y = f(accumulated result) then yields the calculation result; and finally the data output is completed.
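As a point of reference, the conventional multiply-accumulate flow of fig. 2 can be sketched as follows (a minimal illustration; the function and variable names are ours, not the patent's):

```python
def neuron_conventional(x, w, b, f):
    """Conventional neuron: multiply each input by its weight, accumulate, apply f."""
    acc = b
    for xi, wi in zip(x, w):
        acc += xi * wi  # one multiplication per input term
    return f(acc)

# Example with a ReLU nonlinearity: 1*0.5 + 2*(-0.25) + 0.1 = 0.1
relu = lambda t: max(0.0, t)
y = neuron_conventional([1.0, 2.0], [0.5, -0.25], 0.1, relu)
```

Each of the n terms costs a full multiplication, which is what the method below avoids.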
It can be seen that, in the above data processing method, the multiplications involved have relatively high computational complexity; under a given accuracy requirement, the corresponding data storage and bandwidth requirements are therefore also high, and the calculation efficiency is low.
Disclosure of Invention
The embodiment of the invention provides a data processing method of a neural network processor and the neural network processor, and aims to solve the problems of high data storage requirement and bandwidth requirement and low calculation efficiency in the existing processing method.
In a first aspect, an embodiment of the present invention provides a data processing method for a neural network processor, where the method includes: firstly, adding input data and corresponding weight absolute values by an adder, wherein the input data is output data of a previous stage, and the input data and the weight absolute values are n-element vectors. And then sequentially carrying out n times of first nonlinear mapping on the input data and the n items of data obtained by adding the corresponding weight absolute values. And then the result after the first nonlinear mapping is subjected to n times of accumulation operations through an accumulator, wherein the accumulation operations comprise addition operations and subtraction operations controlled by the sign bit of the weight. And finally, performing second nonlinear mapping on the result obtained after the n times of accumulation operation to obtain a processing result and outputting data, wherein the second nonlinear mapping is formulated according to the rule of the nonlinear mapping of the neural network and the inverse mapping of the first nonlinear mapping. Therefore, the complex multiplication calculation is converted into the addition calculation, the quantization efficiency is improved, and the storage capacity and the bandwidth can be compressed, so that the storage requirement and the bandwidth requirement of data are reduced, and the calculation efficiency is improved. And the input data is not limited to 0/1 binary quantization, so that the calculation precision meets the requirement of an actual application network, and the method can be applied to a wider range of application targets besides neural network calculation.
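The four steps of the first aspect can be sketched in software as follows (an illustrative model only, with hypothetical names; the patent implements these steps as discrete circuits, and here g(x) = 2^x is assumed as the first mapping and log2 of f as the second):

```python
import math

def neuron_log_domain(v, c, s, b, f):
    """Multiplication-free neuron, following the method's four steps (sketch).
    v[i] = log2(x[i]) and c[i] = log2(|w[i]|): data arrive already in log form.
    s[i] = +1/-1 is the weight sign bit; b is the bias; f is the network's nonlinearity.
    """
    acc = b
    for vi, ci, si in zip(v, c, s):
        t = vi + ci            # step 1: the adder replaces the multiplication
        m = 2.0 ** t           # step 2: first nonlinear mapping g(x) = 2^x
        acc += si * m          # step 3: sign-controlled add/subtract accumulation
    return math.log2(f(acc))   # step 4: second mapping ff = log2(f(.)), so the
                               # output is again in log form for the next stage

# Check against the ordinary computation y = f(x1*w1 + x2*w2 + b):
x, w, b = [4.0, 2.0], [0.5, -0.25], 1.0
v = [math.log2(xi) for xi in x]
c = [math.log2(abs(wi)) for wi in w]
s = [1 if wi >= 0 else -1 for wi in w]
f = lambda t: t  # identity nonlinearity for the check
u = neuron_log_domain(v, c, s, b, f)
# 2**u should equal 4*0.5 + 2*(-0.25) + 1 = 2.5
```

The inner loop contains only an addition, a power-of-2 mapping, and a signed accumulate, which matches the adder / first-mapping / accumulator pipeline described above.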
In one possible design, the first nonlinear mapping is a power-of-2 transformation 2^M, where M is each item of the n items of data obtained by adding the input data to the corresponding weight absolute values. The power-of-2 transformation has a simple mapping relation, so its hardware implementation cost is low.
In one possible design, when the first nonlinear mapping is a power transformation m^M with base m not equal to 2, the power-of-m transformation is converted into a power-of-2 transformation to simplify the circuit implementation. Before the adder adds the input data and the corresponding weight absolute values, the method further includes: multiplying the input data by a scaling factor K1, and/or multiplying the weight absolute values by a scaling factor K2, where K1 and K2 may be equal or unequal; or, after the adder adds the input data and the corresponding weight absolute values, the method further includes: multiplying each of the n items of added data by a scaling factor K3. Here K1, K2 and K3 are all nonzero.
In one possible design, K1, K2 and K3 are of the form 1+1/2^N or 1-1/2^N.
In one possible design, when the input data or the weight absolute value equals 0, the accumulation operation leaves the current accumulation term unchanged, i.e. the accumulator holds its state; since a large number of zeros occur in practical neural network calculations, both in the input data and in the weights, this simplifies the processing and reduces power consumption. When the weight sign bit is negative, the accumulation operation is a subtraction.
In one possible design, the first or second non-linear mapping or accumulation operation is implemented by analog circuitry. The analog nonlinear conversion and the addition can be realized instantaneously without depending on the speed of a digital clock.
In a second aspect, an embodiment of the present invention provides a neural network processor, including: an addition circuit for adding input data and the corresponding weight absolute values, where the input data are the data output by the previous stage and the input data and weight absolute values are n-element vectors; a first nonlinear mapping circuit for sequentially performing a first nonlinear mapping n times on the n items of data obtained by the addition; an accumulation circuit for performing n accumulation operations on the mapped result, the accumulation operations including addition and subtraction controlled by the weight sign bit; and a second nonlinear mapping circuit for performing a second nonlinear mapping on the result of the n accumulation operations to obtain the processing result and output the data, the second nonlinear mapping being formulated according to the rule of the neural network's nonlinear mapping and the inverse of the first nonlinear mapping.
In one possible design, the first nonlinear mapping circuit is a power-of-2 transformation circuit computing 2^M, where M is each item of the n items of data obtained by adding the input data and the corresponding weight absolute values.
In one possible design, the processor further includes: a first multiplier circuit for multiplying the input data by a scaling factor K1 before the addition circuit adds the input data and the corresponding weight absolute values; and/or a second multiplier circuit for multiplying the weight absolute values by a scaling factor K2 before the addition circuit adds the input data and the corresponding weight absolute values, where K1 and K2 may be equal or unequal; or, a third multiplier circuit for multiplying each of the n items of added data by a scaling factor K3 after the addition circuit adds the input data and the corresponding weight absolute values. Here K1, K2 and K3 are all nonzero.
In one possible design, K1, K2 and K3 are of the form 1+1/2^N or 1-1/2^N.
In one possible design, when the input data or the weight absolute value equals 0, the accumulation operation leaves the current accumulation term unchanged; when the weight sign bit is negative, the accumulation operation is a subtraction.
The beneficial effects of the neural network processor provided in the second aspect and in each possible design of the second aspect may refer to the beneficial effects brought by the first aspect and each possible design of the first aspect, and are not described herein again.
The data processing method of the neural network processor and the neural network processor provided by the embodiment of the invention add input data and corresponding weight absolute values through an adder, the input data is data output by a previous stage, then n items of data obtained by adding the input data and the corresponding weight absolute values are sequentially subjected to n times of first nonlinear mapping, the result obtained after the first nonlinear mapping is subjected to n times of accumulation operation through an accumulator, the accumulation operation comprises addition operation and subtraction operation controlled by weight sign bits, finally the result obtained after the n times of accumulation operation is subjected to second nonlinear mapping to obtain a processing result and output the data, and the second nonlinear mapping is formulated according to the rule of the neural network nonlinear mapping and the inverse mapping of the first nonlinear mapping. Therefore, the complex multiplication calculation is converted into the addition calculation, the quantization efficiency is improved, and the storage capacity and the bandwidth can be compressed, so that the storage requirement and the bandwidth requirement of data are reduced, and the calculation efficiency is improved. And the input data is not limited to 0/1 binary quantization, so that the calculation precision meets the requirement of an actual application network, and the method can be applied to a wider range of application targets besides neural network calculation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic diagram of an n-level (layer) neural network computational model;
FIG. 2 is a flow chart of a conventional computing method;
FIG. 3 is a flowchart of a first embodiment of a data processing method of a neural network processor according to the present invention;
FIG. 4 is a block diagram of a second embodiment of a data processing method of a neural network processor according to the present invention;
FIG. 5 is a block diagram of a data processing method of a neural network processor according to a third embodiment of the present invention;
FIG. 6 is a schematic diagram of the power-of-2 transformation in the third embodiment of the data processing method of the neural network processor according to the present invention;
FIG. 7 is a diagram illustrating a first embodiment of a neural network processor;
FIG. 8 is a diagram illustrating a second embodiment of a neural network processor;
fig. 9 is a schematic structural diagram of a third embodiment of the neural network processor of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without any creative efforts shall fall within the protection scope of the embodiments of the present invention.
The embodiment of the invention provides a data processing method of a neural network processor and the neural network processor, which can be applied to scenes that data such as image data, voice data, video data and the like need to be subjected to neural network calculation, and received data is used as input data to be subjected to neural network calculation (single-stage neural network calculation or multi-stage neural network calculation).
The neural network processor provided by the embodiment of the invention may take a physical entity form. For example, in a cloud server application it may be an independent processing chip, and in a terminal (e.g., a mobile phone) application it may be a module in a terminal processor chip. The information input comes from the various sources that need intelligent processing, such as voice, images and natural language, and the data to undergo neural network operation is formed through necessary preprocessing (such as sampling, analog-to-digital conversion and feature extraction). The information output is sent to subsequent processing modules or software, for example as graphics or other understandable representations. In the cloud application form, the processing units at the stages before and after the neural network processor may be assumed by other server operation units; in the terminal application environment, they may be completed by other parts of the terminal's software and hardware (including sensors, interface circuits, and the like).
Fig. 3 is a flowchart of a first embodiment of a data processing method of a neural network processor. As shown in fig. 3, the method of this embodiment may include:
and S101, adding the input data and the corresponding weight absolute value through an adder, wherein the input data is the data output by the previous stage, and the input data and the weight absolute value are n-element vectors.
For a multi-stage neural network, the output of a previous stage serves as the input of a subsequent stage. The weights include weight absolute values and weight sign bits.
S102, sequentially carrying out n times of first nonlinear mapping on the input data and n items of data obtained by adding corresponding weight absolute values.
The first nonlinear mapping may be a power transformation m^M with an arbitrary base m, and may also take the form y = a × m^B, i.e. y is a nonlinear transformation of B. Preferably, the first nonlinear mapping is the power-of-2 transformation 2^M, where M is each item of the input data plus the corresponding weight absolute value. The power-of-2 transformation has a simple mapping relation, so its hardware implementation cost is low.
And S103, accumulating the result after the first nonlinear mapping for n times through an accumulator.
The accumulation operation comprises an addition operation and a subtraction operation controlled by the weight sign bit: if the weight sign bit is negative the accumulator subtracts, and if it is positive the accumulator adds.
And S104, performing second nonlinear mapping on the result obtained after the n times of accumulation operation to obtain a processing result and outputting data, wherein the second nonlinear mapping is formulated according to the rule of the nonlinear mapping of the neural network and the inverse mapping of the first nonlinear mapping.
Here, when the first nonlinear mapping is a power transformation m^M with base m not equal to 2, the power-of-m transformation is converted into a power-of-2 transformation to keep the circuit implementation simple. Specifically, before the input data and the corresponding weight absolute values are added by the adder in S101, the method further includes: multiplying the input data by a scaling factor K1, and/or multiplying the weight absolute values by a scaling factor K2, where K1 and K2 may be equal or unequal; or, after the adder adds the input data and the corresponding weight absolute values, multiplying each of the n items of added data by a scaling factor K3. Here K1, K2 and K3 are nonzero; optionally, K1, K2 and K3 may be of the form 1+1/2^N or 1-1/2^N.
Through this operation, the power-of-m transformation can be implemented in circuit as a power-of-2 transformation, whose simple mapping relation keeps the hardware implementation cost low.
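The conversion rests on the identity m^t = 2^(t·log2(m)): multiplying the exponent by the fixed coefficient k = log2(m) is all that is needed, as this sketch (with illustrative names) shows:

```python
import math

def power_m_via_power_2(t, m):
    """Evaluate m**t using only a power-of-2 unit, by scaling with k = log2(m)."""
    k = math.log2(m)       # fixed scaling coefficient, known at design time
    return 2.0 ** (t * k)  # m**t == 2**(t * log2(m))

# e.g. base m = 10: power_m_via_power_2(3, 10) is 10**3 = 1000 (up to rounding)
```

In hardware the multiplication by k is absorbed into the scaling factors K1/K2/K3 applied before or after the adder, so no general multiplier is needed.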
Further, the input data or the weight absolute value may equal 0, and the weight may also be negative. When the input data or the weight absolute value equals 0, the accumulation operation leaves the current accumulation term unchanged: the accumulator holds its state. Since a large number of zeros occur in practical neural network calculations, both in the input data and in the weights, this simplifies the processing and reduces power consumption.
When the weight sign bit is negative, the accumulation operation is a subtraction, i.e. the accumulator subtracts; in hardware, this subtraction can be implemented as addition of the two's complement (inverting the bits of the operand and adding 1).
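A sketch of sign-bit-controlled accumulation with subtraction realized as two's-complement addition (the bit width and names are illustrative, not from the patent):

```python
def accumulate(acc, term, sign_negative, bits=16):
    """Sign-bit-controlled accumulate: subtraction done as two's-complement addition."""
    mask = (1 << bits) - 1
    if sign_negative:
        term = (~term + 1) & mask  # two's complement: invert the bits and add 1
    return (acc + term) & mask     # plain adder either way

# accumulate(100, 30, True) gives 100 - 30 in 16-bit arithmetic
```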
Alternatively, the first or second non-linear mapping or accumulation operation may be implemented by analog circuitry. The analog nonlinear conversion and the addition can be realized instantaneously without depending on the speed of a digital clock.
It should be noted that the adder, the accumulator and the multiplier mentioned above are discrete and physical circuits, and are not software modules implemented by a general-purpose CPU in the embodiment of the present invention.
The data processing method of the neural network processor provided in this embodiment is to add input data and a corresponding weight absolute value by an adder, where the input data is data output from a previous stage, then sequentially perform n times of first nonlinear mapping on n items of data obtained by adding the input data and the corresponding weight absolute value, perform n times of accumulation operation on a result obtained after the first nonlinear mapping by an accumulator, where the accumulation operation includes addition operation and subtraction operation controlled by a weight sign bit, and finally perform second nonlinear mapping on the result obtained after the n times of accumulation operation to obtain a processing result and output the processing result, where the second nonlinear mapping is formulated according to a rule of nonlinear mapping of the neural network and inverse mapping of the first nonlinear mapping. Therefore, the complex multiplication calculation is converted into the addition calculation, the quantization efficiency is improved, and the storage capacity and the bandwidth can be compressed, so that the storage requirement and the bandwidth requirement of data are reduced, and the calculation efficiency is improved. And the input data is not limited to 0/1 binary quantization, so that the calculation precision meets the requirement of an actual application network, and the method can be applied to a wider range of application targets besides neural network calculation.
The following describes in detail the processing procedure of the data processing method of the neural network processor according to the embodiment of the present invention, with reference to the formula derivation procedure and a specific embodiment.
Specifically, M-th power transformation is adopted to convert complex multiplication calculation into addition calculation, and the detailed calculation formula derivation process is as follows:
Taking the calculation mentioned in the background, of the form y = f(x1×w1 + x2×w2 + … + xn×wn + b), as an example,
let x > 0, and let s(w) be the sign bit of w.
y = f(s(w1)·e^(ln(x1)+ln(|w1|)) + s(w2)·e^(ln(x2)+ln(|w2|)) + … + s(wn)·e^(ln(xn)+ln(|wn|)) + b)
ln(y) = ln(f(s(w1)·e^(ln(x1)+ln(|w1|)) + s(w2)·e^(ln(x2)+ln(|w2|)) + … + s(wn)·e^(ln(xn)+ln(|wn|)) + b))
Assuming u = ln(y), v = ln(x) and c = ln(|w|), this equation is reformulated as:
u = ln(f(s(w1)·e^(v1+c1) + s(w2)·e^(v2+c2) + … + s(wn)·e^(vn+cn) + b))
if the exponential relationship is an exponent of 2 and the logarithmic relationship is a base-2 logarithm, the calculation formula is written as:
y = f(s(w1)·2^(log2(x1)+log2(|w1|)) + s(w2)·2^(log2(x2)+log2(|w2|)) + … + s(wn)·2^(log2(xn)+log2(|wn|)) + b)
log2(y) = log2(f(s(w1)·2^(log2(x1)+log2(|w1|)) + s(w2)·2^(log2(x2)+log2(|w2|)) + … + s(wn)·2^(log2(xn)+log2(|wn|)) + b))
Let u = log2(y), v = log2(x) and c = log2(|w|); this formula is re-expressed as:
u = log2(f(s(w1)·2^(v1+c1) + s(w2)·2^(v2+c2) + … + s(wn)·2^(vn+cn) + b))
Let g(x) = 2^x and ff(x) = log2(f(x)); the calculation formula is abbreviated as:
u = ff(s(w1)·g(v1+c1) + s(w2)·g(v2+c2) + … + s(wn)·g(vn+cn) + b)
Here g(x) corresponds to the first nonlinear mapping and ff(x) to the second nonlinear mapping. Fig. 4 is a calculation block diagram of a second embodiment of the data processing method of the neural network processor of the present invention. As shown in fig. 4, the preceding stage outputs the input data; the input data and the corresponding weight absolute values are n-element vectors. The adder adds them to obtain the n items of data v1+c1, v2+c2, …, vn+cn, on which the first nonlinear mapping (g(x) = 2^x) is performed n times to obtain the mapped results 2^(v1+c1), 2^(v2+c2), …, 2^(vn+cn). The n accumulation operations then yield 2^(v1+c1) + 2^(v2+c2) + … + 2^(vn+cn) + b; if the input data or weight is 0, the current accumulation term remains unchanged, and if the weight sign bit is negative, the accumulator subtracts. Finally, the second nonlinear mapping (ff(x)) is performed on the accumulated result to obtain the processing result, and the data is output.
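A quick numeric check of this rewriting, taking f as a sigmoid for illustration: the log-domain path u = ff(Σ s(wi)·g(vi+ci) + b) and the direct path y = f(Σ xi·wi + b) should describe the same neuron, i.e. u = log2(y):

```python
import math

sigmoid = lambda t: 1.0 / (1.0 + math.exp(-t))
g = lambda t: 2.0 ** t                 # first nonlinear mapping g(x) = 2^x
ff = lambda t: math.log2(sigmoid(t))   # second mapping: log2 of the network's f

x, w, b = [3.0, 1.5], [0.5, -2.0], 0.25
v = [math.log2(xi) for xi in x]
c = [math.log2(abs(wi)) for wi in w]
s = [1 if wi >= 0 else -1 for wi in w]

u = ff(sum(si * g(vi + ci) for si, vi, ci in zip(s, v, c)) + b)
y = sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + b)
# u equals log2(y): both paths compute the same neuron
```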
If the exponential relationship is an exponent with a base m and the logarithmic relationship is a logarithm with a base m, the calculation formula is written as:
y = f(s(w1)·m^(logm(x1)+logm(|w1|)) + s(w2)·m^(logm(x2)+logm(|w2|)) + … + s(wn)·m^(logm(xn)+logm(|wn|)) + b)
logm(y) = logm(f(s(w1)·m^(logm(x1)+logm(|w1|)) + s(w2)·m^(logm(x2)+logm(|w2|)) + … + s(wn)·m^(logm(xn)+logm(|wn|)) + b))
Let u = logm(y), v = logm(x) and c = logm(|w|); this formula is re-expressed as:
u = logm(f(s(w1)·m^(v1+c1) + s(w2)·m^(v2+c2) + … + s(wn)·m^(vn+cn) + b))
u = logm(f(s(w1)·2^((v1+c1)·log2(m)) + s(w2)·2^((v2+c2)·log2(m)) + … + s(wn)·2^((vn+cn)·log2(m)) + b))
Let g(x) = 2^x, ff(x) = logm(f(x)) and k = log2(m); the calculation formula is abbreviated as:
u = ff(s(w1)·g(c1·k+v1·k) + s(w2)·g(c2·k+v2·k) + … + s(wn)·g(cn·k+vn·k) + b)
Here g(x) corresponds to the first nonlinear mapping and ff(x) to the second nonlinear mapping. Fig. 5 is a calculation block diagram of a third embodiment of the data processing method of the neural network processor of the present invention. The k in the above formula is a multiplied scaling coefficient: by multiplying the input data and the corresponding weight absolute values by k before the adder adds them (or multiplying the sums by k afterwards), the power-of-m transformation is converted into a power-of-2 transformation, so the mapping relation is simple, the circuit implementation is simple, and the hardware implementation cost is low. With reference to fig. 5, the method of the present embodiment includes:
s201, data output from the n-1 level data output unit (namely, output of a previous level) serves as input data of n levels.
S202, the adder of level n adds the input data of this level and the corresponding weight absolute values.
S203, the input paths to the adder (input data and weight absolute value), or the result of their addition, are multiplied by the scaling coefficient k, which can be composed of common factors of the form 1+1/2^N or 1-1/2^N.
S204, sequentially carrying out n times of first nonlinear mapping on the input data and n items of data obtained by adding corresponding weight absolute values.
The first nonlinear mapping is, for example, a power-of-2 transformation, which can be designed simply. Fig. 6 is a schematic diagram of the power-of-2 transformation in the third embodiment of the data processing method of the neural network processor of the present invention; as shown in fig. 6, it specifically includes:
s2041, in the added data, the floating point portion of the data is converted into a binary number reflecting the actual size of the floating point weight through a decoder circuit.
S2042, and completing exponential transformation mapping of the data through table lookup or a simplified combination circuit.
S2043, the data after the exponential transformation and the floating point weight are combined through a combination circuit to complete very simple 1-bit to n-bit multiplication to form a final result.
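One way such a power-of-2 unit could look in software, assuming the 4-bit integer / 4-bit fraction split used in the later example (the table contents and scaling are our illustration, not the patent's exact circuit):

```python
# Sketch of the decoder + lookup-table power-of-2 unit (S2041-S2043),
# assuming a 4-bit fractional part Q, as in the Pt/Qt example below.
FRAC_BITS = 4
# 4-bit-in lookup table: entry q approximates 2**(q/16), stored as an 8-bit mantissa
LUT = [round((2.0 ** (q / 16.0)) * 256) for q in range(2 ** FRAC_BITS)]

def pow2_fixed(p, q):
    """Approximate 2**(p + q/16): the LUT supplies the fractional factor,
    and the decoder/shifter contributes the 2**p scaling."""
    return (LUT[q] << p) / 256.0  # shift by the integer part, undo the mantissa scale

# pow2_fixed(3, 8) approximates 2**3.5 (about 11.31)
```

The integer part only selects a shift amount (the 4-line-to-16-line decoder), and the fractional part only indexes a tiny table, so no multiplier is needed beyond the trivial 1-bit by n-bit combination.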
S205, the accumulator performs n accumulation operations on the terms output by the first nonlinear mapping. If the input data or weight is 0, the current accumulation term remains unchanged; if the weight sign bit is negative, the accumulator subtracts.
S206, a second nonlinear mapping is performed on the result of the n accumulation operations. In general, the second nonlinear mapping is formulated according to the rule of the neural network's nonlinear mapping and the inverse of the first nonlinear mapping. A typical case is a cascade of two transformations: the first is the original nonlinear transformation of the neural network, such as Sigmoid or ReLU; the second is the logarithmic transformation that adjusts the data distribution.
And S207, outputting the processing result.
For example, for a calculation of the form y = f(x1×w1 + x2×w2 + … + xn×wn + b), the original data precision of x, w and y requires single-precision floating point (32-bit), or at least 16-bit fixed point; the corresponding data bandwidth scales with those 16 bits, and a 16-bit multiplier is required.
First, x, w and y may be logarithmically transformed, each value a being replaced by logm(a). Experiments show that 8-bit quantization of the transformed data can meet practical requirements, and even 6-bit or 4-bit quantization can be achieved.
The transformed data is denoted as P.Q, where P is the portion before the decimal point, Q is the portion after the decimal point, P has P bits, Q has Q bits, and P + Q is the total number of bits of the data.
In this embodiment, the transformed x, w and y data are denoted Px.Qx, Pw.Qw and Py.Qy, respectively.
Therefore, after the logarithmic transformation the multiplication becomes an addition:

Without a carry from the fractional part: Px.Qx + Pw.Qw = (Px + Pw).(Qx + Qw), or

With a carry: Px.Qx + Pw.Qw = (Px + Pw + 1).(Qx + Qw − 1)

The result is recorded as Pt.Qt.
Pt.Qt is then simply multiplied by the factor log2(m); through this multiplication, the exponentiation m^(Pt.Qt) needed subsequently is converted into 2^(Ps.Qs), where Ps.Qs = Pt.Qt · log2(m).
In most cases, taking logarithms and exponents with base 2 satisfies the application requirements. In special cases, if a suitably adjusted logarithmic relationship better fits the application, it can be adjusted by multiplying by this coefficient; any coefficient of the form 1 + 1/2^N or 1 − 1/2^N can be applied with a single addition. Assuming Pt is 4 bits and Qt is 4 bits, in the subsequent calculation Pt is sent directly to a 4-line-to-16-line decoder, and Qt is sent directly to a 4-bit-to-4-bit look-up table or to a simple combinational circuit reflecting the data rule of the exponential mapping.
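The worked example above can be sketched numerically. The q = 4 fractional bit width and the base m = 2 come from the text; the encode/decode helper names are illustrative:

```python
import math

Q_BITS = 4  # fractional bits of the P.Q code (q = 4 in the example)

def encode(v):
    """Quantize log2(v) into a P.Q fixed-point code held as one integer."""
    return round(math.log2(v) * (1 << Q_BITS))

def decode(code):
    """Exponential mapping back to the linear domain: 2**(code / 2**q)."""
    return 2.0 ** (code / (1 << Q_BITS))

def log_domain_multiply(a, b):
    """x*w becomes one fixed-point addition of the two codes; a carry out
    of the fractional part lands in the integer part automatically,
    because the code is stored as a single integer."""
    return decode(encode(a) + encode(b))
```

The result is exact for powers of two (e.g. 2 × 4 = 8); for other values the 4-bit fractional code gives an approximation within a few percent, consistent with the text's observation that 8-bit and even 4-bit quantization of the transformed data can suffice.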
Fig. 7 is a schematic structural diagram of a first embodiment of a neural network processor of the present invention, and as shown in fig. 7, the neural network processor of the present embodiment may include: an adding circuit 11, a first non-linear mapping circuit 12, an accumulating circuit 13 and a second non-linear mapping circuit 14. The adding circuit 11 is configured to add input data and corresponding weight absolute values, where the input data is data output by a previous stage, and the input data and the weight absolute values are n-ary vectors. The first nonlinear mapping circuit 12 is configured to perform first nonlinear mapping on n items of data obtained by adding the input data and the corresponding weight absolute values n times in sequence. The accumulation circuit 13 is configured to perform n accumulation operations on the first non-linear mapped result, where the accumulation operations include an addition operation and a subtraction operation controlled by the sign bit of the weight. The second nonlinear mapping circuit 14 is configured to perform second nonlinear mapping on the result obtained after the n-time accumulation operations to obtain a processing result and output data, where the second nonlinear mapping is formulated according to a rule of nonlinear mapping of the neural network and an inverse mapping of the first nonlinear mapping.
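Putting the four circuits of fig. 7 together, one neuron of this processor can be sketched as follows. The code representation (P.Q fixed-point with 4 fractional bits, None marking a zero operand) is an assumption for illustration:

```python
import math

def neuron_forward(x_codes, w_codes, w_signs, q_bits=4):
    """Adder -> first nonlinear mapping -> accumulator -> second
    nonlinear mapping (ReLU, then log2 re-encoding), per fig. 7.
    x_codes/w_codes hold P.Q fixed-point codes of |x| and |w|."""
    acc = 0.0
    for xc, wc, sign in zip(x_codes, w_codes, w_signs):
        if xc is None or wc is None:  # zero input or weight: hold
            continue
        term = 2.0 ** ((xc + wc) / (1 << q_bits))  # first mapping
        acc = acc - term if sign < 0 else acc + term
    activated = max(acc, 0.0)         # second mapping, part 1: ReLU
    if activated == 0.0:
        return None                   # zero code for the next stage
    return round(math.log2(activated) * (1 << q_bits))  # part 2: log
```

Note that no multiplier appears anywhere in the loop: each x·w product is produced by one integer addition followed by the exponential mapping.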
The first nonlinear mapping circuit 12 may be an exponential transformation circuit with an arbitrary base M, and may also take a generalized form such as y = a·M^B. Preferably, the first nonlinear mapping circuit 12 is a circuit computing 2 to the power of M, where M is each of the n items of data obtained by adding the input data and the corresponding weight absolute values.
Further, when the first nonlinear mapping circuit 12 is an M-base exponential transformation circuit with M not equal to 2, to keep the circuit implementation simple the M-base exponentiation is converted into a base-2 exponentiation. Specifically, fig. 8 is a schematic structural diagram of a second embodiment of the neural network processor of the present invention. As shown in fig. 8, on the basis of the neural network processor shown in fig. 7, it further comprises: a first multiplying circuit 15 for multiplying the input data by a scaling factor K1 before the adding circuit 11 adds the input data and the corresponding weight absolute value; and/or a second multiplying circuit 16 for multiplying the absolute value of the weight by a scaling factor K2 before the adding circuit 11 adds the input data and the corresponding absolute value of the weight, K1 and K2 being equal or unequal.
Fig. 9 is a schematic structural diagram of a third embodiment of the neural network processor of the present invention. As shown in fig. 9, on the basis of the neural network processor shown in fig. 7, it further comprises: a third multiplying circuit 17 for multiplying each of the n items of added data by a scaling factor K3 after the adding circuit 11 adds the input data and the corresponding weight absolute value.
Wherein K1, K2 and K3 are not equal to 0. Optionally, K1, K2 and K3 may be 1 + 1/2^N or 1 − 1/2^N.
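A coefficient of the form 1 ± 1/2^N needs only a shift and one addition on an integer code, which is why these values are preferred for the scaling factors. This sketch assumes a non-negative integer code:

```python
def scale_shift_add(code, n, plus=True):
    """Multiply an integer code by (1 + 1/2**n) or (1 - 1/2**n)
    using one right shift and one addition or subtraction."""
    delta = code >> n               # code / 2**n, truncated
    return code + delta if plus else code - delta
```

For example, with code 32 and n = 2, the adjusted values are 32 + 8 = 40 and 32 − 8 = 24, i.e. scaling by 1.25 and 0.75 without a multiplier.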
Further, when the input data or the weight absolute value equals 0, the accumulation operation keeps the current accumulation term unchanged (the accumulator is in a hold state); when the sign bit of the weight is negative, the accumulation operation is a subtraction. Since a large number of zeros appear in both the input data and the weights during actual neural network computation, this simplifies processing and reduces power consumption.
The apparatus of this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 3, and the implementation principle thereof is similar, which is not described herein again.
The neural network processor provided in this embodiment adds input data and the corresponding weight absolute values through an addition circuit, the input data being the data output by the previous stage. The first nonlinear mapping circuit then sequentially performs n first nonlinear mappings on the n items of data obtained from the additions; the accumulation circuit performs n accumulation operations on the mapped results, the accumulation operations comprising addition and subtraction controlled by the weight sign bit; finally, the second nonlinear mapping circuit performs a second nonlinear mapping on the accumulated result to obtain the processing result and output data, the second nonlinear mapping being formulated according to the rule of the neural network's nonlinear mapping and the inverse of the first nonlinear mapping. The complex multiplication is thereby converted into addition, the quantization efficiency is improved, and the storage capacity and bandwidth can be compressed, reducing the storage and bandwidth requirements of the data and improving computational efficiency. Moreover, the input data is not limited to 0/1 binary quantization, so the calculation precision meets the requirements of practical application networks, and the method can be applied to a wider range of targets beyond neural network computation.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the embodiments of the present invention, and are not limited thereto; although embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (11)

1. A data processing method of a neural network processor, comprising:
adding input data and corresponding weight absolute values through an adder, wherein the input data are output data of a previous stage, and the input data and the weight absolute values are n-element vectors;
sequentially carrying out n times of first nonlinear mapping on the input data and n items of data obtained by adding corresponding weight absolute values;
performing n times of accumulation operation on the result after the first nonlinear mapping through an accumulator, wherein the accumulation operation comprises addition operation and subtraction operation controlled by weight sign bits;
and performing second nonlinear mapping on the result obtained after the n times of accumulation operation to obtain a processing result and outputting data, wherein the second nonlinear mapping is formulated according to the rule of the nonlinear mapping of the neural network and the inverse mapping of the first nonlinear mapping.
2. The method of claim 1, wherein the first non-linear mapping is a transformation of 2 to the power of M, where M is each of n items of data after adding the input data and the corresponding weight absolute value.
3. The method of claim 1, wherein prior to adding the input data and the corresponding absolute value of the weight by the adder, further comprising:
multiplying the input data by a scaling factor K1; and/or, multiplying the absolute value of the weight by a scaling factor K2, K1 and K2 being equal or unequal; or,
after the input data and the corresponding weight absolute value are added by the adder, the method further comprises the following steps:
multiplying each item in the n items of added data by a scaling factor K3;
wherein K1, K2 and K3 are not equal to 0.
4. The method of claim 3, wherein K1, K2 and K3 are 1 + 1/2^N or 1 − 1/2^N.
5. The method according to any one of claims 1 to 4,
when the input data or the weight absolute value is equal to 0, the accumulation operation is that the current accumulation item is kept unchanged;
when the sign bit of the weight is negative, the accumulation operation is a subtraction operation.
6. The method according to any of claims 1-4, wherein the first non-linear mapping or the second non-linear mapping or the accumulating operation is implemented by analog circuitry.
7. A neural network processor, comprising:
the adding circuit is used for adding input data and corresponding weight absolute values, wherein the input data are output data of a previous stage, and the input data and the weight absolute values are n-element vectors;
the first nonlinear mapping circuit is used for sequentially carrying out n times of first nonlinear mapping on the input data and n items of data obtained by adding corresponding weight absolute values;
the accumulation circuit is used for carrying out accumulation operation on the result after the first nonlinear mapping for n times, and the accumulation operation comprises addition operation and subtraction operation controlled by the sign bit of the weight;
and the second nonlinear mapping circuit is used for carrying out second nonlinear mapping on the result obtained after n times of accumulation operation to obtain a processing result and carrying out data output, and the second nonlinear mapping is formulated according to the rule of nonlinear mapping of the neural network and the inverse mapping of the first nonlinear mapping.
8. The neural network processor of claim 7, wherein the first non-linear mapping circuit is a 2-to-the-Mth-power transformation circuit, where M is each of n items of data after the input data and the corresponding weight absolute value are added.
9. The neural network processor of claim 7, further comprising:
a first multiplying circuit for multiplying the input data by a scaling factor K1 before the adding circuit adds the input data and the corresponding weight absolute value; and/or,
a second multiplying circuit for multiplying the weight absolute value by a scaling factor K2 before the adding circuit adds the input data and the corresponding weight absolute value, K1 and K2 being equal or unequal;
or,
a third multiplying circuit for multiplying each of the n items of added data by a scaling factor K3 after the adding circuit adds the input data and the corresponding weight absolute value;
wherein K1, K2 and K3 are not equal to 0.
10. The neural network processor of claim 9, wherein K1, K2 and K3 are 1 + 1/2^N or 1 − 1/2^N.
11. The neural network processor of any one of claims 7-10, wherein when the input data or the absolute value of the weight is equal to 0, the accumulation operation maintains a current accumulation term unchanged;
when the sign bit of the weight is negative, the accumulation operation is a subtraction operation.
CN201610165618.2A 2016-03-22 2016-03-22 The data processing method and neural network processor of neural network processor Active CN105844330B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610165618.2A CN105844330B (en) 2016-03-22 2016-03-22 The data processing method and neural network processor of neural network processor

Publications (2)

Publication Number Publication Date
CN105844330A CN105844330A (en) 2016-08-10
CN105844330B true CN105844330B (en) 2019-06-28

Family

ID=56587746

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610165618.2A Active CN105844330B (en) 2016-03-22 2016-03-22 The data processing method and neural network processor of neural network processor

Country Status (1)

Country Link
CN (1) CN105844330B (en)

Families Citing this family (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10552732B2 (en) * 2016-08-22 2020-02-04 Kneron Inc. Multi-layer neural network
WO2018113790A1 (en) * 2016-12-23 2018-06-28 北京中科寒武纪科技有限公司 Operation apparatus and method for artificial neural network
WO2018120019A1 (en) * 2016-12-30 2018-07-05 上海寒武纪信息科技有限公司 Compression/decompression apparatus and system for use with neural network data
US11315009B2 (en) * 2017-03-03 2022-04-26 Hewlett Packard Enterprise Development Lp Analog multiplier-accumulators
US11544545B2 (en) 2017-04-04 2023-01-03 Hailo Technologies Ltd. Structured activation based sparsity in an artificial neural network
US11238334B2 (en) 2017-04-04 2022-02-01 Hailo Technologies Ltd. System and method of input alignment for efficient vector operations in an artificial neural network
US10387298B2 (en) 2017-04-04 2019-08-20 Hailo Technologies Ltd Artificial neural network incorporating emphasis and focus techniques
US11551028B2 (en) 2017-04-04 2023-01-10 Hailo Technologies Ltd. Structured weight based sparsity in an artificial neural network
US11615297B2 (en) 2017-04-04 2023-03-28 Hailo Technologies Ltd. Structured weight based sparsity in an artificial neural network compiler
US11551067B2 (en) 2017-04-06 2023-01-10 Shanghai Cambricon Information Technology Co., Ltd Neural network processor and neural network computation method
CN109219821B (en) * 2017-04-06 2023-03-31 上海寒武纪信息科技有限公司 Arithmetic device and method
CN114970827A (en) * 2017-04-21 2022-08-30 上海寒武纪信息科技有限公司 Arithmetic device and method
CN107256424B (en) * 2017-05-08 2020-03-31 中国科学院计算技术研究所 Three-value weight convolution network processing system and method
CN107169563B (en) * 2017-05-08 2018-11-30 中国科学院计算技术研究所 Processing system and method applied to two-value weight convolutional network
EP3637272A4 (en) * 2017-06-26 2020-09-02 Shanghai Cambricon Information Technology Co., Ltd Data sharing system and data sharing method therefor
CN113076981A (en) 2017-06-30 2021-07-06 华为技术有限公司 Data processing method and device
CN109214509B (en) * 2017-07-05 2021-07-06 中国科学院沈阳自动化研究所 A high-speed real-time quantization structure and operation implementation method for deep neural network
CN107992329B (en) 2017-07-20 2021-05-11 上海寒武纪信息科技有限公司 Calculation method and related product
CN107292458B (en) * 2017-08-07 2021-09-10 北京中星微人工智能芯片技术有限公司 Prediction method and prediction device applied to neural network chip
CN109726805B (en) * 2017-10-30 2021-02-09 上海寒武纪信息科技有限公司 Method for designing neural network processor by using black box simulator
CN107832845A (en) * 2017-10-30 2018-03-23 上海寒武纪信息科技有限公司 A kind of information processing method and Related product
TWI657346B (en) * 2018-02-14 2019-04-21 倍加科技股份有限公司 Data reduction and method for establishing data identification model, computer system and computer-readable recording medium
CN111767998B (en) * 2018-02-27 2024-05-14 上海寒武纪信息科技有限公司 Integrated circuit chip device and related products
CN108510065A (en) * 2018-03-30 2018-09-07 中国科学院计算技术研究所 Computing device and computational methods applied to long Memory Neural Networks in short-term
CN110413255B (en) * 2018-04-28 2022-08-19 赛灵思电子科技(北京)有限公司 Artificial neural network adjusting method and device
CN110533174B (en) * 2018-05-24 2023-05-12 华为技术有限公司 Circuit and method for data processing in neural network system
CN109242091B (en) * 2018-09-03 2022-03-22 郑州云海信息技术有限公司 Image recognition method, device, equipment and readable storage medium
CN111105019B (en) * 2018-10-25 2023-11-10 上海登临科技有限公司 Neural network operation device and operation method
CN109376854B (en) * 2018-11-02 2022-08-16 矽魅信息科技(上海)有限公司 Multi-base logarithm quantization device for deep neural network
CN112970037B (en) * 2018-11-06 2024-02-02 创惟科技股份有限公司 Multi-chip system for implementing neural network applications, data processing method suitable for multi-chip system, and non-transitory computer readable medium
US11811421B2 (en) 2020-09-29 2023-11-07 Hailo Technologies Ltd. Weights safety mechanism in an artificial neural network processor
US11263077B1 (en) 2020-09-29 2022-03-01 Hailo Technologies Ltd. Neural network intermediate results safety mechanism in an artificial neural network processor
US11221929B1 (en) 2020-09-29 2022-01-11 Hailo Technologies Ltd. Data stream fault detection mechanism in an artificial neural network processor
US12248367B2 (en) 2020-09-29 2025-03-11 Hailo Technologies Ltd. Software defined redundant allocation safety mechanism in an artificial neural network processor
US11237894B1 (en) 2020-09-29 2022-02-01 Hailo Technologies Ltd. Layer control unit instruction addressing safety mechanism in an artificial neural network processor
CN112153081A (en) * 2020-11-24 2020-12-29 浙江齐安信息科技有限公司 Method for detecting abnormal state of industrial network

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102523453A (en) * 2011-12-29 2012-06-27 西安空间无线电技术研究所 Super large compression method and transmission system for images
CN103019656A (en) * 2012-12-04 2013-04-03 中国科学院半导体研究所 Dynamically reconfigurable multi-stage parallel single instruction multiple data array processing system
CN103501224A (en) * 2013-09-23 2014-01-08 长春理工大学 Asymmetric image encryption and decryption method based on quantum cell neural network system
CN105260773A (en) * 2015-09-18 2016-01-20 华为技术有限公司 Image processing device and image processing method
CN105389592A (en) * 2015-11-13 2016-03-09 华为技术有限公司 Method and apparatus for identifying image


Also Published As

Publication number Publication date
CN105844330A (en) 2016-08-10

Similar Documents

Publication Publication Date Title
CN105844330B (en) The data processing method and neural network processor of neural network processor
US11580719B2 (en) Dynamic quantization for deep neural network inference system and method
CN111758106B (en) Method and system for massively parallel neuro-reasoning computing elements
US10929746B2 (en) Low-power hardware acceleration method and system for convolution neural network computation
CN105260776A (en) Neural network processor and convolutional neural network processor
CN107340993B (en) Arithmetic device and method
CN112508125A (en) Efficient full-integer quantization method of image detection model
WO2022111002A1 (en) Method and apparatus for training neural network, and computer readable storage medium
CN111126557B (en) Neural network quantization, application method, device and computing equipment
WO2021081854A1 (en) Convolution operation circuit and convolution operation method
CN117351299B (en) Image generation and model training method, device, equipment and storage medium
CN112085154A (en) Asymmetric quantization for compression and inference acceleration of neural networks
CN112561050A (en) Neural network model training method and device
CN112558918B (en) Multiply-add operation method and device for neural network
CN111652359A (en) Multiplier Arrays for Matrix Operations and Multiplier Arrays for Convolution Operations
CN113436292B (en) Image processing method, training method, device and equipment of image processing model
CN111492369B (en) Residual quantization of shift weights in artificial neural networks
US10271051B2 (en) Method of coding a real signal into a quantized signal
CN117975211A (en) Image processing method and device based on multi-mode information
CN115760614A (en) Image denoising method and device, electronic equipment and storage medium
CN115952847A (en) Processing method and processing device of neural network model
CN116188875B (en) Image classification method, device, electronic equipment, medium and product
CN112183731A (en) Point cloud-oriented high-efficiency binarization neural network quantization method and device
JP7506276B2 (en) Implementations and methods for processing neural networks in semiconductor hardware - Patents.com
CN110929858A (en) Arithmetic framework system and method for manipulating floating-point to fixed-point arithmetic frameworks

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant