
CN113312021B - Approximate hybrid divider circuit based on array and logarithmic divider - Google Patents

Approximate hybrid divider circuit based on array and logarithmic divider

Info

Publication number
CN113312021B
Authority
CN
China
Prior art keywords
divider
bits
logarithmic
array
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010126403.6A
Other languages
Chinese (zh)
Other versions
CN113312021A (en
Inventor
徐涛
刘伟强
王成华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN202010126403.6A
Publication of CN113312021A
Application granted
Publication of CN113312021B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52: Multiplying; Dividing
    • G06F7/535: Dividing only

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The present invention provides an approximate hybrid divider circuit based on an array divider and a logarithmic divider. An improved array divider module ensures the required accuracy, and a logarithmic divider module improves hardware performance. The circuit truncates the operands and introduces the concept of approximation depth: the operand bits are split into fields of configurable width and assigned to the array divider and the logarithmic divider, so the circuit can be configured for different accuracy and hardware-resource requirements. The user selects the truncation (that is, the approximation depth) that best fits the application, meeting the requirement while keeping other unnecessary cost as low as possible. Compared with previously proposed approximate array dividers, the circuit uses fewer hardware resources and greatly reduces unit cost, with an accuracy loss in the range of 10^-3 to 10^-4. Compared with a logarithmic divider, its accuracy is greatly improved.

Description

Approximate hybrid divider circuit based on array and logarithmic divider

Technical field:

The invention relates to the field of approximate circuit design, and in particular to an approximate hybrid divider circuit based on an array divider and a logarithmic divider.

Background technology:

As demand for higher computing speed in digital systems keeps growing, mobile and non-mobile devices alike face requirements for low power consumption and high energy efficiency. A new source of computing efficiency is therefore urgently needed to design more efficient (faster, lower-power) computing platforms.

Approximate computing has attracted wide attention as a low-power design method. It exploits the tolerance of applications to inexact results: the core principle is to compute efficiently by producing results that are good enough, or of sufficient quality. Designers can use this error tolerance to add a new dimension to design optimization, in which computational accuracy is traded for lower power consumption and design complexity. Exact computing requires higher power and hardware complexity, whereas approximate computing improves system performance and reduces hardware complexity at the cost of accuracy, and the resulting energy savings are significant.

Existing exact dividers have many disadvantages, such as a long critical path, large delay, high power consumption and large area. Some applications (such as image processing and computer vision) do not need exact results and tolerate a certain amount of output error. For these, a new high-speed, low-power, small-area approximate divider is needed to replace the exact divider and improve overall system performance.

A logarithmic operation converts a complex division into a single simple subtraction. Compared with a conventional exact array divider, it greatly shortens the critical path, reduces delay, consumes far fewer hardware resources and substantially lowers power consumption. Its disadvantage is poor accuracy, and the error grows for operands with large bit widths. To combine the advantages of both, an approximate hybrid divider circuit structure is designed. It is faster, smaller and lower in power than an exact divider, and it preserves accuracy better than a purely logarithmic divider.

Summary of the invention:

Purpose of the invention: To solve the above technical problems, the present invention proposes an approximate hybrid divider circuit based on an array divider and a logarithmic divider, which has a small area, high speed and low power consumption.

Technical solution: To achieve the above technical effects, the technical solution provided by the present invention is as follows.

An approximate hybrid divider circuit based on an array divider and a logarithmic divider, characterized in that it comprises an improved leading-bit detection technique, a truncation module that controls the output accuracy and hardware metrics of the circuit, an optimized exact array divider structure, and a logarithmic divider module that uses the improved leading-bit detection technique. The divider takes a 16-bit dividend and an 8-bit divisor and produces a 16-bit quotient. The design introduces the concept of approximation depth, denoted h, which is chosen according to the user's accuracy requirement and ranges from 8 to 16. The upper (16-h) bits of the quotient are generated by the optimized exact array divider module; the lower h bits of the quotient are generated by the inexact logarithmic divider module. In detail:

The approximation depth h is the control input of the truncation module. Its value determines how the operand bits are allocated, so the circuit can be configured for different accuracy and hardware-resource requirements; choosing a suitable approximation depth gives a good trade-off between computational accuracy and hardware performance. The dividend is truncated according to the selected value of h: its upper (16-h) bits are assigned to the exact array divider module, and its lower h bits are assigned to the logarithmic divider.
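
As an illustrative sketch (not part of the patent text), the bit allocation performed by the truncation module can be modelled in a few lines of Python. The function name `split_dividend` and the range checks are assumptions made for this example; only the 16-bit dividend, the range of h, and the upper/lower split come from the description above.

```python
from typing import Tuple


def split_dividend(x: int, h: int) -> Tuple[int, int]:
    """Split a 16-bit dividend into its upper (16-h) bits and lower h bits.

    Hypothetical helper: models only the bit allocation, not the hardware.
    """
    if not 0 <= x < (1 << 16):
        raise ValueError("dividend must fit in 16 bits")
    if not 8 <= h <= 16:
        raise ValueError("approximation depth h must be in the range 8..16")
    upper = x >> h               # upper (16-h) bits, for the exact array divider
    lower = x & ((1 << h) - 1)   # lower h bits, for the logarithmic divider
    return upper, lower


if __name__ == "__main__":
    # With h = 14, only the top 2 bits go to the exact array divider.
    print(split_dividend(0xABCD, 14))  # (2, 11213)
```

A larger h sends fewer bits to the exact array and more to the logarithmic path, which is the accuracy-versus-cost knob the text describes.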

The exact array divider module is built from multiple exact array divider units, each consisting of a one-bit full subtractor and a data selector (multiplexer). The quotient bit of each row is determined by the sign of the partial remainder obtained from that row's subtraction, and the quotient bit is fed back to the preceding stage to control whether the new partial remainder is passed on to the computation of the next quotient bit. Each row produces one quotient bit, and the rows operate in sequence, finally yielding the upper (16-h) quotient bits and an 8-bit final remainder.
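
The row-by-row behaviour can be illustrated with a short behavioural model. This is a sketch of textbook restoring division, assumed here to match the array's function; it is not a gate-level description of the patented units, and the name `restoring_divide` and its `n_bits` parameter are hypothetical.

```python
from typing import Tuple


def restoring_divide(dividend: int, divisor: int, n_bits: int) -> Tuple[int, int]:
    """Behavioural model of a restoring array divider with n_bits rows.

    Each loop iteration stands for one row: a trial subtraction followed by
    a multiplexer that keeps the difference or restores the old value.
    """
    if divisor <= 0:
        raise ZeroDivisionError("divisor must be positive")
    if not 0 <= dividend < (1 << n_bits):
        raise ValueError("dividend does not fit in n_bits")
    remainder = 0
    quotient = 0
    for i in range(n_bits - 1, -1, -1):
        # Bring the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        diff = remainder - divisor            # row of full subtractors
        q_bit = 1 if diff >= 0 else 0         # sign decides the quotient bit
        remainder = diff if q_bit else remainder  # data selector (restore on 0)
        quotient = (quotient << 1) | q_bit
    return quotient, remainder


if __name__ == "__main__":
    print(restoring_divide(200, 7, 8))  # (28, 4)
```

Because every row either subtracts or restores, the model returns the same pair as integer division with remainder.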

The 8-bit final remainder produced by the exact array divider is concatenated with the lower h operand bits that the truncation module assigned to the logarithmic divider, and the result is used as the dividend of the logarithmic divider. The logarithmic divider module first performs leading-bit detection on the operands to find the position of the most significant "1", then converts the operands from binary to logarithmic form. In the logarithmic domain a division becomes a subtraction, and the difference is the logarithm of the quotient. The difference is converted from logarithmic form back to binary, and a simple shift based on the detected leading-"1" positions yields the lower h quotient bits. Finally, the upper (16-h) quotient bits from the exact array divider are concatenated with the lower h quotient bits from the logarithmic divider to give the final quotient.
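
The reason the two partial quotients can simply be concatenated follows from a short identity. The sketch below is not quoted from the patent; it uses the symbols that the design method introduces further on (X the 16-bit dividend, Y the divisor, X_1 and X_2 the upper and lower parts of X, Q_1 and R_1 the exact quotient and remainder, T the dividend of the logarithmic divider).

```latex
\begin{aligned}
X &= X_1 \cdot 2^{h} + X_2, \qquad X_1 = Q_1 Y + R_1, \quad 0 \le R_1 < Y \\
\frac{X}{Y} &= Q_1 \cdot 2^{h} + \frac{R_1 \cdot 2^{h} + X_2}{Y}
             = Q_1 \cdot 2^{h} + \frac{T}{Y}, \qquad T = R_1 \cdot 2^{h} + X_2
\end{aligned}
```

Since R_1 < Y and X_2 < 2^h, it follows that T < Y · 2^h, so the exact value of T/Y is below 2^h and belongs entirely in the lower h quotient bits.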

The present invention also proposes a design method for an approximate hybrid divider based on a restoring array divider and a logarithmic divider, characterized by comprising the steps of:

(1) Construct the approximate hybrid divider;

(2) Let the dividend be X, with a bit width of 16 bits, and the divisor be Y, with a bit width of 8 bits. The dividend is truncated according to the approximation depth h: the upper (16-h) bits are assigned to the exact array divider and the lower h bits to the logarithmic divider. The dividend of the exact array divider module is denoted $X_1$, with a bit width of (16-h) bits; the final remainder it produces is denoted $R_1$, with a bit width of 8 bits; and the upper quotient it produces is denoted $Q_1$, with a bit width of (16-h) bits. The part of the dividend assigned to the logarithmic divider is denoted $X_2$, with a bit width of h bits, and the lower quotient it produces is denoted $Q_2$, with a bit width of h bits. Concatenating $Q_1$ and $Q_2$ gives the final quotient Q, with a bit width of 16 bits.

(3) In the exact array divider, a (16-h)-bit operand requires (16-h) row units. Each row unit subtracts the divisor from the dividend, that is $X_1 - Y$, proceeding from the most significant bit to the least significant bit. Each row unit produces one upper quotient bit, which is fed back to that row unit and controls the output of its partial remainder through the data selector. If the quotient bit of the row is "1", the data selector passes the partial remainder to the next row; if the quotient bit is "0", the data selector passes $X_1$ (the value before subtraction) to the next row. The upper quotient produced by the exact array divider is $Q_1 = \lfloor X_1 / Y \rfloor$, with $X_1 = Q_1 Y + R_1$ and $0 \le R_1 < Y$.

(4) Let $T = R_1 \times 2^{h} + X_2$, with a bit width of (8+h) bits. T is the dividend input of the logarithmic divider, and the divisor is still Y. First, leading-bit detection is performed on the operands to find the position of the most significant "1", denoted $k_1$ for T and $k_2$ for Y. The operands are then converted from binary to logarithmic form. Each operand can first be written as $T = 2^{k_1}(1+m_1)$ and $Y = 2^{k_2}(1+m_2)$, where $k_1$ is 4 bits wide, $k_2$ is 3 bits wide, and $m_1$ and $m_2$ are the mantissa parts, in the range [0, 1). The lower quotient produced by the logarithmic divider is therefore $Q_2 = T / Y = 2^{k_1-k_2}(1+m_1)/(1+m_2)$. Taking the base-2 logarithm gives $\log_2 Q_2 = k_1 - k_2 + \log_2(1+m_1) - \log_2(1+m_2)$. For $0 \le m < 1$, $\log_2(1+m) \approx m$; substituting this into the previous expression gives the approximation $\log_2 Q_2 \approx k_1 - k_2 + m_1 - m_2$. The expression shows that a simple subtraction is enough to obtain the approximate logarithm of the lower quotient. Converting this logarithm back to binary with the same approximation gives the lower quotient $Q_2 \approx 2^{k_1-k_2}(1 + m_1 - m_2)$. Finally, $Q_1$ and $Q_2$ are concatenated to give the final quotient Q.
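
Steps (2) to (4) can be put together in a small behavioural model. This is a sketch under stated assumptions, not the patented circuit: `log_divide` and `hybrid_divide` are hypothetical names, the exact part uses Python's `divmod` instead of an array of subtractor units, the logarithmic part uses floating point for readability, a negative `m1 - m2` is handled by taking the floor of the logarithm (a borrow from the integer part, as in standard Mitchell-style arithmetic), and the clamp of the lower quotient to h bits is an added safeguard that the text does not mention.

```python
import math


def log_divide(t: int, y: int) -> int:
    """Approximate t // y using log2(1 + m) ~= m in both directions."""
    if t == 0:
        return 0
    k1, k2 = t.bit_length() - 1, y.bit_length() - 1   # leading-one positions
    m1 = t / (1 << k1) - 1.0                          # mantissas in [0, 1)
    m2 = y / (1 << k2) - 1.0
    log_q = (k1 - k2) + (m1 - m2)                     # division becomes subtraction
    c = math.floor(log_q)                             # integer part (handles borrow)
    return int((1.0 + (log_q - c)) * 2.0 ** c)        # antilog: (1 + fraction) << c


def hybrid_divide(x: int, y: int, h: int) -> int:
    """Behavioural model: exact upper (16-h) quotient bits, approximate lower h bits."""
    if not (0 <= x < (1 << 16) and 0 < y < (1 << 8) and 8 <= h <= 16):
        raise ValueError("expected 16-bit x, non-zero 8-bit y, and h in 8..16")
    x1, x2 = x >> h, x & ((1 << h) - 1)       # truncation module
    q1, r1 = divmod(x1, y)                    # exact array divider (behavioural)
    t = (r1 << h) | x2                        # T = R1 * 2^h + X2
    q2 = min(log_divide(t, y), (1 << h) - 1)  # clamp to h bits (assumption)
    return (q1 << h) | q2                     # concatenate Q1 and Q2


if __name__ == "__main__":
    for x, y in [(40000, 8), (40000, 7), (65535, 255)]:
        print(x, y, hybrid_divide(x, y, 14), x // y)
```

In this model the approximate result is never below the exact integer quotient and overshoots the true ratio by at most 12.5 percent; the accuracy figures quoted in the abstract refer to the patented circuit, not to this sketch.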

Further, in the design method of the approximate hybrid divider based on the restoring array divider and the logarithmic divider, the exact array divider module differs from a conventional array divider. For a 16/8 divider, each row does not need a fixed 8 exact array divider units; redundant units can be removed depending on the approximation depth, which reduces hardware resource consumption and further lowers power consumption. The specific implementation steps are as follows:

(1) Perform leading-bit detection on the divisor Y to obtain the position k of its most significant "1";

(2) From the selected approximation depth, the dividend bit width assigned to the exact array divider module is (16-h) bits, and a (16-h)/(16-h+1) array structure is used to compute the exact upper quotient;

(3) Based on the leading-bit position k of the divisor from step (1) and the array structure from step (2): if k > 16-h+1, the (16-h+1) bits of the divisor Y starting from the most significant bit are taken as the divisor; if k < 16-h+1, the (16-h+1) bits of the divisor Y starting from the least significant bit are taken as the divisor.

The advantage of this method is that it effectively removes redundant hardware, further reducing power consumption and improving system performance, while the result remains exact, with no error.

Further, the leading-bit detection algorithm used in the design method of the approximate hybrid divider based on the restoring array divider and the logarithmic divider differs from a conventional priority encoder. The leading-bit detection steps in the design method are as follows:

(1) The data to be examined is 8 bits wide (binary) and is defined as A = a7a6a5a4a3a2a1a0. The output of leading-bit detection on 8 bits is at most 3 bits wide and is defined as B = b2b1b0;

(2) Compare a7a6a5a4 with 0000. If they are equal, b2 = 0; otherwise, b2 = 1;

(3) Based on the result of step (2): if b2 = 0, compare a3a2 with 00; if b2 = 1, compare a7a6 with 00. If the compared values are equal, b1 = 0; otherwise, b1 = 1;

(4) Based on the results of steps (2) and (3): if b2b1 = 00, compare a1 with 0; if b2b1 = 01, compare a3 with 0; if b2b1 = 10, compare a5 with 0; if b2b1 = 11, compare a7 with 0. If the compared values are equal, b0 = 0; otherwise, b0 = 1;

(5) The results of steps (2), (3) and (4) together give the position of the leading "1".
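
The five steps above amount to a binary search over the byte. The following sketch follows them literally; the function name `leading_one_position` is hypothetical, and the behaviour for an all-zero input (which returns 0, the same code as for A = 1) is an observation about this model rather than something the text specifies.

```python
def leading_one_position(a: int) -> int:
    """Return B = b2b1b0, the index of the most significant 1 in an 8-bit value."""
    if not 0 <= a < 256:
        raise ValueError("input must fit in 8 bits")
    # Step (2): is any of a7..a4 set?
    b2 = 0 if (a >> 4) == 0 else 1
    # Step (3): look at a7a6 when b2 = 1, otherwise at a3a2.
    pair = (a >> 6) & 0b11 if b2 else (a >> 2) & 0b11
    b1 = 0 if pair == 0 else 1
    # Step (4): b2b1 selects which single bit to test (a1, a3, a5 or a7).
    probe = {0b00: 1, 0b01: 3, 0b10: 5, 0b11: 7}[(b2 << 1) | b1]
    b0 = (a >> probe) & 1
    # Step (5): assemble the 3-bit position.
    return (b2 << 2) | (b1 << 1) | b0


if __name__ == "__main__":
    print(leading_one_position(0b00101101))  # 5
```

Each step halves the remaining search window, so three comparisons are enough for 8 bits.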

Description of the drawings:

FIG. 1 is the overall hardware implementation diagram of the approximate hybrid divider based on a restoring array divider and a logarithmic divider;

FIG. 2 is a circuit diagram of an exact array divider unit;

FIG. 3 is a circuit structure diagram of an 8/4 exact array divider;

FIG. 4 is a structure diagram of the operation of the logarithmic divider;

FIG. 5 is a schematic diagram of the improved array structure in the design when h = 14;

FIG. 6 is a hardware implementation diagram of the 16/8-bit approximate hybrid divider with h = 14;

FIG. 7 is a data flow diagram of the 8/4-bit approximate hybrid divider with h = 6.

Specific implementation method:

The following takes h = 14 as an example, that is, a 16/8-bit approximate hybrid divider in which the upper 2 bits of the quotient are computed exactly and the lower 14 bits are approximate. The technical solution of the present invention is described in further detail with reference to the accompanying drawings:

FIG. 6 shows the hardware implementation of the approximate hybrid divider based on the restoring array divider and the logarithmic divider of the present invention when h = 14. It comprises an optimized exact array divider module, a leading-bit detection module, a truncation module that controls the output accuracy and hardware metrics of the circuit, a binary-to-logarithm module, a subtraction unit module and a logarithm-to-binary module.

The optimized exact array divider module uses only 6 exact array divider units, whereas a conventional exact divider needs 16; the units marked × in FIG. 6 are the redundant exact array divider units. First, leading-bit detection is performed on the divisor to obtain the position k2 of its most significant bit. Because the upper 2 bits of the dividend are assigned to the exact array divider, a 2/3-bit exact array is enough to obtain the exact result. The value of k2 is compared with 3: if it is greater than 3, the 3 bits of the divisor starting from the most significant bit are selected as the divisor; if it is less than 3, the 3 bits of the divisor starting from the least significant bit are selected as the divisor. The array produces a 2-bit exact upper quotient and a 3-bit remainder; because the dividend is 2 bits wide, only the last two bits of the remainder are significant and the first bit is 0.

The 2 significant remainder bits are then concatenated with the 14-bit part of the dividend assigned to the logarithmic divider, and the result is used as the dividend of the logarithmic divider. Leading-bit detection is performed on this dividend to obtain the position k1 of its most significant bit, and both operands are converted from binary to logarithmic form to obtain m1 and m2. Next, k1 is concatenated with m1 and k2 with m2, with k1 and k2 as the integer parts and m1 and m2 as the fractional parts, and the subtraction (k1 + m1) - (k2 + m2) is performed to obtain the difference. 1 is then added to the fractional part of the difference (forming 1 + m1 - m2), and a simple shift by k1 - k2 bits is applied: a left shift if k1 - k2 > 0, and a right shift if k1 - k2 < 0. The result is the 14-bit approximate lower quotient.
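
The datapath just described (concatenate k and m, subtract, add 1, shift) can be mimicked with integers only. This is a sketch under assumptions, not the circuit of FIG. 6: the names `to_log` and `log_divide_fixed` and the 24-bit fraction width `FRAC_BITS` are chosen for this example so that no mantissa bits are lost for operands below 2^24. When m1 < m2 the fixed-point subtraction borrows from the integer part, so the effective shift becomes k1 - k2 - 1; the text does not spell out this case.

```python
FRAC_BITS = 24  # assumed fraction width of the k.m fixed-point word


def to_log(v: int) -> int:
    """Binary-to-log conversion: pack k (integer part) and m (fraction) as k.m."""
    if not 0 < v < (1 << FRAC_BITS):
        raise ValueError("operand out of range for this sketch")
    k = v.bit_length() - 1                  # leading-one position
    mantissa = v - (1 << k)                 # bits below the leading one
    return (k << FRAC_BITS) | (mantissa << (FRAC_BITS - k))


def log_divide_fixed(t: int, y: int) -> int:
    """Approximate t // y with one subtraction and one shift."""
    if t == 0:
        return 0
    diff = to_log(t) - to_log(y)            # (k1 + m1) - (k2 + m2)
    c = diff >> FRAC_BITS                   # integer part (floor, may be negative)
    f = diff & ((1 << FRAC_BITS) - 1)       # fractional part in [0, 1)
    one_plus_f = (1 << FRAC_BITS) | f       # "add 1" to the fraction
    shift = c - FRAC_BITS                   # log-to-binary: scale by 2^c
    return one_plus_f << shift if shift >= 0 else one_plus_f >> -shift


if __name__ == "__main__":
    print(log_divide_fixed(40000, 8))   # 5000 (exact: the divisor is a power of two)
    print(log_divide_fixed(100, 7))     # 14
```

Keeping k and m in one word means a single subtractor handles both the exponent difference and the mantissa difference, which is why the text can describe the whole division as one subtraction followed by a shift.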

The 2-bit exact upper quotient produced by the exact array divider is concatenated with the 14-bit approximate lower quotient produced by the logarithmic divider to give the final quotient.

The present invention also gives an application to an 8/4 approximate hybrid divider. FIG. 7 shows the data flow diagram for h = 6.

The above describes only preferred embodiments of the present invention. A person of ordinary skill in the art can readily conceive of other advantages and variations based on these embodiments. The present invention is therefore not limited to the above embodiments, which serve only as a detailed, exemplary description of one form of the invention. Ordinary changes and substitutions made by a person of ordinary skill in the art within the scope of the technical solution, without departing from the spirit of the invention, shall fall within the protection scope of the present invention.

Claims (4)

1. An approximate hybrid divider circuit based on an array divider and a logarithmic divider, characterized by comprising a truncation module for controlling the output accuracy and hardware metrics of the circuit, an optimized exact array divider module, and a logarithmic divider module using an improved leading-bit detection technique; the divider circuit is a divider with a 16-bit dividend and an 8-bit divisor, and the final quotient is 16 bits; the upper (16-h) bits of the quotient of the divider circuit are generated by the exact array divider module, where h is the approximation depth and ranges from 8 to 16; the lower h bits of the quotient of the divider circuit are generated by the logarithmic divider module;
The truncation module takes the approximation depth h as its control input, and the value of h determines the allocation of the operand bits, so that the circuit is configured for different accuracy and hardware-resource requirements, and a good trade-off between computational accuracy and hardware performance is achieved by selecting a suitable approximation depth; the dividend is truncated according to the selected approximation depth h, the upper (16-h) bits are assigned to the exact array divider module, and the lower h bits are assigned to the logarithmic divider module;
The exact array divider module is formed by combining a plurality of exact array divider units, and each exact array divider unit is formed by a one-bit full subtractor and a data selector; the exact array divider module determines the quotient bit of the corresponding position from the sign of the partial remainder obtained by the subtraction in each row, and feeds the obtained quotient bit back to the preceding stage to control whether the partial remainder enters the computation of the next quotient bit; each row of exact array divider units generates one quotient bit, and the rows operate in sequence to finally obtain an upper quotient of (16-h) bits and a final remainder of 8 bits;
The 8-bit final remainder obtained by the exact array divider module and the lower h operand bits assigned to the logarithmic divider module by the truncation module are concatenated and input as the dividend of the logarithmic divider module; the logarithmic divider module first performs leading-bit detection on the operands to determine the position of the most significant "1", then converts the operands from binary to logarithmic form, so that the division is realized as a subtraction in the logarithmic domain whose result is the logarithm of the quotient, then converts the subtraction result from logarithmic form to binary, and finally shifts the converted result according to the detected positions of the leading "1" to obtain the lower quotient of h bits; the upper quotient of (16-h) bits generated by the exact array divider module and the lower quotient of h bits generated by the logarithmic divider module are concatenated and output as the final quotient.
2. A design method for an approximate hybrid divider based on a restoring array divider and a logarithmic divider, characterized by comprising the following steps:
(1) Constructing an approximate hybrid divider circuit as defined in claim 1;
(2) Letting the dividend be X with a bit width of 16 bits and the divisor be Y with a bit width of 8 bits; truncating the dividend according to the approximation depth h, assigning the upper (16-h) bits to the exact array divider module and the lower h bits to the logarithmic divider module; the dividend of the exact array divider module is defined as X1 with a bit width of (16-h) bits, the generated final remainder is defined as R1 with a bit width of 8 bits, and the generated upper quotient is defined as Q1 with a bit width of (16-h) bits; the part of the dividend assigned to the logarithmic divider module is defined as X2 with a bit width of h bits, and the generated lower quotient is defined as Q2 with a bit width of h bits; the final quotient is obtained by concatenating Q1 and Q2, and is defined as Q with a bit width of 16 bits;
(3) In the exact array divider module, a (16-h)-bit operand requires (16-h) row units, and each row unit subtracts the divisor from the dividend, namely X1-Y, proceeding from the most significant bit to the least significant bit; each row unit generates one upper quotient bit and feeds it back to the row unit, and the output of the partial remainder is controlled through a data selector; if the quotient bit obtained by the row is 1, the data selector selects the partial remainder as the input to the next row; if the quotient bit obtained by the row is 0, the data selector selects X1 as the input to the next row; the upper quotient obtained by the exact array divider module is $Q_1 = \lfloor X_1 / Y \rfloor$;
(4) Letting $T = R_1 \times 2^{h} + X_2$, the bit width of T being (8+h) bits; taking T as the dividend input of the logarithmic divider module, the divisor still being Y; first performing leading-bit detection on the operands to determine the position of the most significant "1", expressed as k1 and k2; then converting the operands from binary to logarithmic form, the operands first being written as $T = 2^{k_1}(1+m_1)$ and $Y = 2^{k_2}(1+m_2)$,
wherein k1 has a bit width of 4 bits, k2 has a bit width of 3 bits, and m1 and m2 are the mantissa parts, in the range [0, 1); thus, the lower quotient generated by the logarithmic divider module is $Q_2 = T / Y = 2^{k_1-k_2}(1+m_1)/(1+m_2)$;
Converting Q2 from binary to logarithmic form: $\log_2 Q_2 = k_1 - k_2 + \log_2(1+m_1) - \log_2(1+m_2)$; when $0 \le m < 1$, $\log_2(1+m) \approx m$, and substituting this into the previous expression gives the approximate expression $\log_2 Q_2 \approx k_1 - k_2 + m_1 - m_2$; then converting $\log_2 Q_2$ from logarithmic form to a binary number gives the lower quotient $Q_2 \approx 2^{k_1-k_2}(1 + m_1 - m_2)$;
and finally, concatenating the obtained Q1 and Q2 to output the final quotient Q.
3. The method for designing an approximate hybrid divider based on a recovery array and logarithmic divider according to claim 2, wherein the exact array divider module reduces the number of exact array divider units redundant therein according to the difference of approximate depths, and the specific implementation steps are as follows:
leading bit detection is performed on the divisor Y to detect the position k of the most significant '1';
the selected approximation depth value determines that the divisor bit width allocated to the exact array divider module is 16-h bits, and the exact high-order quotient is calculated using a (16-h)/(16-h+1) array structure;
according to the position information k and the array structure, if k > 16-h+1, then (16-h+1) bits of the divisor Y, taken starting from the highest bit, are used as the divisor; if k < 16-h+1, then (16-h+1) bits of the divisor Y, taken starting from the lowest bit, are used as the divisor.
4. The method for designing an approximate hybrid divider based on a restoring array and a logarithmic divider of claim 3, wherein the leading bit detection specifically comprises:
Defining the detected data bit width as 8 bits, and defining the data as a = a7a6a5a4a3a2a1a0; the maximum output bit width of the 8-bit leading bit detection is 3 bits, defined as b = b2b1b0;
comparing a7a6a5a4 with 0000, and if equal, b2=0; otherwise, b2=1;
If b2=0, then a3a2 is next compared to 00; if b2=1, then a7a6 is next compared to 00; if the comparison results are equal, b1=0; if not, b1=1;
If b2b1=00, then a1 is next compared to 0; if b2b1=01, then a3 is next compared to 0; if b2b1=10, then a5 is next compared to 0; if b2b1=11, then a7 is next compared to 0; if the comparison results are equal, b0=0; if not, b0=1;
based on the result of b2b1b0, the position of the final leading bit "1" is obtained.
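The three comparison stages of the leading bit detection above amount to a binary search over the 8 input bits. A minimal Python sketch is given below. The function name `leading_one_position` is hypothetical. The all-zero input, which the claim does not treat separately, simply yields 0 here and so cannot be told apart from an input of 1.

```python
def leading_one_position(a: int) -> int:
    """Return b = b2b1b0, the position of the most significant '1' of an 8-bit value."""
    if not 0 <= a <= 0xFF:
        raise ValueError("a must be an 8-bit value")

    # Stage 1: compare a7a6a5a4 with 0000.
    b2 = 1 if (a >> 4) & 0xF else 0

    # Stage 2: compare a7a6 (if b2 = 1) or a3a2 (if b2 = 0) with 00.
    pair = (a >> 6) & 0x3 if b2 else (a >> 2) & 0x3
    b1 = 1 if pair else 0

    # Stage 3: b2b1 = 00, 01, 10, 11 selects a1, a3, a5, a7 respectively.
    bit_index = 4 * b2 + 2 * b1 + 1
    b0 = (a >> bit_index) & 1

    return (b2 << 2) | (b1 << 1) | b0
```

For any non-zero 8-bit input the result agrees with Python's `a.bit_length() - 1`, which is a convenient reference when checking a hardware description against this behaviour.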
CN202010126403.6A 2020-02-27 2020-02-27 Approximate hybrid divider circuit based on array and logarithmic divider Active CN113312021B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010126403.6A CN113312021B (en) 2020-02-27 2020-02-27 Approximate hybrid divider circuit based on array and logarithmic divider

Publications (2)

Publication Number Publication Date
CN113312021A (en) 2021-08-27
CN113312021B (en) 2024-08-09

Family

ID=77370510

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010126403.6A Active CN113312021B (en) 2020-02-27 2020-02-27 Approximate hybrid divider circuit based on array and logarithmic divider

Country Status (1)

Country Link
CN (1) CN113312021B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115407965B (en) * 2022-11-01 2023-03-24 Nanjing University of Aeronautics and Astronautics A High Performance Approximate Divider and Error Compensation Method Based on Taylor Expansion

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1049922A (en) * 1989-09-02 1991-03-13 Tsinghua University A Redundancy Code High Speed Array Divider
US5272660A (en) * 1992-06-01 1993-12-21 Motorola, Inc. Method and apparatus for performing integer and floating point division using a single SRT divider in a data processor
US6549926B1 (en) * 1999-10-26 2003-04-15 Sun Microsystems, Inc. SRT divider having several bits of each partial remainder one-hot encoded to minimize the logic levels needed to estimate quotient bits
US20140195581A1 (en) * 2013-01-08 2014-07-10 Analog Devices, Inc. Fixed point division circuit utilizing floating point architecture

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Design of Unsigned Approximate Hybrid Dividers Based on Restoring Array and Logarithmic Dividers;WEIQIANG LIU;IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING;20200908;339-347 *

Also Published As

Publication number Publication date
CN113312021A (en) 2021-08-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant