
CN107103113A - Automated design method, device, and optimization method for neural network processors - Google Patents

Automated design method, device, and optimization method for neural network processors

Info

Publication number
CN107103113A
CN107103113A (application CN201710178281.3A)
Authority
CN
China
Prior art keywords
neural network
unit
data
hardware
network processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710178281.3A
Other languages
Chinese (zh)
Other versions
CN107103113B (en)
Inventor
韩银和
许浩博
王颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN201710178281.3A priority Critical patent/CN107103113B/en
Publication of CN107103113A publication Critical patent/CN107103113A/en
Priority to PCT/CN2018/080207 priority patent/WO2018171717A1/en
Application granted granted Critical
Publication of CN107103113B publication Critical patent/CN107103113B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • Devices For Executing Special Programs (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

The present invention proposes an automated design method, device, and optimization method for neural network processors. The method comprises: step 1, obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include the available hardware resources and the target running speed; step 2, according to the neural network model description file and the hardware resource constraint parameters, selecting a unit library from a pre-built neural network component library, and generating from the unit library the hardware description language code of a neural network processor corresponding to the neural network model; step 3, converting the hardware description language code into the hardware circuit of the neural network processor.

Description

Automated design method, device, and optimization method for neural network processors

Technical Field

The present invention relates to the technical field of neural network processor architecture, and in particular to an automated design method, device, and optimization method for neural network processors.

Background Art

The rapid development of deep learning and neural network technology has opened new avenues for large-scale data processing tasks. Novel neural network models perform remarkably well on complex, abstract problems, and new applications keep emerging in fields such as visual image processing, speech recognition, and intelligent robotics.

At present, real-time task analysis with deep neural networks mostly relies on large-scale high-performance processors or general-purpose graphics processors. These devices are costly and power-hungry; when applied to portable smart devices, they suffer from a series of problems including large circuit scale, high energy consumption, and expensive products. Therefore, for energy-efficient real-time processing in application areas such as embedded devices and small low-cost data centers, accelerating neural network model computation with a dedicated neural network processor, rather than in software, is a more effective solution. However, the topology and parameters of a neural network model change with the application scenario, and neural network models themselves evolve rapidly, so it is very difficult to provide a single general-purpose, high-efficiency neural network processor that serves all application scenarios and covers all neural network models. This greatly hinders high-level application developers in designing hardware acceleration solutions for different application requirements.

Existing neural network hardware acceleration technologies fall into two categories: application-specific integrated circuit (ASIC) chips and field-programmable gate arrays (FPGAs). Under the same process conditions, an ASIC chip runs fast with low power consumption, but its design flow is complex, its tape-out cycle is long, and its development cost is high, so it cannot keep pace with rapidly evolving neural network models. An FPGA offers flexible circuit configuration and a short development cycle, but its running speed is relatively low and its hardware overhead and power consumption are relatively large. Whichever acceleration technology is used, neural network model and algorithm developers must master hardware development techniques, including processor architecture design, hardware code writing, simulation verification, and place and route, in addition to understanding network topologies and data flow patterns. These techniques pose a high barrier for high-level application developers who focus on neural network models and structure design but lack hardware design skills. Therefore, to enable high-level developers to develop neural network applications efficiently, there is an urgent need for an automated design method and tool for neural network processors that supports multiple neural network models.

Summary of the Invention

To address the deficiencies of the prior art, the present invention proposes an automated design method, device, and optimization method for neural network processors.

The present invention proposes an automated design method for neural network processors, comprising:

Step 1: obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include the available hardware resources and the target running speed;

Step 2: according to the neural network model description file and the hardware resource constraint parameters, selecting a unit library from a pre-built neural network component library, and generating from the unit library the hardware description language code of a neural network processor corresponding to the neural network model;

Step 3: converting the hardware description language code into the hardware circuit of the neural network processor.

The neural network processor comprises a storage structure, a control structure, and a computation structure.

The neural network model description file comprises three parts: basic attributes, parameter description, and connection information, wherein the basic attributes include the layer name and layer type, the parameter description includes the number of output layers, the convolution kernel size, and the stride, and the connection information includes the connection name, connection direction, and connection type.

The reusable neural network unit library comprises two parts: hardware description files and configuration scripts.

The reusable neural network unit library includes a neuron unit, an accumulator unit, a pooling unit, a classifier unit, a local response normalization unit, a lookup table unit, an address generation unit, and a control unit.

The neural network processor comprises a main address generation unit, a data address generation unit, and a weight address generation unit.

The method further comprises determining the data path according to the user-specified neural network model and hardware resource constraint parameters, and determining the data resource sharing scheme according to the characteristics of the intermediate layers of the neural network;

generating the memory address access streams according to the hardware configuration and network characteristics, the address access streams being described as finite state machines;

and generating hardware description language code, which is then converted into the hardware circuit of the neural network processor.

The method further comprises generating a data storage map and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.

The present invention also provides an automated design device for neural network processors, comprising:

a data acquisition module for obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include the available hardware resources and the target running speed;

a hardware description language code generation module for selecting, according to the neural network model description file and the hardware resource constraint parameters, a unit library from a pre-built neural network component library, and generating from the unit library the hardware description language code of a neural network processor corresponding to the neural network model;

and a hardware circuit generation module for converting the hardware description language code into the hardware circuit of the neural network processor.

The neural network processor comprises a storage structure, a control structure, and a computation structure.

The neural network model description file comprises three parts: basic attributes, parameter description, and connection information, wherein the basic attributes include the layer name and layer type, the parameter description includes the number of output layers, the convolution kernel size, and the stride, and the connection information includes the connection name, connection direction, and connection type.

The reusable neural network unit library comprises two parts: hardware description files and configuration scripts.

The reusable neural network unit library includes a neuron unit, an accumulator unit, a pooling unit, a classifier unit, a local response normalization unit, a lookup table unit, an address generation unit, and a control unit.

The neural network processor comprises a main address generation unit, a data address generation unit, and a weight address generation unit.

The device further determines the data path according to the user-specified neural network model and hardware resource constraint parameters, and determines the data resource sharing scheme according to the characteristics of the intermediate layers of the neural network;

generates the memory address access streams according to the hardware configuration and network characteristics, the address access streams being described as finite state machines;

and generates hardware description language code, which is then converted into the hardware circuit of the neural network processor.

The device further generates a data storage map and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.

The present invention further proposes an optimization method based on the aforementioned automated design method for neural network processors, comprising:

Step 1: defining the convolution kernel size as k*k, the stride as s, the memory width as d, and the number of feature maps as t; if k^2 = d^2, partitioning the data into blocks of size k*k, so that the data width matches the memory width and the data is stored contiguously in memory;

Step 2: if k^2 != d^2 and the stride s is the greatest common divisor of the kernel size k and the memory width d, partitioning the data into blocks of size s*s, so that the data within one feature map is stored contiguously in memory;

Step 3: if neither of the above conditions holds, computing the greatest common divisor f of the stride s, the kernel size k, and the memory width d, partitioning the data into blocks of size f*f, and storing the t feature maps in an interleaved fashion.

From the above solutions, the advantages of the present invention are as follows:

The present invention can map a neural network model to a hardware circuit, automatically optimize the circuit structure and data storage layout according to the hardware resource constraints and network characteristics, and simultaneously generate the corresponding control instruction stream. It realizes automated hardware/software co-design of neural network hardware accelerators, shortening the design cycle of neural network processors while improving their computational energy efficiency.

Brief Description of the Drawings

Fig. 1 is a workflow diagram of the FPGA automatic implementation tool for the neural network processor provided by the present invention;

Fig. 2 is a schematic diagram of a neural network processor system that can be automatically generated by the present invention;

Fig. 3 is a schematic diagram of the reusable neural network unit library used by the present invention;

Fig. 4 is a schematic diagram of the interface of the address generation circuit used by the present invention.

Detailed Description

To make the objectives, technical solutions, design methods, and advantages of the present invention clearer, the present invention is described in further detail below through specific embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.

The present invention aims to provide an automated design method, device, and optimization method for neural network processors. The device comprises a hardware generator and a compiler. The hardware generator can automatically generate the hardware description language code of a neural network processor according to the neural network type and the hardware resource constraints; the designer then produces the processor hardware circuit from the hardware description language using existing hardware circuit design methods. The compiler generates the control and data scheduling instruction stream according to the circuit structure of the neural network processor.

Fig. 1 is a schematic diagram of the automated neural network processor generation technique provided by the present invention. The specific steps are:

Step 1: the device of the present invention reads the neural network model description file, which includes the network topology and the definition of each computation layer;

Step 2: the device reads in the hardware resource constraint parameters, which include the available hardware resources and the target running speed; the device can generate a corresponding circuit structure from these constraint parameters;

Step 3: the device indexes a suitable unit library from the pre-built neural network component library according to the neural network model description script and the hardware resource constraints, and the hardware circuit generator contained in the tool uses this unit library to generate the hardware description language code of a neural network processor corresponding to the neural network model;

Step 4: the compiler contained in the device generates the data storage map and the control instruction stream according to the neural network model, the logic resource constraints, and the generated hardware description language code;

Step 5: the hardware description language is converted into a hardware circuit by existing hardware design methods.
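The five steps above can be sketched as a toy pipeline. This is an illustrative sketch only, assuming simple dictionary-based representations for the model description and the constraints; every function name, field name, and the one-line-per-layer syntax are invented here and are not the patent's actual tool.

```python
# Illustrative sketch of the five-step generation flow described above.
# All data structures and names are assumptions, not the patent's tool.

def parse_model_description(text):
    """Step 1: read the network topology and per-layer definitions."""
    layers = []
    for line in text.strip().splitlines():
        name, ltype, params = line.split(";")
        layers.append({"name": name, "type": ltype, "params": params})
    return layers

def generate_processor(layers, constraints, unit_library):
    """Steps 2-3: index units that fit the constraints and emit HDL stubs."""
    hdl = []
    for layer in layers:
        unit = unit_library[layer["type"]]  # look up the component library
        hdl.append(f"// instantiate {unit} for {layer['name']}")
    return "\n".join(hdl)

def compile_instructions(layers, constraints):
    """Step 4: emit a control instruction stream, one entry per layer."""
    return [f"RUN {layer['name']}" for layer in layers]

# Hypothetical inputs: a two-layer model and some resource constraints.
unit_library = {"conv": "neuron_unit", "pool": "pooling_unit"}
model = "conv1;conv;k=3\npool1;pool;k=2"
constraints = {"dsp_slices": 128, "target_mhz": 200}

layers = parse_model_description(model)
hdl = generate_processor(layers, constraints, unit_library)
isa = compile_instructions(layers, constraints)
```

Step 5 (synthesis into an actual circuit) is carried out by conventional EDA tooling and is outside what a sketch like this can show.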

The neural network processor that the present invention can generate automatically is based on a storage-control-computation structure:

the storage structure stores the data involved in computation, the neural network weights, and the processor operation instructions;

the control structure comprises a decoding circuit and control logic, which parse the operation instructions and generate the control signals that govern the scheduling and storage of on-chip data and the neural network computation process;

the computation structure comprises the computing units that carry out the neural network computation operations in the processor.

Fig. 2 is a schematic diagram of the neural network processor system 101 that can be automatically generated by the present invention. The architecture of the neural network processor system 101 consists of the following parts: an input data storage unit 102, a control unit 103, an output data storage unit 104, a weight storage unit 105, an instruction storage unit 106, and a computation unit 107.

The input data storage unit 102 stores the data involved in computation, including the original feature map data and the data involved in intermediate-layer computation; the output data storage unit 104 stores the computed neuron response values; the instruction storage unit 106 stores the instruction information involved in computation, the instructions being parsed into a control flow that schedules the neural network computation; the weight storage unit 105 stores the trained neural network weights.

The control unit 103 is connected to the output data storage unit 104, the weight storage unit 105, the instruction storage unit 106, and the computation unit 107. The control unit 103 fetches the instructions held in the instruction storage unit 106, parses them, and controls the computation unit to perform the neural network computation according to the control signals obtained from parsing.

The computation unit 107 performs the corresponding neural network computations according to the control signals generated by the control unit 103. The computation unit 107 is associated with one or more storage units: it can obtain data for computation from the data storage components of its associated input data storage unit 102, and can write data to its associated output data storage unit 104. The computation unit 107 carries out most of the operations of the neural network algorithm, i.e., vector multiply-accumulate operations and the like.

The present invention describes the characteristics of a neural network model through the provided neural network description file format. The description file comprises three parts: basic attributes, parameter description, and connection information, wherein the basic attributes include the layer name and layer type, the parameter description includes the number of output layers, the convolution kernel size, and the stride, and the connection information includes the connection name, connection direction, and connection type.
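The patent does not fix a concrete file syntax for this three-part format. Purely as an illustration, a single layer entry could be represented and checked as follows; all key names and the validation logic are assumptions made for this sketch.

```python
# Hypothetical layer entry following the three-part format described above:
# basic attributes, parameter description, connection information.
layer_entry = {
    "basic": {"layer_name": "conv1", "layer_type": "convolution"},
    "params": {"output_maps": 32, "kernel_size": 3, "stride": 1},
    "connections": [
        {"name": "data_in", "direction": "in", "type": "feature_map"},
        {"name": "data_out", "direction": "out", "type": "feature_map"},
    ],
}

def validate_layer(entry):
    """Check that an entry carries the three required parts."""
    for part in ("basic", "params", "connections"):
        if part not in entry:
            raise ValueError(f"missing section: {part}")
    if entry["params"]["kernel_size"] <= 0:
        raise ValueError("kernel size must be positive")
    return True

validate_layer(layer_entry)  # passes for a well-formed entry
```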

To accommodate the hardware implementation of various neural network models, the present invention provides the reusable neural network unit library shown in Fig. 3. The unit library comprises two parts: hardware description files and configuration scripts. The reusable unit library provided by the present invention includes, but is not limited to: a neuron unit, an accumulator unit, a pooling unit, a classifier unit, a local response normalization unit, a lookup table unit, an address generation unit, a control unit, and so on.

When composing a neural network processor system from the above reusable unit library, the present invention reads the neural network model description file and the hardware resource constraints in order to select and configure units from the library appropriately.

During operation, the neural network processor must automatically obtain the address streams for on-chip and off-chip memory data. In the present invention, the memory address streams are determined and generated by the compiler, and the memory access patterns determined by those address streams are passed to the hardware generator as text. The memory access patterns include the main access pattern, the data access pattern, the weight access pattern, and so on.

The hardware generator generates the address generation units (AGUs) according to the memory access patterns.

The neural network processor circuit designed with the automated design tool provided by the present invention includes three types of address generation units: a main address generation unit, a data address generation unit, and a weight address generation unit. The main address generation unit handles data exchange between on-chip and off-chip memory; the data address generation unit handles two kinds of data exchange, reading data from on-chip memory into the computation unit and storing the computation unit's intermediate and final results back into the storage unit; the weight address generation unit reads weight data from on-chip memory into the computation unit.

In the present invention, the hardware circuit generator and the compiler work together to design the address generation circuit. The specific algorithm steps are:

Step 1: the device determines the data path according to the neural network model and the hardware constraints specified by the designer, and determines the data resource sharing scheme according to the characteristics of the intermediate layers of the neural network;

Step 2: the compiler generates the memory address access streams according to the hardware configuration and network characteristics, the address access streams being described by the compiler as finite state machines;

Step 3: the finite state machines are mapped by the hardware generator into the hardware description language of the address generation circuit, which is in turn mapped into a hardware circuit through hardware circuit design methods.
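As an illustration of step 2, a compiler-side address access stream can be described as a small finite state machine that walks a sliding kernel window over a row-major feature map. The state names, parameters, and the particular traversal order here are assumptions for the sketch, not the patent's actual FSM encoding.

```python
# Toy finite-state-machine description of a read-address stream for one
# feature map, of the kind the compiler could hand to the hardware
# generator. States and parameters are illustrative assumptions.

def address_stream(base, width, height, kernel, stride):
    """Yield read addresses for each kernel window (row-major layout)."""
    state = "NEXT_WINDOW"
    row = col = 0
    while state != "DONE":
        if state == "NEXT_WINDOW":
            # emit the addresses covered by the current kernel window
            for kr in range(kernel):
                for kc in range(kernel):
                    yield base + (row + kr) * width + (col + kc)
            state = "ADVANCE"
        elif state == "ADVANCE":
            col += stride
            if col + kernel > width:   # wrap to the next row of windows
                col = 0
                row += stride
            state = "DONE" if row + kernel > height else "NEXT_WINDOW"

# 2x2 windows over a 4x4 map with stride 2 -> 4 windows, 4 addresses each
stream = list(address_stream(base=0, width=4, height=4, kernel=2, stride=2))
```

The hardware generator would then map such a state machine onto registers and comparators in HDL (step 3); the Python form only documents the intended address sequence.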

Fig. 4 is a schematic diagram of the general structure of the address generation circuit provided by the present invention. The address generation circuit has a general-purpose signal interface, which comprises the following signals:

a start address signal, giving the first address of the data;

a data block size signal, giving the amount of data fetched in one access;

a memory flag signal, identifying the memory in which the data is stored;

a working mode signal, distinguishing the large-kernel fetch mode, small-kernel fetch mode, pooling mode, full convolution mode, and so on;

a convolution kernel size signal, defining the size of the convolution kernel;

a length signal, defining the size of the output image;

an input layer count signal, indicating the number of input layers;

an output layer count signal, indicating the number of output layers;

a reset signal, which initializes the address generation circuit when set to 1;

a write enable signal, directing the accessed memory to perform a write operation;

a read enable signal, directing the accessed memory to perform a read operation;

an address signal, giving the address of the memory access;

an end signal, marking the end of the access.
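The signal list above can be summarized as a plain record. A minimal sketch follows; the field names, default values, and mode strings are invented for illustration and are not taken from the patent's HDL.

```python
from dataclasses import dataclass

# Sketch of the generic AGU interface signals listed above; names and
# defaults are illustrative assumptions.
@dataclass
class AguInterface:
    start_addr: int = 0         # first address of the data
    block_size: int = 0         # amount of data fetched in one access
    mem_select: int = 0         # which memory the data lives in
    mode: str = "large_kernel"  # large_kernel / small_kernel / pooling / full_conv
    kernel_size: int = 3        # convolution kernel size
    length: int = 0             # output image size
    in_layers: int = 1          # number of input layers
    out_layers: int = 1         # number of output layers
    reset: int = 0              # 1 initializes the address generator
    write_en: int = 0           # drive a write access
    read_en: int = 0            # drive a read access
    addr: int = 0               # current access address
    done: int = 0               # access-finished flag

agu = AguInterface(start_addr=0x1000, block_size=64, mode="pooling", read_en=1)
```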

These parameters ensure that the AGU supports multiple working modes and generates correct read and write address streams in every working mode and throughout neural network propagation.

For different target networks, the tool selects the necessary parameters from the template to build the address generator and to provide the on-chip and off-chip memory access patterns.

The neural network processor provided by the present invention is built with a data-driven architecture, so the address generation circuit not only provides access addresses but also drives the execution of the different neural layers and of the data blocks within each layer.

Because of resource constraints, a neural network model cannot be fully unrolled according to its model description when mapped to a hardware circuit. The automated design tool proposed by the present invention therefore optimizes the data storage and access mechanism through hardware/software cooperation, in two parts: first, the compiler analyzes the computational throughput and on-chip memory size of the neural network processor and partitions the neural network feature data and weight data into appropriately sized blocks for centralized storage and access; second, the data within each block is further partitioned according to the scale of the computation units, the memory width, and the data bit width.

Based on the above optimization mechanism, the present invention proposes an optimization method for data storage and access. The specific steps are:

Step 1: Let the convolution kernel size be k*k, the stride be s, the memory width be d, and the number of feature maps be t. If k^2 = d^2, partition the data into blocks of size k*k; the data width then matches the memory width, so the data is stored contiguously in memory.

Step 2: If k^2 != d^2 and the stride s is the greatest common divisor of k and d, partition the data into blocks of size s*s, so that within a single feature map the data can be stored contiguously in memory.

Step 3: If neither condition holds, compute the greatest common divisor f of s, k, and d, partition the data into blocks of size f*f, and store the t feature maps in an interleaved fashion.
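The three cases above amount to a block-size selection rule. It can be sketched as a short routine (an illustrative Python sketch following the definitions of k, s, d in Step 1; the function name is ours, not part of the patent):

```python
from math import gcd

def choose_block_size(k, s, d):
    """Select a data-block side length for storing feature maps.

    k: convolution kernel side length (kernel is k*k)
    s: convolution stride
    d: memory width
    """
    if k * k == d * d:           # Step 1: kernel area equals memory width squared
        return k
    if gcd(k, d) == s:           # Step 2: stride is the gcd of k and d
        return s
    return gcd(gcd(s, k), d)     # Step 3: fall back to gcd of s, k, d
```

For example, with k=4, s=2, d=6 the stride 2 is gcd(4, 6), so Step 2 applies and the block side is 2; with k=3, s=2, d=4 neither condition holds and the side falls back to gcd(2, 3, 4) = 1.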

The computation data of a neural network consists of the input feature data and the trained weight data. A good data storage layout reduces the processor's internal data bandwidth and improves the utilization of storage space. The automatic design tool provided by the invention improves the processor's computational efficiency by increasing the locality of processor data storage.

In summary, the present invention provides an automated design tool for neural network processors. The tool maps a neural network model to hardware description code for a neural network processor, optimizes the processor architecture under hardware resource constraints, and automatically generates control-flow instructions. It thereby automates the design of neural network processors, shortens their design cycle, and suits the application characteristics of neural network technology: rapidly evolving network models, high computing speed requirements, and high energy efficiency requirements.

It should be understood that although this specification is presented in terms of individual embodiments, each embodiment does not necessarily contain only one independent technical solution; this manner of presentation is adopted only for clarity. Those skilled in the art should treat the specification as a whole, and the technical solutions of the various embodiments may be suitably combined to form other implementations understandable to those skilled in the art.

The present invention also proposes an automated design apparatus for neural network processors, comprising:

a data acquisition module, configured to obtain a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include the hardware resource size and the target operating speed;

a hardware description language code generation module, configured to look up a unit library in a pre-built neural network component library according to the neural network model description file and the hardware resource constraint parameters, and to generate, from the unit library, hardware description language code for a neural network processor corresponding to the neural network model;

a hardware circuit generation module, configured to convert the hardware description language code into the hardware circuit of the neural network processor.

The neural network processor comprises a storage structure, a control structure, and a computing structure.

The neural network model description file comprises three parts: basic attributes, parameter descriptions, and connection information. The basic attributes include the layer name and layer type; the parameter descriptions include the number of output layers, the convolution kernel size, and the stride; the connection information includes the connection name, connection direction, and connection type.
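To make the three-part structure of the description file concrete, a layer entry could be modeled as follows (an illustrative sketch; the field names and the example values are hypothetical, since the patent does not fix a concrete file syntax):

```python
from dataclasses import dataclass, field

@dataclass
class LayerDescription:
    # Basic attributes
    name: str
    layer_type: str              # e.g. "conv", "pool", "fc"
    # Parameter description
    output_channels: int
    kernel_size: int
    stride: int
    # Connection information: (connection name, direction, type) tuples
    connections: list = field(default_factory=list)

# A hypothetical convolutional layer entry
conv1 = LayerDescription(
    name="conv1", layer_type="conv",
    output_channels=32, kernel_size=3, stride=1,
    connections=[("data", "in", "feature"), ("conv1_out", "out", "feature")],
)
```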

The neural network processor comprises a main address generation unit, a data address generation unit, and a weight address generation unit.

The apparatus further determines the data path according to the neural network model and hardware resource constraint parameters specified by the user, and determines the data resource sharing scheme according to the characteristics of the intermediate layers of the neural network;

it generates the memory's address access stream according to the hardware configuration and network characteristics, the address access stream being described as a finite state machine;

The neural network reusable unit library comprises two parts: hardware description files and configuration scripts.

The neural network reusable unit library comprises a neuron unit, an accumulator unit, a pooling unit, a classifier unit, a local response normalization unit, a lookup table unit, an address generation unit, and a control unit.

The finite state machine is mapped to addresses, and hardware description language code is generated, which is then converted into the hardware circuit of the neural network processor.
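An address access stream described as a finite state machine can be sketched as follows (an illustrative Python model; the two state names and the row-major traversal are assumptions for the example, not the patent's concrete FSM design):

```python
def address_stream(base, rows, cols, row_stride):
    """Yield read addresses for a rows*cols block stored with a row stride.

    A two-state FSM: NEXT_ROW latches the base address of the current row,
    EMIT walks its columns; the stream ends when all rows are exhausted.
    """
    state, r, c = "NEXT_ROW", 0, 0
    row_base = base
    while True:
        if state == "NEXT_ROW":
            if r == rows:
                return               # all rows visited: end of access stream
            row_base = base + r * row_stride
            c = 0
            state = "EMIT"
        elif state == "EMIT":
            yield row_base + c       # one address per cycle
            c += 1
            if c == cols:
                r += 1
                state = "NEXT_ROW"
```

For a 2x3 block at base address 0 with a row stride of 8, the generator emits the stream 0, 1, 2, 8, 9, 10; in hardware this state transition logic would be what the tool emits as HDL.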

The apparatus further generates a data storage map and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.

The above is merely an illustrative embodiment of the present invention and is not intended to limit its scope. Any equivalent changes, modifications, and combinations made by those skilled in the art without departing from the concept and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (17)

1. An automated design method for neural network processors, characterized in that it comprises:

Step 1: obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include the hardware resource size and the target operating speed;

Step 2: according to the neural network model description file and the hardware resource constraint parameters, looking up a unit library in a pre-built neural network component library, and generating, from the unit library, hardware description language code for a neural network processor corresponding to the neural network model;

Step 3: converting the hardware description language code into the hardware circuit of the neural network processor.

2. The automated design method for neural network processors according to claim 1, characterized in that the neural network processor comprises a storage structure, a control structure, and a computing structure.

3. The automated design method for neural network processors according to claim 1, characterized in that the neural network model description file comprises three parts: basic attributes, parameter descriptions, and connection information, wherein the basic attributes include the layer name and layer type, the parameter descriptions include the number of output layers, the convolution kernel size, and the stride, and the connection information includes the connection name, connection direction, and connection type.

4. The automated design method for neural network processors according to claim 1, characterized in that the neural network reusable unit library comprises two parts: hardware description files and configuration scripts.

5. The automated design method for neural network processors according to claim 1, characterized in that the neural network reusable unit library comprises a neuron unit, an accumulator unit, a pooling unit, a classifier unit, a local response normalization unit, a lookup table unit, an address generation unit, and a control unit.

6. The automated design method for neural network processors according to claim 1, characterized in that the neural network processor comprises a main address generation unit, a data address generation unit, and a weight address generation unit.

7. The automated design method for neural network processors according to claim 1, characterized in that it further comprises: determining the data path according to the neural network model and hardware resource constraint parameters specified by the user, and determining the data resource sharing scheme according to the characteristics of the intermediate layers of the neural network; generating the memory's address access stream according to the hardware configuration and network characteristics, the address access stream being described as a finite state machine; and generating hardware description language code, which is then converted into the hardware circuit of the neural network processor.

8. The automated design method for neural network processors according to claim 1, characterized in that it further comprises generating a data storage map and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.

9. An automated design apparatus for neural network processors, characterized in that it comprises:

a data acquisition module, configured to obtain a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include the hardware resource size and the target operating speed;

a hardware description language code generation module, configured to look up a unit library in a pre-built neural network component library according to the neural network model description file and the hardware resource constraint parameters, and to generate, from the unit library, hardware description language code for a neural network processor corresponding to the neural network model;

a hardware circuit generation module, configured to convert the hardware description language code into the hardware circuit of the neural network processor.

10. The automated design apparatus for neural network processors according to claim 9, characterized in that the neural network processor comprises a storage structure, a control structure, and a computing structure.

11. The automated design apparatus for neural network processors according to claim 9, characterized in that the neural network model description file comprises three parts: basic attributes, parameter descriptions, and connection information, wherein the basic attributes include the layer name and layer type, the parameter descriptions include the number of output layers, the convolution kernel size, and the stride, and the connection information includes the connection name, connection direction, and connection type.

12. The automated design apparatus for neural network processors according to claim 9, characterized in that the neural network reusable unit library comprises two parts: hardware description files and configuration scripts.

13. The automated design apparatus for neural network processors according to claim 9, characterized in that the neural network reusable unit library comprises a neuron unit, an accumulator unit, a pooling unit, a classifier unit, a local response normalization unit, a lookup table unit, an address generation unit, and a control unit.

14. The automated design apparatus for neural network processors according to claim 9, characterized in that the neural network processor comprises a main address generation unit, a data address generation unit, and a weight address generation unit.

15. The automated design apparatus for neural network processors according to claim 9, characterized in that it further comprises: determining the data path according to the neural network model and hardware resource constraint parameters specified by the user, and determining the data resource sharing scheme according to the characteristics of the intermediate layers of the neural network; generating the memory's address access stream according to the hardware configuration and network characteristics, the address access stream being described as a finite state machine; and generating hardware description language code, which is then converted into the hardware circuit of the neural network processor.

16. The automated design apparatus for neural network processors according to claim 9, characterized in that it further comprises generating a data storage map and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.

17. An optimization method based on the automated design method for neural network processors according to any one of claims 1-8, characterized in that it comprises:

Step 1: let the convolution kernel size be k*k, the stride be s, the memory width be d, and the number of feature maps be t; if k^2 = d^2, partition the data into blocks of size k*k, the data width matching the memory width so that the data is stored contiguously in memory;

Step 2: if k^2 != d^2 and the stride s is the greatest common divisor of k and the memory width d, partition the data into blocks of size s*s, ensuring that within one feature map the data is stored contiguously in memory;

Step 3: if neither of the above conditions holds, compute the greatest common divisor f of the stride s, k, and the memory width d, partition the data into blocks of size f*f, and store the t feature maps interleaved.
CN201710178281.3A 2017-03-23 2017-03-23 The Automation Design method, apparatus and optimization method towards neural network processor Active CN107103113B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710178281.3A CN107103113B (en) 2017-03-23 2017-03-23 The Automation Design method, apparatus and optimization method towards neural network processor
PCT/CN2018/080207 WO2018171717A1 (en) 2017-03-23 2018-03-23 Automated design method and system for neural network processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710178281.3A CN107103113B (en) 2017-03-23 2017-03-23 The Automation Design method, apparatus and optimization method towards neural network processor

Publications (2)

Publication Number Publication Date
CN107103113A true CN107103113A (en) 2017-08-29
CN107103113B CN107103113B (en) 2019-01-11

Family

ID=59676152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710178281.3A Active CN107103113B (en) 2017-03-23 2017-03-23 The Automation Design method, apparatus and optimization method towards neural network processor

Country Status (2)

Country Link
CN (1) CN107103113B (en)
WO (1) WO2018171717A1 (en)

Cited By (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107341761A (en) * 2017-07-12 2017-11-10 成都品果科技有限公司 A kind of calculating of deep neural network performs method and system
CN107633295A (en) * 2017-09-25 2018-01-26 北京地平线信息技术有限公司 For the method and apparatus for the parameter for being adapted to neutral net
CN108154229A (en) * 2018-01-10 2018-06-12 西安电子科技大学 Accelerate the image processing method of convolutional neural networks frame based on FPGA
CN108388943A (en) * 2018-01-08 2018-08-10 中国科学院计算技术研究所 A kind of pond device and method suitable for neural network
CN108389183A (en) * 2018-01-24 2018-08-10 上海交通大学 Pulmonary nodule detects neural network accelerator and its control method
CN108563808A (en) * 2018-01-05 2018-09-21 中国科学技术大学 The design method of heterogeneous reconfigurable figure computation accelerator system based on FPGA
WO2018171717A1 (en) * 2017-03-23 2018-09-27 中国科学院计算技术研究所 Automated design method and system for neural network processor
CN108921289A (en) * 2018-06-20 2018-11-30 郑州云海信息技术有限公司 A kind of FPGA isomery accelerated method, apparatus and system
CN109685203A (en) * 2018-12-21 2019-04-26 北京中科寒武纪科技有限公司 Data processing method, device, computer system and storage medium
CN109697509A (en) * 2017-10-24 2019-04-30 上海寒武纪信息科技有限公司 Processing method and processing device, operation method and device
CN109726805A (en) * 2017-10-30 2019-05-07 上海寒武纪信息科技有限公司 The method for carrying out neural network processor design using black box simulator
CN109726797A (en) * 2018-12-21 2019-05-07 北京中科寒武纪科技有限公司 Data processing method, device, computer system and storage medium
CN109739802A (en) * 2019-04-01 2019-05-10 上海燧原智能科技有限公司 Computing cluster and computing cluster configuration method
CN109754084A (en) * 2018-12-29 2019-05-14 北京中科寒武纪科技有限公司 Processing method, device and the Related product of network structure
CN109754073A (en) * 2018-12-29 2019-05-14 北京中科寒武纪科技有限公司 Data processing method, device, electronic equipment and readable storage medium storing program for executing
CN109978160A (en) * 2019-03-25 2019-07-05 北京中科寒武纪科技有限公司 Configuration device, method and the Related product of artificial intelligence process device
CN109993288A (en) * 2017-12-29 2019-07-09 北京中科寒武纪科技有限公司 Processing with Neural Network method, computer system and storage medium
CN110097179A (en) * 2018-01-29 2019-08-06 上海寒武纪信息科技有限公司 Computer equipment, data processing method and storage medium
CN110097180A (en) * 2018-01-29 2019-08-06 上海寒武纪信息科技有限公司 Computer equipment, data processing method and storage medium
CN110955380A (en) * 2018-09-21 2020-04-03 中科寒武纪科技股份有限公司 Access data generation method, storage medium, computer device and apparatus
WO2020078446A1 (en) * 2018-10-19 2020-04-23 中科寒武纪科技股份有限公司 Computation method and apparatus, and related product
CN111079909A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079914A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079924A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111078293A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, device and related product
CN111079912A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079910A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, device and related product
CN111079907A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, device and related product
CN111079911A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079916A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111126572A (en) * 2019-12-26 2020-05-08 北京奇艺世纪科技有限公司 Model parameter processing method and device, electronic equipment and storage medium
CN111144561A (en) * 2018-11-05 2020-05-12 杭州海康威视数字技术股份有限公司 Neural network model determining method and device
WO2020093885A1 (en) * 2018-11-09 2020-05-14 北京灵汐科技有限公司 Heterogeneous collaborative computing system
CN111325311A (en) * 2018-12-14 2020-06-23 深圳云天励飞技术有限公司 Neural network model generation method, device, electronic device and storage medium
CN111339027A (en) * 2020-02-25 2020-06-26 中国科学院苏州纳米技术与纳米仿生研究所 Automatic design method of reconfigurable artificial intelligence core and heterogeneous multi-core chip
CN111488969A (en) * 2020-04-03 2020-08-04 北京思朗科技有限责任公司 Execution optimization method and device based on neural network accelerator
KR20200100528A (en) * 2017-12-29 2020-08-26 캠브리콘 테크놀로지스 코퍼레이션 리미티드 Neural network processing method, computer system and storage medium
CN111868754A (en) * 2018-03-23 2020-10-30 索尼公司 Information processing apparatus and information processing method
CN111931926A (en) * 2020-10-12 2020-11-13 南京风兴科技有限公司 Hardware acceleration system and control method for convolutional neural network CNN
CN111949405A (en) * 2020-08-13 2020-11-17 Oppo广东移动通信有限公司 Resource scheduling method, hardware accelerator and electronic device
CN112052943A (en) * 2019-06-05 2020-12-08 三星电子株式会社 Electronic device and method for performing operation of the same
CN112132271A (en) * 2019-06-25 2020-12-25 Oppo广东移动通信有限公司 Neural network accelerator operation method, architecture and related device
CN112912837A (en) * 2018-11-08 2021-06-04 北京比特大陆科技有限公司 Neural network compiling method, device, equipment, storage medium and program product
US11113104B2 (en) 2017-11-20 2021-09-07 Shanghai Cambricon Information Technology Co., Ltd Task parallel processing method, apparatus and system, storage medium and computer device
US11403069B2 (en) 2017-07-24 2022-08-02 Tesla, Inc. Accelerated mathematical engine
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11487288B2 (en) 2017-03-23 2022-11-01 Tesla, Inc. Data synthesis for autonomous control systems
US11521046B2 (en) 2017-11-08 2022-12-06 Samsung Electronics Co., Ltd. Time-delayed convolutions for neural network device and method
CN115462079A (en) * 2019-08-13 2022-12-09 深圳鲲云信息科技有限公司 Neural network data stream acceleration method and device, computer equipment and storage medium
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11665108B2 (en) 2018-10-25 2023-05-30 Tesla, Inc. QoS manager for system on a chip communications
US11681899B2 (en) 2018-12-07 2023-06-20 Samsong Electronics Co., Ltd. Dividing neural networks
US11681649B2 (en) 2017-07-24 2023-06-20 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US11734562B2 (en) 2018-06-20 2023-08-22 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11748620B2 (en) 2019-02-01 2023-09-05 Tesla, Inc. Generating ground truth for machine learning from time series elements
US12014553B2 (en) 2019-02-01 2024-06-18 Tesla, Inc. Predicting three-dimensional features for autonomous driving

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
US11361457B2 (en) 2018-07-20 2022-06-14 Tesla, Inc. Annotation cross-labeling for autonomous control systems
JP7539872B2 (en) 2018-10-11 2024-08-26 テスラ,インコーポレイテッド SYSTEM AND METHOD FOR TRAINING MACHINE MODELS WITH AUGMENTED DATA - Patent application
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US10956755B2 (en) 2019-02-19 2021-03-23 Tesla, Inc. Estimating object properties using visual image data
US12112112B2 (en) 2020-11-12 2024-10-08 Samsung Electronics Co., Ltd. Method for co-design of hardware and neural network architectures using coarse-to-fine search, two-phased block distillation and neural hardware predictor
JP2023032348A (en) * 2021-08-26 2023-03-09 国立大学法人 東京大学 Information processing device, and program

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106022468A (en) * 2016-05-17 2016-10-12 成都启英泰伦科技有限公司 Artificial neural network processor integrated circuit and design method therefor
WO2016179533A1 (en) * 2015-05-06 2016-11-10 Indiana University Research And Technology Corporation Sensor signal processing using an analog neural network
CN106447034A (en) * 2016-10-27 2017-02-22 中国科学院计算技术研究所 Neutral network processor based on data compression, design method and chip
CN106529670A (en) * 2016-10-27 2017-03-22 中国科学院计算技术研究所 Neural network processor based on weight compression, design method, and chip

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107103113B (en) * 2017-03-23 2019-01-11 中国科学院计算技术研究所 The Automation Design method, apparatus and optimization method towards neural network processor

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016179533A1 (en) * 2015-05-06 2016-11-10 Indiana University Research And Technology Corporation Sensor signal processing using an analog neural network
CN106022468A (en) * 2016-05-17 2016-10-12 成都启英泰伦科技有限公司 Artificial neural network processor integrated circuit and design method therefor
CN106447034A (en) * 2016-10-27 2017-02-22 中国科学院计算技术研究所 Neutral network processor based on data compression, design method and chip
CN106529670A (en) * 2016-10-27 2017-03-22 中国科学院计算技术研究所 Neural network processor based on weight compression, design method, and chip

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YING WANG等: "DeepBurning: Automatic generation of FPGA-based learning accelerators for the Neural Network family", 《DESIGN AUTOMATION CONFERENCE》 *
叶莉娅等: "基于神经网络嵌入式系统体系结构的研究", 《杭州电子科技大学学报》 *

Cited By (93)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018171717A1 (en) * 2017-03-23 2018-09-27 中国科学院计算技术研究所 Automated design method and system for neural network processor
US11487288B2 (en) 2017-03-23 2022-11-01 Tesla, Inc. Data synthesis for autonomous control systems
US12020476B2 (en) 2017-03-23 2024-06-25 Tesla, Inc. Data synthesis for autonomous control systems
CN107341761A (en) * 2017-07-12 2017-11-10 成都品果科技有限公司 A kind of calculating of deep neural network performs method and system
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11403069B2 (en) 2017-07-24 2022-08-02 Tesla, Inc. Accelerated mathematical engine
US12086097B2 (en) 2017-07-24 2024-09-10 Tesla, Inc. Vector computational unit
US11681649B2 (en) 2017-07-24 2023-06-20 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
CN107633295A (en) * 2017-09-25 2018-01-26 北京地平线信息技术有限公司 For the method and apparatus for the parameter for being adapted to neutral net
US11461632B2 (en) 2017-09-25 2022-10-04 Nanjing Horizon Robotics Technology Co., Ltd. Method and apparatus for adapting parameters of neural network
CN109697509B (en) * 2017-10-24 2020-10-20 上海寒武纪信息科技有限公司 Processing method and device, and operation method and device
CN109697509A (en) * 2017-10-24 2019-04-30 上海寒武纪信息科技有限公司 Processing method and processing device, operation method and device
CN109726805A (en) * 2017-10-30 2019-05-07 上海寒武纪信息科技有限公司 The method for carrying out neural network processor design using black box simulator
CN109726805B (en) * 2017-10-30 2021-02-09 上海寒武纪信息科技有限公司 Method for designing neural network processor by using black box simulator
US11521046B2 (en) 2017-11-08 2022-12-06 Samsung Electronics Co., Ltd. Time-delayed convolutions for neural network device and method
US11113104B2 (en) 2017-11-20 2021-09-07 Shanghai Cambricon Information Technology Co., Ltd Task parallel processing method, apparatus and system, storage medium and computer device
US11221877B2 (en) 2017-11-20 2022-01-11 Shanghai Cambricon Information Technology Co., Ltd Task parallel processing method, apparatus and system, storage medium and computer device
US11360811B2 (en) 2017-11-20 2022-06-14 Shanghai Cambricon Information Technology Co., Ltd Task parallel processing method, apparatus and system, storage medium and computer device
US11113103B2 (en) 2017-11-20 2021-09-07 Shanghai Cambricon Information Technology Co., Ltd Task parallel processing method, apparatus and system, storage medium and computer device
CN111582464B (en) * 2017-12-29 2023-09-29 中科寒武纪科技股份有限公司 Neural network processing method, computer system and storage medium
CN111582464A (en) * 2017-12-29 2020-08-25 中科寒武纪科技股份有限公司 Neural network processing method, computer system, and storage medium
KR20200100528A (en) * 2017-12-29 2020-08-26 캠브리콘 테크놀로지스 코퍼레이션 리미티드 Neural network processing method, computer system and storage medium
CN109993288A (en) * 2017-12-29 2019-07-09 北京中科寒武纪科技有限公司 Processing with Neural Network method, computer system and storage medium
KR102720330B1 (en) 2017-12-29 2024-10-22 캠브리콘 테크놀로지스 코퍼레이션 리미티드 Neural network processing method, computer system and storage medium
EP3629251A4 (en) * 2017-12-29 2020-11-25 Cambricon Technologies Corporation Limited PROCESSING METHODS FOR NEURONAL NETWORK, COMPUTER SYSTEM AND STORAGE MEDIUM
CN108563808A (en) * 2018-01-05 2018-09-21 中国科学技术大学 The design method of heterogeneous reconfigurable figure computation accelerator system based on FPGA
CN108563808B (en) * 2018-01-05 2020-12-04 中国科学技术大学 Design Method of Heterogeneous Reconfigurable Graph Computation Accelerator System Based on FPGA
CN108388943A (en) * 2018-01-08 2018-08-10 中国科学院计算技术研究所 Pooling device and method suitable for neural networks
CN108388943B (en) * 2018-01-08 2020-12-29 中国科学院计算技术研究所 A pooling device and method suitable for neural networks
CN108154229B (en) * 2018-01-10 2022-04-08 西安电子科技大学 Image processing method based on FPGA accelerated convolutional neural network framework
CN108154229A (en) * 2018-01-10 2018-06-12 西安电子科技大学 Image processing method based on FPGA-accelerated convolutional neural network framework
CN108389183A (en) * 2018-01-24 2018-08-10 上海交通大学 Pulmonary nodule detection neural network accelerator and control method thereof
CN110097179B (en) * 2018-01-29 2020-03-10 上海寒武纪信息科技有限公司 Computer device, data processing method, and storage medium
CN110097179A (en) * 2018-01-29 2019-08-06 上海寒武纪信息科技有限公司 Computer equipment, data processing method and storage medium
CN110097180A (en) * 2018-01-29 2019-08-06 上海寒武纪信息科技有限公司 Computer equipment, data processing method and storage medium
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
CN111868754A (en) * 2018-03-23 2020-10-30 索尼公司 Information processing apparatus and information processing method
CN108921289B (en) * 2018-06-20 2021-10-29 郑州云海信息技术有限公司 FPGA heterogeneous acceleration method, device and system
US11734562B2 (en) 2018-06-20 2023-08-22 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
CN108921289A (en) * 2018-06-20 2018-11-30 郑州云海信息技术有限公司 FPGA heterogeneous acceleration method, apparatus and system
US12079723B2 (en) 2018-07-26 2024-09-03 Tesla, Inc. Optimizing neural network structures for embedded systems
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
US11983630B2 (en) 2018-09-03 2024-05-14 Tesla, Inc. Neural networks for embedded devices
CN110955380A (en) * 2018-09-21 2020-04-03 中科寒武纪科技股份有限公司 Access data generation method, storage medium, computer device and apparatus
CN111079914B (en) * 2018-10-19 2021-02-09 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079907A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, device and related product
WO2020078446A1 (en) * 2018-10-19 2020-04-23 中科寒武纪科技股份有限公司 Computation method and apparatus, and related product
CN111079909A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079914A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079924A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111078293A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, device and related product
CN111078293B (en) * 2018-10-19 2021-03-16 中科寒武纪科技股份有限公司 Operation method, device and related product
CN111079912A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079910A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, device and related product
CN111079911A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079916A (en) * 2018-10-19 2020-04-28 中科寒武纪科技股份有限公司 Operation method, system and related product
US11665108B2 (en) 2018-10-25 2023-05-30 Tesla, Inc. QoS manager for system on a chip communications
CN111144561A (en) * 2018-11-05 2020-05-12 杭州海康威视数字技术股份有限公司 Neural network model determining method and device
CN111144561B (en) * 2018-11-05 2023-05-02 杭州海康威视数字技术股份有限公司 Neural network model determining method and device
CN112912837B (en) * 2018-11-08 2024-02-13 北京比特大陆科技有限公司 Neural network compiling method, device, equipment, storage medium and program product
CN112912837A (en) * 2018-11-08 2021-06-04 北京比特大陆科技有限公司 Neural network compiling method, device, equipment, storage medium and program product
WO2020093885A1 (en) * 2018-11-09 2020-05-14 北京灵汐科技有限公司 Heterogeneous collaborative computing system
US11908171B2 (en) 2018-12-04 2024-02-20 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11681899B2 (en) 2018-12-07 2023-06-20 Samsung Electronics Co., Ltd. Dividing neural networks
CN111325311B (en) * 2018-12-14 2024-03-29 深圳云天励飞技术有限公司 Neural network model generation method for image recognition and related equipment
CN111325311A (en) * 2018-12-14 2020-06-23 深圳云天励飞技术有限公司 Neural network model generation method, device, electronic device and storage medium
CN109726797A (en) * 2018-12-21 2019-05-07 北京中科寒武纪科技有限公司 Data processing method, device, computer system and storage medium
CN109685203A (en) * 2018-12-21 2019-04-26 北京中科寒武纪科技有限公司 Data processing method, device, computer system and storage medium
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US12136030B2 (en) 2018-12-27 2024-11-05 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
CN109754073B (en) * 2018-12-29 2020-03-10 中科寒武纪科技股份有限公司 Data processing method and device, electronic equipment and readable storage medium
CN109754084A (en) * 2018-12-29 2019-05-14 北京中科寒武纪科技有限公司 Network structure processing method, device and related products
CN109754073A (en) * 2018-12-29 2019-05-14 北京中科寒武纪科技有限公司 Data processing method and device, electronic device and readable storage medium
US11748620B2 (en) 2019-02-01 2023-09-05 Tesla, Inc. Generating ground truth for machine learning from time series elements
US12014553B2 (en) 2019-02-01 2024-06-18 Tesla, Inc. Predicting three-dimensional features for autonomous driving
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
CN109978160A (en) * 2019-03-25 2019-07-05 北京中科寒武纪科技有限公司 Configuration device and method for artificial intelligence processor, and related products
CN109739802A (en) * 2019-04-01 2019-05-10 上海燧原智能科技有限公司 Computing cluster and computing cluster configuration method
US11734577B2 (en) 2019-06-05 2023-08-22 Samsung Electronics Co., Ltd Electronic apparatus and method of performing operations thereof
CN112052943A (en) * 2019-06-05 2020-12-08 三星电子株式会社 Electronic device and method for performing operation of the same
WO2020246724A1 (en) * 2019-06-05 2020-12-10 Samsung Electronics Co., Ltd. Electronic apparatus and method of performing operations thereof
CN112132271A (en) * 2019-06-25 2020-12-25 Oppo广东移动通信有限公司 Neural network accelerator operation method, architecture and related device
CN115462079A (en) * 2019-08-13 2022-12-09 深圳鲲云信息科技有限公司 Neural network data stream acceleration method and device, computer equipment and storage medium
CN111126572B (en) * 2019-12-26 2023-12-08 北京奇艺世纪科技有限公司 Model parameter processing method and device, electronic equipment and storage medium
CN111126572A (en) * 2019-12-26 2020-05-08 北京奇艺世纪科技有限公司 Model parameter processing method and device, electronic equipment and storage medium
CN111339027A (en) * 2020-02-25 2020-06-26 中国科学院苏州纳米技术与纳米仿生研究所 Automatic design method of reconfigurable artificial intelligence core and heterogeneous multi-core chip
CN111339027B (en) * 2020-02-25 2023-11-28 中国科学院苏州纳米技术与纳米仿生研究所 Reconfigurable artificial intelligence core and automatic design method for heterogeneous multi-core chips
CN111488969B (en) * 2020-04-03 2024-01-19 北京集朗半导体科技有限公司 Execution optimization method and device based on neural network accelerator
CN111488969A (en) * 2020-04-03 2020-08-04 北京思朗科技有限责任公司 Execution optimization method and device based on neural network accelerator
CN111949405A (en) * 2020-08-13 2020-11-17 Oppo广东移动通信有限公司 Resource scheduling method, hardware accelerator and electronic device
CN111931926A (en) * 2020-10-12 2020-11-13 南京风兴科技有限公司 Hardware acceleration system and control method for convolutional neural network CNN

Also Published As

Publication number Publication date
CN107103113B (en) 2019-01-11
WO2018171717A1 (en) 2018-09-27

Similar Documents

Publication Publication Date Title
CN107103113A (en) Automated design method, device and optimization method for neural network processors
CN107016175B (en) Automated design method, apparatus and optimization method applicable to neural network processors
EP3884435A1 (en) System and method for automated precision configuration for deep neural networks
CN112070202B (en) Fusion graph generation method and device and computer readable storage medium
CN114035916B (en) Compilation and scheduling methods of computational graphs and related products
CN108932135A (en) FPGA-based acceleration platform design method for sorting algorithms
CN111563582A (en) Method for implementing and optimizing an accelerated convolutional neural network on FPGA
JP6503072B2 (en) Semiconductor system and calculation method
CN116126341A (en) Model compiling method, device, computer equipment and computer readable storage medium
Xu et al. FCLNN: A flexible framework for fast CNN prototyping on FPGA with OpenCL and caffe
CN115345285B (en) GPU-based temporal graph neural network training method and system, and electronic device
CN114968362B (en) Heterogeneous fusion computing instruction set and method of use
CN105700933A (en) Parallelization and loop optimization method and system for high-level languages on reconfigurable processors
CN104239630B (en) Simulation scheduling system supporting design for test
CN111667060B (en) Deep learning algorithm compiling method and device and related products
WO2023030507A1 (en) Compilation optimization method and apparatus, computer device and storage medium
CN116402091A (en) Hybrid-engine intelligent computing method and device for artificial intelligence chips
Ali et al. RISC-V based MPSoC design exploration for FPGAs: area, power and performance
Odetola et al. 2l-3w: 2-level 3-way hardware–software co-verification for the mapping of convolutional neural network (cnn) onto fpga boards
CN114127681B (en) Method and apparatus for autonomous acceleration of data stream AI applications
CN105893660B (en) CPU design method and computing system for symbolic BDD operations
CN115858092A (en) Timing simulation method, device and system
CN111143208B (en) Processor-technology-based verification method for assisting FPGA implementation of AI algorithms
CN114691457A (en) A method, apparatus, storage medium and electronic device for determining hardware performance
Kuga et al. Streaming Accelerator Design for Regular Expression on CPU+FPGA Embedded System

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant