CN112445538B - Configuration loading system and method for reconfigurable processor - Google Patents

Info

Publication number
CN112445538B
Authority
CN
China
Prior art keywords
configuration
pea
controller
packet
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011472218.9A
Other languages
Chinese (zh)
Other versions
CN112445538A (en
Inventor
尹首一
谢思敏
谷江源
钟鸣
罗列
张淞
韩慧明
刘雷波
魏少军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202011472218.9A
Publication of CN112445538A
Application granted
Publication of CN112445538B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/4401 Bootstrapping
    • G06F9/4403 Processor initialisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The present invention provides a configuration loading system and method for a reconfigurable processor. The system includes a configuration controller, a PEA controller, and PEs. The configuration controller obtains the length of the configuration data required by a configuration task of the PEA, together with multiple configuration addresses; fetches configuration packets and sends them to the PEA controller until the number of packets fetched equals the length of the configuration data; and checks whether a PEA_CP_Finish signal has been received, and if so, prefetches the configuration data of the next configuration task. The PEA controller parses the top-level configuration information from each configuration packet, determines the corresponding PE, and sends the packet to it; after receiving the PE_CP_Finish signals sent by all PEs of the current configuration task, it sends the PEA_CP_Finish signal. Each PE sends a PE_CP_Finish signal each time it finishes executing a configuration packet. The invention can load the configuration of a reconfigurable processor with low delay.

Description

Configuration loading system and method for reconfigurable processor
Technical Field
The invention relates to the technical field of computer hardware, in particular to a configuration loading system and a configuration loading method for a reconfigurable processor.
Background
A reconfigurable processor is a novel processor architecture whose hardware functions can be configured dynamically. It is more general-purpose than a special-purpose processor, but this generality comes at the cost of increased configuration delay.
Therefore, a configuration loading scheme for reconfigurable processors with lower delay is currently needed.
Disclosure of Invention
An embodiment of the invention provides a configuration loading system for a reconfigurable processor, which loads the configuration of the reconfigurable processor with low delay. The system comprises:
a configuration controller and a processing element array (PEA), wherein the PEA comprises a PEA controller and a plurality of processing elements (PEs);
the configuration controller is configured to: obtain the length of the configuration data required by a configuration task of the PEA and a plurality of configuration addresses; fetch a plurality of configuration packets according to the configuration addresses of the configuration data and send them to the PEA controller until the number of configuration packets fetched so far equals the length of the configuration data; and determine whether a PEA_CP_Finish signal sent by the PEA controller has been received and, if so, prefetch the configuration data of the next configuration task by repeating the above steps; wherein the configuration data comprises a plurality of configuration packets, each configuration packet corresponds to one configuration address, and the length of the configuration data is the number of configuration packets in the configuration data; and the number of addresses in each configuration packet is no more than half of the number of pieces of configuration information in each configuration packet;
the PEA controller is configured to: parse the top-level configuration information from each configuration packet, determine the PE corresponding to each configuration packet based on the top-level configuration information, and send the configuration packet to the corresponding PE; and, after receiving the PE_CP_Finish signals sent by all PEs involved in the current configuration task, send a PEA_CP_Finish signal to the configuration controller;
and each PE is configured to send a PE_CP_Finish signal to the PEA controller each time it finishes executing a configuration packet.
An embodiment of the invention provides a configuration loading method for a reconfigurable processor, which loads the configuration of the reconfigurable processor with low delay. The method comprises:
obtaining the length of the configuration data required by a configuration task of the PEA and a plurality of configuration addresses; wherein the configuration data comprises a plurality of configuration packets, each configuration packet corresponds to one configuration address, and the length of the configuration data is the number of configuration packets in the configuration data; and the number of addresses in each configuration packet is no more than half of the number of pieces of configuration information in each configuration packet;
fetching a plurality of configuration packets according to the configuration addresses of the configuration data and sending them to the PEA controller of a processing element array (PEA) until the number of configuration packets fetched so far equals the length of the configuration data; wherein the PEA controller parses the top-level configuration information from each configuration packet, determines the processing element (PE) corresponding to each configuration packet based on the top-level configuration information, sends the configuration packet to the corresponding PE, and, after receiving the PE_CP_Finish signals sent by all PEs involved in the current configuration task, sends a PEA_CP_Finish signal to the configuration controller;
and determining whether a PEA_CP_Finish signal sent by the PEA controller has been received and, if so, prefetching the configuration data of the next configuration task by repeating the above steps.
An embodiment of the invention also provides a computer device comprising a memory, a processor, and a computer program that is stored in the memory and can run on the processor, wherein the processor implements the above configuration loading method for a reconfigurable processor when executing the computer program.
An embodiment of the invention also provides a computer-readable storage medium that stores a computer program for executing the above configuration loading method for a reconfigurable processor.
In an embodiment of the present invention, the configuration loading system of a reconfigurable processor includes a configuration controller and a processing element array (PEA), wherein the PEA comprises a PEA controller and a plurality of processing elements (PEs). The configuration controller obtains the length of the configuration data required by a configuration task of the PEA and a plurality of configuration addresses; fetches a plurality of configuration packets according to the configuration addresses of the configuration data and sends them to the PEA controller until the number of configuration packets fetched so far equals the length of the configuration data; and determines whether a PEA_CP_Finish signal sent by the PEA controller has been received and, if so, prefetches the configuration data of the next configuration task by repeating the above steps. The configuration data comprises a plurality of configuration packets, each configuration packet corresponds to one configuration address, and the length of the configuration data is the number of configuration packets in the configuration data. The PEA controller parses the top-level configuration information from each configuration packet, determines the PE corresponding to each configuration packet based on the top-level configuration information, and sends the configuration packet to the corresponding PE; after receiving the PE_CP_Finish signals sent by all PEs involved in the current configuration task, it sends a PEA_CP_Finish signal to the configuration controller. Each PE sends a PE_CP_Finish signal to the PEA controller each time it finishes executing a configuration packet.
In this system, the configuration controller determines whether the PEA_CP_Finish signal sent by the PEA controller has been received and, if so, prefetches the configuration data of the next configuration task. This realizes prefetching and dynamic switching of configuration data and reduces the configuration time of the PEA, that is, it achieves lower delay.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic diagram of a configuration loading system for a reconfigurable processor according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the configuration loading system of a reconfigurable processor performing configuration loading according to an embodiment of the present invention;
FIG. 3 is a flowchart of prefetching configuration packets according to an embodiment of the present invention;
FIG. 4 is a flowchart of a PE executing a configuration packet according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating the relationship between reading and executing configuration packets when prefetching is not performed according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating the relationship between reading and executing configuration packets when prefetching is performed according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating configuration packets being read from the configuration memory in a ping-pong manner according to an embodiment of the present invention;
FIG. 8 is a flowchart of a configuration loading method for a reconfigurable processor according to an embodiment of the present invention;
FIG. 9 is a diagram of a computer device in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
In the description of the present specification, the terms "comprising," "including," "having," "containing," and the like are used in an open-ended fashion, i.e., to mean including, but not limited to. Reference to the description of the terms "one embodiment," "a particular embodiment," "some embodiments," "for example," etc., means that a particular feature, structure, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. The sequence of steps involved in the embodiments is for illustrative purposes to illustrate the implementation of the present application, and the sequence of steps is not limited and can be adjusted as needed.
The invention discloses a configuration loading scheme for a reconfigurable processor that supports prefetching and dynamic switching. It adopts distributed configuration storage to ensure the integrity of the configuration and the parallelism of instruction fetch and execution, and to reduce the resource consumption and delay of the overall architecture.
FIG. 1 is a schematic diagram of a configuration loading system of a reconfigurable processor according to an embodiment of the present invention. As shown in FIG. 1, the system includes:
a configuration controller 11 and a processing element array PEA 12, wherein the PEA 12 includes a PEA controller 121 and a plurality of processing elements PE 122;
the configuration controller 11 is configured to: obtain the length of the configuration data required by a configuration task of the PEA and a plurality of configuration addresses; fetch a plurality of configuration packets according to the configuration addresses of the configuration data and send them to the PEA controller 121 until the number of configuration packets fetched so far equals the length of the configuration data; and determine whether a PEA_CP_Finish signal sent by the PEA controller 121 has been received and, if so, prefetch the configuration data of the next configuration task by repeating the above steps; wherein the configuration data comprises a plurality of configuration packets, each configuration packet corresponds to one configuration address, and the length of the configuration data is the number of configuration packets in the configuration data;
the PEA controller 121 is configured to: parse the top-level configuration information from each configuration packet, determine the PE corresponding to each configuration packet based on the top-level configuration information, and send the configuration packet to the corresponding PE 122; and, after receiving the PE_CP_Finish signals sent by all PEs 122 involved in the current configuration task, send a PEA_CP_Finish signal to the configuration controller;
each PE 122 is configured to send a PE_CP_Finish signal to the PEA controller 121 each time it finishes executing a configuration packet.
Therefore, in the embodiment of the present invention, the configuration controller determines whether the PEA_CP_Finish signal sent by the PEA controller has been received and, if so, prefetches the configuration data of the next configuration task. This realizes prefetching and dynamic switching of configuration data and reduces the configuration time of the PEA, that is, it achieves lower delay.
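The aggregation of per-PE completion signals described above can be illustrated with a minimal Python sketch. It is a behavioral model only, not the hardware design of the embodiment: the class name `PEAController`, the method `on_pe_cp_finish`, and the use of integer PE indices are assumptions introduced for illustration.

```python
class PEAController:
    """Hypothetical behavioral model: collects PE_CP_Finish and raises PEA_CP_Finish."""

    def __init__(self, involved_pes):
        # PEs that take part in the current configuration task
        self.involved = set(involved_pes)
        # PEs that have already reported PE_CP_Finish
        self.finished = set()

    def on_pe_cp_finish(self, pe_index):
        """Record one PE_CP_Finish; return True when PEA_CP_Finish should be sent."""
        if pe_index not in self.involved:
            raise ValueError(f"PE {pe_index} is not part of the current task")
        self.finished.add(pe_index)
        if self.finished == self.involved:
            # All involved PEs have reported: reset and signal the configuration controller
            self.finished.clear()
            return True
        return False


ctrl = PEAController([0, 3, 5])
signals = [ctrl.on_pe_cp_finish(i) for i in (0, 3, 5)]
```

In this model, `signals` ends with the single `True` that stands for PEA_CP_Finish; the configuration controller would treat it as permission to prefetch the next task's configuration data.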
In a specific implementation, FIG. 2 is a schematic diagram of the configuration loading system of a reconfigurable processor performing configuration loading in an embodiment of the present invention, and FIG. 3 is a flowchart of prefetching configuration packets in an embodiment of the present invention. In an embodiment, the configuration controller (Context Controller) is specifically configured to: communicate with the master controller (generally a RISC-V processor core) through a coprocessor interface (Coprocessor Interface), obtain the length CP_Len of the configuration data required by a configuration task of the PEA and a plurality of configuration addresses CP_Addr, and return an acknowledgement signal CP_Ld_Ok to the coprocessor interface.
In an embodiment, the configuration controller is specifically configured to: send the configuration address CP_Addr corresponding to each configuration packet CP_Data, in turn, to the level-1 configuration memory (L1 Cache) through an AHB-Master bus and an AHB-64 bus, and obtain each configuration packet CP_Data. The configuration controller then sends each configuration packet CP_Data to the PEA controller, and the PEA is dynamically configured in units of per-PE configuration packets.
It should be noted that when the configuration controller accesses the L1 Cache, it sends one configuration address at a time and obtains the configuration packet corresponding to that address; it then sends the next configuration address, and so on, until the number of configuration packets obtained so far equals the length of the configuration data.
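The one-address-per-access fetch loop can be sketched as follows. This is a simplified software model under stated assumptions: the L1 Cache is represented by a plain dictionary, the bus transfer is reduced to a lookup, and the function name `fetch_packets` is hypothetical. Only `CP_Len`, `CP_Addr`, and `CP_Data` come from the description.

```python
def fetch_packets(cp_len, cp_addrs, l1_cache):
    """Fetch configuration packets one address at a time until cp_len packets are obtained."""
    if len(cp_addrs) < cp_len:
        raise ValueError("fewer configuration addresses than CP_Len")
    packets = []
    # Stop condition from the description: packets obtained == length of configuration data
    while len(packets) < cp_len:
        cp_addr = cp_addrs[len(packets)]   # next CP_Addr to send
        cp_data = l1_cache[cp_addr]        # stands in for one bus read returning CP_Data
        packets.append(cp_data)
    return packets


# Hypothetical cache contents, used only to exercise the loop
l1 = {0x100: "CP_A", 0x140: "CP_B", 0x180: "CP_C"}
fetched = fetch_packets(3, [0x100, 0x140, 0x180], l1)
```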
After receiving each configuration packet, the PEA controller parses the top-level configuration information, specifically the PE_Index, from the packet, determines the PE corresponding to the packet based on the top-level configuration information (i.e., the PE_Index), and sends the configuration packet to that PE. It follows that, for a given configuration task, not every PE necessarily receives a configuration packet. For example, if a PEA includes 64 PEs, only some of them may receive configuration packets; among those, some may receive one configuration packet while others receive several.
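A small sketch of this dispatch step is given below. The packet layout is not specified in the description, so the sketch assumes each packet is a dictionary with a `pe_index` field; that field name, the `body` field, and the function `dispatch` are illustrative assumptions.

```python
from collections import defaultdict


def dispatch(packets, num_pes=64):
    """Group configuration packets by the PE_Index parsed from their top-level information."""
    per_pe = defaultdict(list)
    for cp in packets:
        pe_index = cp["pe_index"]          # assumed location of PE_Index in the packet
        if not 0 <= pe_index < num_pes:
            raise ValueError(f"PE_Index {pe_index} out of range")
        per_pe[pe_index].append(cp)        # a PE may receive several packets
    return dict(per_pe)                    # PEs with no packet do not appear at all


task_packets = [
    {"pe_index": 2, "body": "a"},
    {"pe_index": 7, "body": "b"},
    {"pe_index": 2, "body": "c"},
]
routed = dispatch(task_packets)
```

Here PE 2 receives two packets, PE 7 receives one, and the other 62 PEs receive none, which matches the situation described in the paragraph above.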
Compared with the static acquisition mode used in FPGA configuration (where each new configuration requires a power-on reset and a fresh download), in the flow of FIG. 3 the processing element array PEA (generally, one PEA has 64 PEs) can, after obtaining the configuration data (multiple configuration packets) required by one configuration task, prefetch the configuration data required by the next configuration task. This realizes dynamic switching: configuration information does not need to be downloaded again after a power-on reset. Each PE in the PEA can also obtain the next configuration packet while executing the current one, that is, while performing computation, which improves configuration efficiency.
FIG. 4 is a flowchart of a PE executing a configuration packet according to an embodiment of the present invention. In an embodiment, the PE includes a PE controller and a configuration memory. The PE controller is configured to send each configuration packet to the configuration memory after receiving it, execute the configuration packets in sequence, and send a PE_CP_Finish signal to the PEA controller each time it finishes executing a configuration packet. The configuration memory is configured to store the configuration packets. The configuration memory is a dual-port memory: one port supports read operations and the other supports write operations.
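The store-then-execute behavior of a single PE can be modeled roughly as below. This is a sketch under assumptions: the configuration memory depth of 16 is taken from the 16 × 64 example used elsewhere in the description, while the class `ProcessingElement`, its methods, and the representation of configuration information as strings are hypothetical.

```python
class ProcessingElement:
    """Hypothetical model of a PE with a 16-entry dual-port configuration memory."""

    CM_DEPTH = 16

    def __init__(self):
        self.cm = [None] * self.CM_DEPTH   # configuration memory contents
        self.signals = []                  # signals sent to the PEA controller

    def store(self, base, words):
        """Write port: place a packet's configuration information starting at base."""
        if base < 0 or base + len(words) > self.CM_DEPTH:
            raise ValueError("packet does not fit in the configuration memory")
        self.cm[base:base + len(words)] = list(words)

    def execute(self, base, count):
        """Read port: read count entries from base, then report PE_CP_Finish."""
        executed = [self.cm[base + i] for i in range(count)]
        self.signals.append("PE_CP_Finish")
        return executed


pe = ProcessingElement()
pe.store(0, ["top", "op1", "op2"])
ran = pe.execute(0, 3)
```

Keeping the write path (`store`) and the read path (`execute`) separate mirrors the two ports of the configuration memory, which is what allows a new packet to be written while another is being read.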
After receiving the PE_CP_Finish signals sent by all PEs 122 involved in the current configuration task, the PEA controller sends the PEA_CP_Finish signal to the configuration controller.
In one embodiment, the PE controller is further configured to: after executing all of its configuration packets, send a PE_Task_Finish signal to the PEA controller;
the PEA controller is further configured to: after receiving the PE_Task_Finish signals sent by the PE controllers of all PEs involved in the current configuration task, send a PEA_Task_Finish signal to the configuration controller;
the configuration controller is further configured to: after receiving the PEA_Task_Finish signal, stop prefetching the configuration data of the next configuration task.
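How the two array-level signals steer prefetching can be summarized in a tiny state model. The class and method names are assumptions made for this sketch; only the signal names PEA_CP_Finish and PEA_Task_Finish come from the description.

```python
class PrefetchControl:
    """Hypothetical model of the configuration controller's prefetch enable."""

    def __init__(self):
        self.prefetching = False

    def on_signal(self, name):
        """Update the prefetch state for one signal from the PEA controller."""
        if name == "PEA_CP_Finish":
            # Current packets are done: start prefetching the next task's configuration data
            self.prefetching = True
        elif name == "PEA_Task_Finish":
            # The whole task is done: stop prefetching
            self.prefetching = False
        else:
            raise ValueError(f"unknown signal {name!r}")
        return self.prefetching


control = PrefetchControl()
trace = [control.on_signal(s) for s in ("PEA_CP_Finish", "PEA_CP_Finish", "PEA_Task_Finish")]
```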
In an embodiment, the PE controller is specifically configured to: when executing the configuration packets in sequence, read the configuration packets from the configuration memory in a ping-pong manner.
In one embodiment, the number of addresses in each configuration packet is no greater than half of the number of pieces of configuration information in each configuration packet.
In the above embodiment, the addresses in each configuration packet start at Addr0 or Addr8, with Addr0 as the usual default. Taking a configuration packet size of 16 × 64 as an example, a configuration packet stores at most 16 pieces of configuration information. Requiring that the number of addresses in each configuration packet be no more than half of the number of pieces of configuration information then means that each configuration packet may use at most 8 addresses. Only when this condition is satisfied can the configuration data required by each configuration task be prefetched.
Taking as an example a configuration packet that stores at most 16 pieces of configuration information, FIG. 5 shows the relationship between reading and executing configuration packets when prefetching is not performed, and FIG. 6 shows the same relationship when prefetching is performed. In FIG. 5, the number of addresses in a configuration packet is greater than 8 and less than or equal to 16, so prefetching and dynamic switching of configuration packets are not possible: a configuration packet must be read and then executed, then the next one read and executed, and so on. In FIG. 6, the number of addresses in a configuration packet is less than or equal to 8, so prefetching and dynamic switching are possible: the next configuration packet is prefetched while the current one is being executed, which hides the read time of the configuration packet.
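The difference between the two cases can be made concrete with a simple timing model. It assumes, purely for illustration, that every packet takes the same time to read and the same time to execute; the description gives no timing figures, so the numbers below are hypothetical, as are the function names.

```python
CM_DEPTH = 16  # entries in a PE's configuration memory, from the 16 x 64 example


def can_prefetch(num_addresses, cm_depth=CM_DEPTH):
    """Prefetching requires a packet to fit in half of the configuration memory."""
    return num_addresses <= cm_depth // 2


def total_time(num_packets, read_time, exec_time, num_addresses):
    """Time for one PE to read and execute num_packets packets under the uniform-time assumption."""
    if num_packets == 0:
        return 0
    if not can_prefetch(num_addresses):
        # Without prefetching: read, execute, read, execute, ...
        return num_packets * (read_time + exec_time)
    # With prefetching: the read of packet i+1 overlaps the execution of packet i
    overlap = max(read_time, exec_time)
    return read_time + (num_packets - 1) * overlap + exec_time


serial = total_time(4, read_time=8, exec_time=10, num_addresses=16)
overlapped = total_time(4, read_time=8, exec_time=10, num_addresses=8)
```

With these illustrative numbers the serial schedule takes 72 time units and the overlapped one takes 48; only the first read remains exposed, and every later read is hidden behind an execution.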
In addition, when the PE controller executes the configuration packets in sequence, it reads them from the configuration memory in a ping-pong manner. Taking as an example a configuration packet that stores at most 16 pieces of configuration information, FIG. 7 is a schematic diagram of reading configuration packets from the configuration memory in a ping-pong manner in an embodiment of the present invention. As shown in FIG. 7:
(1) When prefetching cannot be supported, the addresses in a configuration packet start from Addr0; they never start from Addr8, the first of the upper 8 addresses of the configuration memory. The 16 pieces of configuration information include the top-level configuration information, and each new configuration packet (CP) is by default read starting from address 0 of the configuration memory (CM) inside the PE.
(2) When prefetching is supported, the lower 8 addresses and the upper 8 addresses of the configuration memory CM are used in ping-pong fashion. A single processing element PE holds at most 2 different configuration packets (CPs), and a single configuration packet contains at most 8 pieces of configuration information, including the top-level configuration information. Ping-pong reading works as follows: if the first configuration packet CP1 occupies the lower 8 addresses Addr0-7 of the configuration memory, then while CP1 is being executed, the next configuration packet CP2 occupies addresses Addr8-15 of the configuration memory; and since CP2 occupies the upper 8 addresses Addr8-15, while CP2 is being executed the next configuration packet CP3 occupies addresses Addr0-7, and so on.
By effectively hiding the configuration time, the ping-pong reading mode can significantly improve the configuration speed of the whole PEA.
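The alternation between the two halves of the configuration memory can be written down in a few lines. This sketch only computes the base address each successive packet would use; the function name `pingpong_bases` and the constant `HALF` are assumptions for illustration.

```python
HALF = 8  # addresses per half of the 16-entry configuration memory


def pingpong_bases(num_packets, first_base=0):
    """Return the configuration-memory base address used by each successive packet."""
    if first_base not in (0, HALF):
        raise ValueError("a packet must start at Addr0 or Addr8")
    # Alternate between Addr0-7 and Addr8-15 from one packet to the next
    return [(first_base + i * HALF) % (2 * HALF) for i in range(num_packets)]


bases = pingpong_bases(4)            # CP1..CP4 starting in the lower half
bases_hi = pingpong_bases(3, HALF)   # CP1..CP3 starting in the upper half
```

For four packets starting at Addr0 this gives the bases 0, 8, 0, 8, so the half being executed and the half being filled are always different.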
In summary, the configuration loading system of a reconfigurable processor according to the embodiment of the present invention includes a configuration controller and a processing element array (PEA), wherein the PEA comprises a PEA controller and a plurality of processing elements (PEs). The configuration controller obtains the length of the configuration data required by a configuration task of the PEA and a plurality of configuration addresses; fetches a plurality of configuration packets according to the configuration addresses of the configuration data and sends them to the PEA controller until the number of configuration packets fetched so far equals the length of the configuration data; and determines whether a PEA_CP_Finish signal sent by the PEA controller has been received and, if so, prefetches the configuration data of the next configuration task by repeating the above steps. The configuration data comprises a plurality of configuration packets, each configuration packet corresponds to one configuration address, and the length of the configuration data is the number of configuration packets in the configuration data. The PEA controller parses the top-level configuration information from each configuration packet, determines the PE corresponding to each configuration packet based on the top-level configuration information, and sends the configuration packet to the corresponding PE; after receiving the PE_CP_Finish signals sent by all PEs involved in the current configuration task, it sends a PEA_CP_Finish signal to the configuration controller. Each PE sends a PE_CP_Finish signal to the PEA controller each time it finishes executing a configuration packet.
In this system, the configuration controller determines whether a PEA_CP_Finish signal sent by the PEA controller has been received and, if so, prefetches the configuration data of the next configuration task. This realizes prefetching and dynamic switching of configuration data and reduces the configuration time of the PEA, that is, it achieves lower delay. It also allows the "configuration-execution" pipeline to be realized efficiently, hiding the "instruction fetch" stage, that is, the stage of reading the configuration packets. When configuration prefetching is used, the distributed configuration memories of the PEs effectively hide the configuration time and can significantly improve the configuration speed of the whole PEA.
Based on the same inventive concept, the embodiment of the present invention further provides a configuration loading method for a reconfigurable processor, as described in the following embodiments. Because the principles of solving the problems are similar to those of a configuration loading system of a reconfigurable processor, the implementation of the method can be referred to the implementation of the system, and repeated parts are not described in detail.
FIG. 8 is a flowchart of a configuration loading method for a reconfigurable processor according to an embodiment of the present invention. As shown in FIG. 8, the method includes:
Step 801: obtaining the length of the configuration data required by a configuration task of the PEA and a plurality of configuration addresses; wherein the configuration data comprises a plurality of configuration packets, each configuration packet corresponds to one configuration address, and the length of the configuration data is the number of configuration packets in the configuration data;
Step 802: fetching a plurality of configuration packets according to the configuration addresses of the configuration data and sending them to the PEA controller of a processing element array (PEA) until the number of configuration packets fetched so far equals the length of the configuration data; wherein the PEA controller parses the top-level configuration information from each configuration packet, determines the processing element (PE) corresponding to each configuration packet based on the top-level configuration information, sends the configuration packet to the corresponding PE, and, after receiving the PE_CP_Finish signals sent by all PEs involved in the current configuration task, sends a PEA_CP_Finish signal to the configuration controller;
Step 803: determining whether the PEA_CP_Finish signal sent by the PEA controller has been received and, if so, prefetching the configuration data of the next configuration task by repeating the above steps.
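Steps 801 to 803 can be strung together in a short sequential model. It is deliberately simplified: real hardware overlaps fetching with execution, whereas this sketch runs the steps one after another to show only their order and the role of PEA_CP_Finish. The function `run_tasks`, the callback `pea_execute`, and the event log format are assumptions introduced here.

```python
def run_tasks(tasks, l1_cache, pea_execute):
    """Sequential sketch of steps 801-803 over a list of (cp_len, cp_addrs) tasks."""
    log = []
    for task_id, (cp_len, cp_addrs) in enumerate(tasks):
        # Step 801: the length and the configuration addresses of this task are known
        # Step 802: fetch exactly cp_len packets and hand them to the PEA controller
        packets = [l1_cache[addr] for addr in cp_addrs[:cp_len]]
        log.append(("sent", task_id, len(packets)))
        # Step 803: continue with the next task only after PEA_CP_Finish
        if not pea_execute(packets):
            break
        log.append(("PEA_CP_Finish", task_id))
    return log


cache = {0: "a", 1: "b", 2: "c"}
events = run_tasks([(2, [0, 1]), (1, [2])], cache, pea_execute=lambda pkts: True)
```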
In an embodiment, obtaining the length of the configuration data and the plurality of configuration addresses required by the configuration task of the PEA includes:
and communicating with the master control through the coprocessor interface, acquiring the length of configuration data and a plurality of configuration addresses required by the configuration task of the PEA, and returning a confirmation signal to the coprocessor interface.
In one embodiment, obtaining a plurality of configuration packets according to a plurality of configuration addresses of configuration data includes:
and sequentially sending the configuration address corresponding to each configuration packet to the primary configuration memory through an AHB-Master bus and an AHB-64 bus to obtain each configuration packet.
In one embodiment, the PE includes a PE controller and a configuration memory, wherein:
the PE controller is configured to send each configuration packet to the configuration memory after receiving it, to execute the configuration packets sequentially, and to send a PE_CP_Finish signal to the PEA controller each time it finishes executing a configuration packet;
the configuration memory is configured to store the configuration packets.
In one embodiment, the PE controller is further configured to: after executing all configuration packets in the PE controller, send a PE_Task_Finish signal to the PEA controller;
the PEA controller is further configured to: after receiving the PE_Task_Finish signals sent by the PE controllers of all PEs involved in the current configuration task, send a PEA_Task_Finish signal to the configuration controller;
the method further comprises: after receiving the PEA_Task_Finish signal, stopping prefetching the configuration data of the next configuration task.
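The aggregation of the PE-level signals into PEA-level signals can be sketched as follows. This is only an illustrative model: the class and method names are hypothetical, and the sketch assumes that PEA_CP_Finish is raised once every involved PE has reported a PE_CP_Finish since the last PEA_CP_Finish, which is one possible reading of the embodiment above.

```python
from typing import Iterable, Set


class PeaController:
    """Hypothetical model of the PEA controller's finish-signal bookkeeping."""

    def __init__(self, involved_pes: Iterable[int]):
        self.involved: Set[int] = set(involved_pes)  # PEs of the current task
        self.cp_seen: Set[int] = set()     # PEs that reported PE_CP_Finish
        self.task_seen: Set[int] = set()   # PEs that reported PE_Task_Finish

    def on_pe_cp_finish(self, pe_id: int) -> bool:
        """Record PE_CP_Finish; return True when PEA_CP_Finish should be sent."""
        if pe_id not in self.involved:
            return False                   # signal from an uninvolved PE
        self.cp_seen.add(pe_id)
        if self.cp_seen == self.involved:
            self.cp_seen.clear()           # assumption: start a new round
            return True
        return False

    def on_pe_task_finish(self, pe_id: int) -> bool:
        """Record PE_Task_Finish; return True when PEA_Task_Finish should be sent."""
        if pe_id not in self.involved:
            return False
        self.task_seen.add(pe_id)
        return self.task_seen == self.involved
```

A configuration controller receiving `True` from `on_pe_cp_finish` would begin prefetching the next task, and one receiving `True` from `on_pe_task_finish` would stop prefetching, as described in the embodiments above.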
In an embodiment, the PE controller is specifically configured to:
when executing the configuration packets sequentially, read the configuration packets from the configuration memory in a ping-pong manner.
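Ping-pong reading is commonly understood as alternating between two buffers, so that one buffer is filled while the other is being consumed. The following sketch shows that general technique under this assumption; the function name `ping_pong_read` and the two-bank layout are hypothetical and are not taken from this disclosure.

```python
from typing import Iterator, List, Optional, Tuple


def ping_pong_read(packets: List[str]) -> Iterator[Tuple[int, str]]:
    """Yield (bank, packet) pairs, alternating between two buffers."""
    banks: List[Optional[str]] = [None, None]
    active = 0
    if packets:
        banks[active] = packets[0]            # preload the first buffer
    for index in range(len(packets)):
        following = index + 1
        if following < len(packets):
            # Fill the idle buffer while the active one is being executed.
            banks[1 - active] = packets[following]
        yield active, banks[active]
        active = 1 - active                   # swap the roles of the buffers
```

For three packets the generator alternates banks 0, 1, 0, so the next packet is always staged in the idle bank before the current one has finished.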
In one embodiment, the number of addresses in each configuration packet is less than the number of pieces of configuration information in each configuration packet.
In summary, in the method provided in the embodiment of the present invention, the length of the configuration data and the plurality of configuration addresses required by the configuration task of the PEA are obtained, where the configuration data comprises a plurality of configuration packets, each configuration packet corresponds to one configuration address, and the length of the configuration data is the number of configuration packets in the configuration data. A plurality of configuration packets are obtained according to the plurality of configuration addresses of the configuration data and sent to the PEA controller of the processing unit array (PEA), until the number of configuration packets obtained so far equals the length of the configuration data. The PEA controller parses top-level configuration information from each configuration packet, determines the processing unit (PE) corresponding to each configuration packet based on the top-level configuration information, sends the configuration packet to the corresponding PE, and, after receiving the PE_CP_Finish signals sent by all PEs involved in the current configuration task, sends a PEA_CP_Finish signal to the configuration controller. It is then determined whether the PEA_CP_Finish signal sent by the PEA controller has been received and, if so, the configuration data of the next configuration task is prefetched by repeating the above steps. Prefetching and dynamic switching of configuration data are thereby realized, which reduces the configuration time of the PEA (that is, lowers latency), allows the "configuration-execution" pipeline to be implemented efficiently, and hides the "instruction fetch" stage.
When configuration prefetching is used, the distributed configuration memories of the PEs can effectively hide configuration time and significantly improve the configuration speed of the whole PEA.
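The effect of hiding configuration time can be illustrated with a simple two-stage timing model. The model below is an assumption introduced for illustration only: it supposes that the configuration of task i+1 fully overlaps the execution of task i, and the function names and cycle counts are hypothetical rather than measured values from this disclosure.

```python
from typing import List


def serial_time(config: List[int], execute: List[int]) -> int:
    """Total time when each task is configured, then executed, with no overlap."""
    return sum(config) + sum(execute)


def prefetch_time(config: List[int], execute: List[int]) -> int:
    """Total time when the next task is configured during the current execution."""
    if not config:
        return 0
    total = config[0]                         # the first configuration is exposed
    for i, run in enumerate(execute):
        next_config = config[i + 1] if i + 1 < len(config) else 0
        total += max(run, next_config)        # the longer of the two stages dominates
    return total
```

With three tasks of 4 configuration cycles and 10 execution cycles each, the serial model gives 42 cycles and the prefetching model gives 34, because only the first configuration remains visible. When a configuration takes longer than the preceding execution, only the excess is exposed.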
An embodiment of the present application further provides a computer device. FIG. 9 is a schematic diagram of the computer device in an embodiment of the present invention. The computer device is capable of implementing all steps of the configuration loading method of the reconfigurable processor in the foregoing embodiment, and specifically includes:
a processor 901, a memory 902, a communication interface 903, and a communication bus 904;
the processor 901, the memory 902 and the communication interface 903 communicate with one another through the communication bus 904; the communication interface 903 is used for transmitting information among related devices such as server-side devices, detection devices and user-side devices;
the processor 901 is configured to call the computer program in the memory 902, and when executing the computer program, the processor implements all the steps of the configuration loading method of the reconfigurable processor in the above embodiments.
An embodiment of the present application further provides a computer-readable storage medium capable of implementing all the steps of the configuration loading method of the reconfigurable processor in the above embodiment. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, all the steps of the configuration loading method of the reconfigurable processor in the above embodiment are implemented.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (9)

1. A configuration loading system for a reconfigurable processor, characterized by comprising: a configuration controller and a processing unit array (PEA), wherein the PEA comprises a PEA controller and a plurality of processing units (PEs);
the configuration controller is configured to: obtain the length of the configuration data and a plurality of configuration addresses required by a configuration task of the PEA; obtain a plurality of configuration packets according to the plurality of configuration addresses of the configuration data and send them to the PEA controller, until the number of configuration packets obtained so far equals the length of the configuration data; and determine whether a PEA_CP_Finish signal sent by the PEA controller has been received and, if so, prefetch the configuration data of the next configuration task by repeating the above steps; wherein the configuration data comprises a plurality of configuration packets, each configuration packet corresponds to one configuration address, and the length of the configuration data is the number of configuration packets in the configuration data; and the number of addresses in each configuration packet is not greater than half the number of pieces of configuration information in each configuration packet;
the PEA controller is configured to: parse top-level configuration information from each configuration packet, determine the PE corresponding to each configuration packet based on the top-level configuration information, and send the configuration packet to the corresponding PE; and, after receiving the PE_CP_Finish signals sent by all PEs involved in the current configuration task, send the PEA_CP_Finish signal to the configuration controller;
each PE is configured to send one PE_CP_Finish signal to the PEA controller each time it finishes executing a configuration packet.
2. The configuration loading system for a reconfigurable processor according to claim 1, characterized in that the configuration controller is specifically configured to:
communicate with a master controller through a coprocessor interface, obtain the length of the configuration data and the plurality of configuration addresses required by the configuration task of the PEA, and return an acknowledgement signal to the coprocessor interface.
3. The configuration loading system for a reconfigurable processor according to claim 1, characterized in that the configuration controller is specifically configured to:
sequentially send the configuration address corresponding to each configuration packet to a primary configuration memory through an AHB-Master bus and an AHB-64 bus to obtain each configuration packet.
4. The configuration loading system for a reconfigurable processor according to claim 1, characterized in that the PE comprises a PE controller and a configuration memory, wherein:
the PE controller is configured to send each configuration packet to the configuration memory after receiving it, to execute the configuration packets sequentially, and to send one PE_CP_Finish signal to the PEA controller each time it finishes executing a configuration packet;
the configuration memory is configured to store the configuration packets.
5. The configuration loading system for a reconfigurable processor according to claim 4, characterized in that the PE controller is further configured to: after executing all configuration packets in the PE controller, send one PE_Task_Finish signal to the PEA controller;
the PEA controller is further configured to: after receiving the PE_Task_Finish signals sent by the PE controllers of all PEs involved in the current configuration task, send a PEA_Task_Finish signal to the configuration controller;
the configuration controller is further configured to: after receiving the PEA_Task_Finish signal, stop prefetching the configuration data of the next configuration task.
6. The configuration loading system for a reconfigurable processor according to claim 4, characterized in that the PE controller is specifically configured to:
when executing the configuration packets sequentially, read the configuration packets from the configuration memory in a ping-pong manner.
7. A configuration loading method for a reconfigurable processor, characterized by comprising:
obtaining the length of the configuration data and a plurality of configuration addresses required by a configuration task of a PEA; wherein the configuration data comprises a plurality of configuration packets, each configuration packet corresponds to one configuration address, and the length of the configuration data is the number of configuration packets in the configuration data; and the number of addresses in each configuration packet is not greater than half the number of pieces of configuration information in each configuration packet;
obtaining a plurality of configuration packets according to the plurality of configuration addresses of the configuration data and sending them to a PEA controller of the processing unit array (PEA), until the number of configuration packets obtained so far equals the length of the configuration data; wherein the PEA controller parses top-level configuration information from each configuration packet, determines the processing unit (PE) corresponding to each configuration packet based on the top-level configuration information, sends the configuration packet to the corresponding PE, and, after receiving the PE_CP_Finish signals sent by all PEs involved in the current configuration task, sends a PEA_CP_Finish signal to a configuration controller;
determining whether the PEA_CP_Finish signal sent by the PEA controller has been received and, if so, prefetching the configuration data of the next configuration task by repeating the above steps.
8. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method of claim 7 when executing the computer program.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the method of claim 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011472218.9A CN112445538B (en) 2020-12-15 2020-12-15 Configuration loading system and method for reconfigurable processor

Publications (2)

Publication Number Publication Date
CN112445538A CN112445538A (en) 2021-03-05
CN112445538B true CN112445538B (en) 2021-11-30

Family

ID=74739916

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011472218.9A Active CN112445538B (en) 2020-12-15 2020-12-15 Configuration loading system and method for reconfigurable processor

Country Status (1)

Country Link
CN (1) CN112445538B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113190497B (en) * 2021-04-09 2022-09-09 珠海一微半导体股份有限公司 Task processing method of reconfigurable processor and reconfigurable processor
CN115129393B (en) * 2022-07-06 2023-04-25 北京中科海芯科技有限公司 Application configuration determining method and device, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514140A (en) * 2013-08-05 2014-01-15 东南大学 Reconfiguration controller for massively transmitting configuration information in reconfigurable system
CN105653474A (en) * 2015-12-29 2016-06-08 东南大学—无锡集成电路技术研究所 Coarse-grained dynamic reconfigurable processor-oriented configuration cache controller

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7502920B2 (en) * 2000-10-03 2009-03-10 Intel Corporation Hierarchical storage architecture for reconfigurable logic configurations
JP4546775B2 (en) * 2004-06-30 2010-09-15 富士通株式会社 Reconfigurable circuit capable of time-division multiplex processing
KR101076869B1 (en) * 2010-03-16 2011-10-25 광운대학교 산학협력단 Memory centric communication apparatus in coarse grained reconfigurable array
CN102968390B (en) * 2012-12-13 2015-02-18 东南大学 Configuration information cache management method and system based on decoding analysis in advance

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CCP: Configuration Context based Prefetching to Improve Coarse-Grained Reconfigurable Array Performance;Chen Yang ET AL.;《2019 26th IEEE International Conference on Electronics, Circuits and Systems (ICECS)》;20200123;107-108 *
CDPM: Context-Directed Pattern Matching Prefetching to Improve Coarse-Grained Reconfigurable Array Performance;Liu LB ET AL.;《IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS》;20180630;第37卷(第6期);1171-1184 *
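Task scheduling technology for coarse-grained dynamically reconfigurable systems based on pre-configuration and configuration reuse;Dai Zibin et al.;《Journal of Electronics & Information Technology》;20190228(No. 06);1458-1565 *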

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant