CN113467834A - Vector data access method and system under riscv-v instruction set architecture - Google Patents

Info

Publication number
CN113467834A
Authority
CN
China
Prior art keywords
data path
vector data
riscv
vector
set architecture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110734736.1A
Other languages
Chinese (zh)
Inventor
姚慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Saifang Technology Co ltd
Original Assignee
Guangdong Saifang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-06-30
Filing date
2021-06-30
Publication date
2021-10-01
Application filed by Guangdong Saifang Technology Co ltd
Priority to CN202110734736.1A
Publication of CN113467834A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3867 Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

The invention relates to the technical field of processors, and in particular to a vector data path method and system under the riscv-v instruction set architecture, comprising an instruction issue unit ISS, a vector register file VRF, a reorder buffer unit ROB, and a vector non-memory-access instruction data path VUD. The vector data path is simple in design and relatively independent of the pipeline front end, so it is highly portable. One set of code corresponds to multiple sets of hardware, which makes the design general. VLEN is fully configurable, so application scenarios are flexible and the processor iteration cost is low. The vector data path executes in a pipelined manner, takes few execution cycles, and gives high processor performance; DLEN is configurable, so the design can serve processor design scenarios of different scales. The design is simple, bugs converge quickly, and the processor design cycle is short.

Description

Vector data path method and system under the riscv-v instruction set architecture
Technical Field
The invention relates to the technical field of processors, and in particular to a vector data path method and system under the riscv-v instruction set architecture.
Background
The prior art has the following defects:
1. The processing flow of the vector data path often depends on the front end of the pipeline. Once the processor architecture changes, the vector data path usually has to change with it, so independence and portability are poor;
2. One set of code can generate only one set of hardware, so generality is poor;
3. One VLEN parameter configuration corresponds to one vector data path structure, so the application scenario is narrow and the processor iteration cost is high;
4. When VLEN is large, the data path is generally either non-pipelined, with many execution cycles and low processor performance, or large in area, which is unfriendly to small-scale processors;
5. The design is complex, bugs converge slowly, and the processor design cycle is long.
The present application therefore targets the riscv-v instruction set and implements a vector data path whose (VLEN, DLEN) parameters are flexibly configurable. It provides a relatively independent vector data path that suits different processor design scenarios and has simple interfaces, which eases code porting and processor iteration.
Disclosure of Invention
To address the defects of the prior art, the invention discloses a vector data path method and system under the riscv-v instruction set architecture, which implement a vector data path for the riscv-v instruction set whose (VLEN, DLEN) parameters are flexibly configurable. The vector data path is relatively independent, suits different processor design scenarios, and has simple interfaces, which eases code porting and processor iteration.
The invention is realized by the following technical scheme:
In a first aspect, the invention discloses a vector data path method under the riscv-v instruction set architecture, which comprises the following steps:
S1: with the processor operating, instruction information is issued to the VUD through the instruction issue unit ISS;
S2: the VUD receives the instruction information sent from the ISS unit and reads the source operands from the VRF;
S3: at the REISSUE station, the instruction, after its registers have been read, is split into n uops, which are sent to VBOB in order for execution;
S4: the PACKAGE station collects the VBOB execution results, writes them back to the VRF, and reports the completion status to the ROB.
Further, in the method, the data widths of VRF, REISSUE and PACKAGE are all VLEN.
Further, VLEN is the vector register width.
Further, in the method, the data width of VBOB is DLEN.
Further, DLEN is the vector data path width.
Further, in the method, the PACKAGE station overlaps VBOB in timing.
Further, in the method, VUD is the vector non-memory-access instruction data path.
Furthermore, in the method, the vector data path executes in a pipelined manner and takes few execution cycles, and DLEN is configurable so that the design can serve processor design scenarios of different scales.
Furthermore, in the (VLEN, DLEN) configuration, VLEN is greater than or equal to DLEN. When VLEN = DLEN is configured, execution is pipelined and takes few cycles, which suits scenarios that pursue high performance; when DLEN < VLEN is configured, the data path area is small and the power consumption is low, which suits small-scale, low-power scenarios.
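The (VLEN, DLEN) constraint can be illustrated with a minimal Python sketch. The class name `VectorConfig`, the requirement that VLEN be a whole multiple of DLEN, and the relation n = VLEN / DLEN for the number of uops are assumptions made for illustration; the source states only that VLEN is greater than or equal to DLEN and that an instruction is split into n uops.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorConfig:
    """Hypothetical (VLEN, DLEN) parameter pair; widths are in bits."""
    vlen: int  # vector register width
    dlen: int  # vector data path width

    def __post_init__(self) -> None:
        if self.vlen <= 0 or self.dlen <= 0:
            raise ValueError("VLEN and DLEN must be positive")
        if self.dlen > self.vlen:
            raise ValueError("the source requires VLEN >= DLEN")
        if self.vlen % self.dlen != 0:
            # Assumption of this sketch, not stated in the source.
            raise ValueError("sketch assumes VLEN is a multiple of DLEN")

    @property
    def uops_per_instruction(self) -> int:
        # Assumption: one uop per DLEN-wide slice of a VLEN-wide register.
        return self.vlen // self.dlen


high_performance = VectorConfig(vlen=256, dlen=256)  # VLEN = DLEN
low_power = VectorConfig(vlen=256, dlen=64)          # DLEN < VLEN
print(high_performance.uops_per_instruction, low_power.uops_per_instruction)
```

Under these assumptions, narrowing DLEN trades data path width (and hence area) for more uops per instruction, which matches the trade-off described in the text.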
In a second aspect, the invention discloses a vector data path system under the riscv-v instruction set architecture. The system implements the vector data path method of the first aspect and comprises an instruction issue unit ISS, a vector register file VRF, a reorder buffer unit ROB, and a vector non-memory-access instruction data path VUD.
The invention has the following beneficial effects:
1. The vector data path is simple in design and relatively independent of the pipeline front end, so it is highly portable.
2. One set of code corresponds to multiple sets of hardware, so the design is general.
3. VLEN is fully configurable, so application scenarios are flexible and the processor iteration cost is low.
4. The vector data path executes in a pipelined manner, takes few execution cycles, and gives high processor performance; DLEN is configurable, so the design can serve processor design scenarios of different scales.
5. The design is simple, bugs converge quickly, and the processor design cycle is short.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a diagram of the steps of a vector data path method under the riscv-v instruction set architecture;
FIG. 2 is a schematic diagram of a vector data path system under the riscv-v instruction set architecture;
FIG. 3 is a schematic diagram of the VUD execution cycles in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
This embodiment discloses a vector data path method under the riscv-v instruction set architecture, shown in FIG. 1, which comprises the following steps:
S1: with the processor operating, instruction information is issued to the VUD through the instruction issue unit ISS;
S2: the VUD receives the instruction information sent from the ISS unit and reads the source operands from the VRF;
S3: at the REISSUE station, the instruction, after its registers have been read, is split into n uops, which are sent to VBOB in order for execution;
S4: the PACKAGE station collects the VBOB execution results, writes them back to the VRF, and reports the completion status to the ROB.
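Steps S3 and S4 can be sketched as a small functional model in Python. Everything below is an illustrative assumption rather than the patented implementation: registers are modeled as Python integers, the split is taken to be VLEN / DLEN slices issued lowest slice first, the operation is an element-wise add, and the function names and the element width parameter `sew` are hypothetical.

```python
def reissue(src1: int, src2: int, vlen: int, dlen: int) -> list[tuple[int, int]]:
    """Split two VLEN-bit operands into DLEN-bit uop slices, lowest slice first."""
    mask = (1 << dlen) - 1
    return [((src1 >> (i * dlen)) & mask, (src2 >> (i * dlen)) & mask)
            for i in range(vlen // dlen)]


def vbob_add(a: int, b: int, dlen: int, sew: int) -> int:
    """Add the SEW-bit lanes of one DLEN-bit slice; each lane wraps on overflow."""
    lane_mask = (1 << sew) - 1
    out = 0
    for lane in range(dlen // sew):
        shift = lane * sew
        lane_sum = ((a >> shift) & lane_mask) + ((b >> shift) & lane_mask)
        out |= (lane_sum & lane_mask) << shift
    return out


def package(slices: list[int], dlen: int) -> int:
    """Reassemble the DLEN-bit uop results into one VLEN-bit value."""
    result = 0
    for i, piece in enumerate(slices):
        result |= piece << (i * dlen)
    return result


def execute_vadd(src1: int, src2: int, vlen: int, dlen: int, sew: int) -> int:
    """Model of S3 and S4: split, execute each uop in order, collect."""
    uops = reissue(src1, src2, vlen, dlen)
    results = [vbob_add(a, b, dlen, sew) for a, b in uops]
    return package(results, dlen)


narrow = execute_vadd(0x01020304FF060708, 0x0101010101010101, vlen=64, dlen=32, sew=8)
wide = execute_vadd(0x01020304FF060708, 0x0101010101010101, vlen=64, dlen=64, sew=8)
print(hex(narrow), narrow == wide)
```

In this model the result does not depend on DLEN, only the number of uops does, which is one way to read the claim that a single code base can serve several hardware configurations.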
In this embodiment, the data widths of VRF, REISSUE, and PACKAGE are all VLEN, where VLEN is the vector register width.
In this embodiment, the data width of VBOB is DLEN, where DLEN is the vector data path width.
In this embodiment, the PACKAGE station overlaps VBOB in timing.
In this embodiment, VUD is the vector non-memory-access instruction data path.
In this embodiment, the vector data path executes in a pipelined manner and takes few execution cycles, and DLEN is configurable so that the design can serve processor design scenarios of different scales.
Example 2
This embodiment discloses a vector data path system under the riscv-v instruction set architecture, shown in FIG. 2, which comprises an instruction issue unit ISS, a vector register file VRF, a reorder buffer unit ROB, and a vector non-memory-access instruction data path VUD.
In this embodiment, ISS (Instruction Issue) is the instruction issue unit; VRF (Vector Register File) is the vector register file, in which a single register stores VLEN bits of data; ROB (Re-order Buffer) is the reorder buffer unit; and VUD (Vector Unit Datapath) is the vector non-memory-access instruction data path.
In this embodiment, VLEN (vector register length) is the vector register width, and DLEN (datapath length) refers here to the vector data path width. VRF, REISSUE, and PACKAGE have a data width of VLEN, and VBOB has a data width of DLEN.
In this embodiment, the VUD execution cycles are as shown in FIG. 3: VUD receives instruction information from the ISS unit and reads the source operands from the VRF.
In this embodiment, REISSUE splits the instruction into n uops and sends them to VBOB in order for execution; the PACKAGE station collects the results, writes them back to the VRF, and reports the completion status to the ROB.
In this embodiment, the PACKAGE station overlaps VBOB in timing and VUD is pipelined, so performance is higher. The interfaces between components are simple and the components are highly independent. The main functionality of the instructions is concentrated in the VBOB component, while the correctness of VLEN-related instruction behavior is ensured mainly by the other components. This structure lends itself to bottom-up verification: submodules can be verified in parallel during design, bugs converge quickly, and the processor design cycle is shortened.
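The effect of overlapping the PACKAGE station with VBOB can be sketched with a toy cycle model in Python. The model is an assumption made for illustration only: it supposes that REISSUE sends one uop per cycle, that VBOB has a fixed latency, and that a non-overlapped PACKAGE station would cost one extra cycle. None of these figures come from the source or from FIG. 3, and the function names are hypothetical.

```python
def uop_schedule(n_uops: int, vbob_latency: int = 1,
                 package_overlaps: bool = True) -> list[tuple[int, int]]:
    """Return (issue cycle, collect cycle) for each uop of one instruction."""
    schedule = []
    for i in range(n_uops):
        issue = i                                 # one uop enters VBOB per cycle
        last_vbob_cycle = issue + vbob_latency - 1
        # With overlap, PACKAGE captures the result in VBOB's final cycle;
        # without it, collection is assumed to take one additional cycle.
        collect = last_vbob_cycle if package_overlaps else last_vbob_cycle + 1
        schedule.append((issue, collect))
    return schedule


def writeback_cycle(n_uops: int, vbob_latency: int = 1,
                    package_overlaps: bool = True) -> int:
    """Cycle in which the last uop result is collected and can be written back."""
    return uop_schedule(n_uops, vbob_latency, package_overlaps)[-1][1]


for overlaps in (True, False):
    print(overlaps, uop_schedule(4, package_overlaps=overlaps))
```

Under these assumptions the overlap saves one cycle per instruction regardless of the number of uops, because collection is hidden behind execution.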
In this embodiment, therefore, VLEN is configurable, which matches the variable-length nature of vectors and gives high flexibility and generality. The REISSUE station splits the data before the instruction executes, execution proceeds at uop granularity, and the PACKAGE station collects the data at uop granularity, which gives control over issuing and collecting the data. DLEN is configurable, so the area of the vector data path is configurable and the power consumption is controllable.
In conclusion, the vector data path of the invention is simple in design and relatively independent of the pipeline front end, so it is highly portable. One set of code corresponds to multiple sets of hardware, so the design is general. VLEN is fully configurable, so application scenarios are flexible and the processor iteration cost is low. The vector data path executes in a pipelined manner, takes few execution cycles, and gives high processor performance; DLEN is configurable, so the design can serve processor design scenarios of different scales. The design is simple, bugs converge quickly, and the processor design cycle is short.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A vector data path method under the riscv-v instruction set architecture, characterized in that the method comprises the following steps: S1: with the processor operating, instruction information is issued to the VUD through the instruction issue unit ISS; S2: the VUD receives the instruction information sent from the ISS unit and reads the source operands from the VRF; S3: at the REISSUE station, the instruction, after its registers have been read, is split into n uops, which are sent to VBOB in order for execution; S4: the PACKAGE station collects the VBOB execution results, writes them back to the VRF, and reports the completion status to the ROB.
2. The vector data path method under the riscv-v instruction set architecture according to claim 1, characterized in that, in the method, the data widths of VRF, REISSUE and PACKAGE are all VLEN.
3. The vector data path method under the riscv-v instruction set architecture according to claim 2, characterized in that VLEN is the vector register width.
4. The vector data path method under the riscv-v instruction set architecture according to claim 1, characterized in that, in the method, the data width of VBOB is DLEN.
5. The vector data path method under the riscv-v instruction set architecture according to claim 4, characterized in that DLEN is the vector data path width.
6. The vector data path method under the riscv-v instruction set architecture according to claim 1, characterized in that, in the method, the PACKAGE station overlaps VBOB in timing.
7. The vector data path method under the riscv-v instruction set architecture according to claim 1, characterized in that, in the method, the VUD is the vector non-memory-access instruction data path.
8. The vector data path method under the riscv-v instruction set architecture according to claim 1, characterized in that, in the method, the vector data path executes in a pipelined manner and takes few execution cycles, and DLEN is configurable so as to serve processor design scenarios of different scales.
9. The vector data path method under the riscv-v instruction set architecture according to claim 8, characterized in that, in the (VLEN, DLEN) configuration, VLEN is greater than or equal to DLEN; when VLEN = DLEN is configured, execution is pipelined and takes few cycles, which suits scenarios that pursue high performance; when DLEN < VLEN is configured, the data path area is small and the power consumption is low, which suits small-scale, low-power scenarios.
10. A vector data path system under the riscv-v instruction set architecture, the system being used to implement the vector data path method under the riscv-v instruction set architecture according to any one of claims 1 to 9, characterized in that it comprises an instruction issue unit ISS, a vector register file VRF, a reorder buffer unit ROB, and a vector non-memory-access instruction data path VUD.
CN202110734736.1A 2021-06-30 2021-06-30 Vector data access method and system under riscv-v instruction set architecture Pending CN113467834A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110734736.1A CN113467834A (en) 2021-06-30 2021-06-30 Vector data access method and system under riscv-v instruction set architecture

Publications (1)

Publication Number Publication Date
CN113467834A true CN113467834A (en) 2021-10-01

Family

ID=77874292

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110734736.1A Pending CN113467834A (en) 2021-06-30 2021-06-30 Vector data access method and system under riscv-v instruction set architecture

Country Status (1)

Country Link
CN (1) CN113467834A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination