CN111414199B

CN111414199B - A method and device for realizing instruction fusion

Info

Publication number: CN111414199B
Application number: CN202010260633.1A
Authority: CN
Inventors: 孙彩霞; 郑重; 隋兵才; 邓全; 郭辉; 郭维; 雷国庆; 王俊辉; 黄立波; 倪晓强; 王永文
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2020-04-03
Filing date: 2020-04-03
Publication date: 2022-11-08
Anticipated expiration: 2040-04-03
Also published as: CN111414199A

Abstract

The invention relates to microprocessor design technology, and in particular to a method and device for implementing instruction fusion. In the decode stage, the method merges a prefix instruction and the following instruction that can be fused with it into a single fused instruction; the source operand of the prefix instruction becomes a source operand of the fused instruction, replacing the implicit source operand. If the fused instruction commits normally in the commit stage, the commit address advances by two instructions. If the fused instruction raises an exception, the exception is reported at the prefix instruction, the prefix instruction does not update the architectural state, and the exception return address is the address of the prefix instruction. The invention implements instruction fusion effectively, is simple in design, and guarantees precise exceptions.

Description

A method and device for realizing instruction fusion

Technical field

The invention relates to microprocessor design technology, and in particular to a method and device for implementing instruction fusion.

Background

When a processor architecture with a fixed instruction length is extended with new instructions, the growing number of instructions requires more encoding bits to serve as the opcode that identifies each instruction's function. Once the opcode is removed, the remaining bits of the encoding can represent only a limited number of operands, so some instructions may not be encodable in the normal way. For example, a floating-point multiply-add instruction normally requires three source operands and one destination operand; we write it as FPMA Rd, Rs1, Rs2, Rs3, and it computes Rd=(Rs1xRs2)+Rs3. Architectures usually define 32 software-visible floating-point registers, so one floating-point register operand takes 5 bits, and three source operands plus one destination operand take 20 bits of the instruction encoding. When four floating-point register operands cannot be encoded, the architecture uses a multiply-accumulate instruction instead. Such an instruction explicitly encodes only two source operands and one destination operand; we write it as FPFMA Rd, Rs1, Rs2. The destination operand Rd implicitly serves as a source operand, and the instruction computes Rd=(Rs1xRs2)+Rd. The multiply-accumulate instruction also performs a floating-point multiply-add, but it is a destructive instruction: it overwrites the addend of the multiply-add operation. Suppose the function of the normal floating-point multiply-add instruction, (Rs1xRs2)+Rs3, is to be computed. With the multiply-accumulate instruction this becomes FPFMA Rs3, Rs1, Rs2, that is, Rs3=(Rs1xRs2)+Rs3, and the addend Rs3 is overwritten.
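
The difference between the two forms can be illustrated with a minimal Python sketch. The dictionary-based register file, the register names f0 to f3, and the helper operand_bits are assumptions made for illustration only; they are not part of the patent.

```python
# Illustrative model only: the dict register file and register names are assumptions.

REG_BITS = 5  # 32 software-visible floating-point registers -> 5 bits per operand


def operand_bits(num_register_operands: int) -> int:
    """Encoding bits consumed by the register operand fields."""
    return num_register_operands * REG_BITS


def fpma(regs, rd, rs1, rs2, rs3):
    """Non-destructive multiply-add: Rd = (Rs1 x Rs2) + Rs3."""
    regs[rd] = regs[rs1] * regs[rs2] + regs[rs3]


def fpfma(regs, rd, rs1, rs2):
    """Destructive multiply-accumulate: Rd = (Rs1 x Rs2) + Rd."""
    regs[rd] = regs[rs1] * regs[rs2] + regs[rd]


# Four-operand form: the addend f3 survives.
regs_a = {"f0": 0.0, "f1": 2.0, "f2": 3.0, "f3": 10.0}
fpma(regs_a, "f0", "f1", "f2", "f3")

# Three-operand form: the addend f3 is overwritten by the result.
regs_b = {"f0": 0.0, "f1": 2.0, "f2": 3.0, "f3": 10.0}
fpfma(regs_b, "f3", "f1", "f2")
```

In this sketch the four-operand form needs 20 operand bits and leaves f3 intact, while the three-operand form needs 15 bits but replaces the addend with the result.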

Some architectures define a prefix instruction that, together with the instruction immediately following it, performs the non-destructive function. For ease of description, we write the prefix instruction as PREFIX Rd, Rs; it writes the value of floating-point register Rs into Rd, and it can be fused with an immediately following instruction that satisfies certain conditions so that the two execute as one instruction. For example, the following prefix instruction and multiply-accumulate instruction together perform the normal floating-point multiply-add:

PREFIX Rd, Rs3

FPFMA Rd, Rs1, Rs2

Together, these two instructions compute Rd=(Rs1xRs2)+Rs3.
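
As a rough check of this equivalence, the sketch below executes the pair separately and compares it with a single fused operation. The function names, the register names, and the dictionary register file are hypothetical and chosen only for illustration.

```python
# Illustrative model only: names and the dict register file are assumptions.

def prefix(regs, rd, rs):
    """PREFIX Rd, Rs: copy the value of Rs into Rd."""
    regs[rd] = regs[rs]


def fpfma(regs, rd, rs1, rs2):
    """FPFMA Rd, Rs1, Rs2: Rd = (Rs1 x Rs2) + Rd (Rd is an implicit source)."""
    regs[rd] = regs[rs1] * regs[rs2] + regs[rd]


def fused_prefix_fpfma(regs, rd, rs1, rs2, rs3):
    """Fused form: the prefix source Rs3 replaces the implicit source Rd."""
    regs[rd] = regs[rs1] * regs[rs2] + regs[rs3]


initial = {"f0": 0.0, "f1": 2.0, "f2": 3.0, "f3": 10.0}

# Separate execution: two instructions.
separate = dict(initial)
prefix(separate, "f0", "f3")
fpfma(separate, "f0", "f1", "f2")

# Fused execution: one operation with three explicit sources.
fused = dict(initial)
fused_prefix_fpfma(fused, "f0", "f1", "f2", "f3")
```

Both paths leave the same register contents, and the addend f3 is preserved in each case.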

The architecture requires that, in a microprocessor design, the prefix instruction and the following instruction that can be fused with it (for ease of description, we call it the prefixed instruction) may either be executed separately or be fused and executed as one instruction. Whichever way is used, when the prefixed instruction raises an exception, precise exceptions must be guaranteed. That is, if the prefix instruction has updated the architectural state, the exception is reported at the prefixed instruction and the exception return address is the address of the prefixed instruction; if the prefix instruction has not updated the architectural state, the exception is reported at the prefix instruction and the exception return address is the address of the prefix instruction.
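
The precise-exception rule above reduces to a single decision, sketched here in Python. The function name and the example addresses (a 4-byte instruction size is assumed) are hypothetical.

```python
def exception_return_address(prefix_addr: int,
                             prefixed_addr: int,
                             prefix_updated_state: bool) -> int:
    """Address at which an exception in the prefixed instruction is reported.

    If the prefix instruction has already updated the architectural state,
    the exception belongs to the prefixed instruction; otherwise it is
    reported at the prefix instruction, so that the pair can be re-executed
    from the start after the exception is handled.
    """
    return prefixed_addr if prefix_updated_state else prefix_addr


# Hypothetical addresses, assuming 4-byte instructions.
PREFIX_ADDR = 0x1000
PREFIXED_ADDR = 0x1004

# Separate execution: the prefix instruction committed before the fault.
addr_separate = exception_return_address(PREFIX_ADDR, PREFIXED_ADDR, True)

# Fused execution: nothing has committed when the fault is detected.
addr_fused = exception_return_address(PREFIX_ADDR, PREFIXED_ADDR, False)
```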

In summary, executing the prefix instruction and the prefixed instruction separately is relatively easy to implement in a microprocessor design, whereas fusing them into one instruction benefits the processor's performance and power consumption. The challenge is how to ensure that a fused design still meets the architectural requirements.

Summary of the invention

The technical problem to be solved by the present invention: in view of the above problems in the prior art, a method and device for implementing instruction fusion are provided. The invention implements instruction fusion effectively, helps improve processor performance and reduce processor power consumption, is simple in design, and guarantees precise exceptions.

To solve the above technical problem, the technical solution adopted by the present invention is:

A method for implementing instruction fusion, whose implementation steps comprise:

1) In the fetch stage, fetch instructions;

2) In the decode stage, determine whether a prefix instruction and a prefixed instruction that satisfy the fusion condition appear consecutively and are decoded in the same cycle. If so, fuse the prefix instruction into the prefixed instruction to form a single instruction and mark it as a fused instruction; the source operand of the prefix instruction becomes a source operand of the fused instruction, replacing the implicit source operand. If not, decode the instructions normally;

3) Rename, dispatch and execute the instruction; if the instruction is a fused instruction, attach a flag bit to it and pass the flag bit down the pipeline stage by stage;

4) In the commit stage, determine whether the fused instruction carrying the flag bit has raised an exception. If it has not, commit the fused instruction and advance the commit address by two instructions. If it has, the fused instruction cannot commit, neither the prefix instruction nor the prefixed instruction has updated the architectural state, and the current commit address remains at the prefix instruction; the exception is therefore reported at the prefix instruction, and the exception return address is the address of the prefix instruction.

Optionally, the detailed steps of step 2) comprise:

2.1) Determine whether the instructions decoded together include a prefix instruction; if so, go to step 2.2), otherwise go to step 2.6);

2.2) Determine whether the prefix instruction is the last valid instruction; if not, go to step 2.3), otherwise go to step 2.6);

2.3) Determine whether the instruction following the prefix instruction satisfies the fusion condition and can be fused with the prefix instruction; if so, go to step 2.4), otherwise go to step 2.6);

2.4) Fuse the prefix instruction into the following instruction to form a single instruction, mark that instruction as a fused instruction, and go to step 2.5);

2.5) The source operand of the prefix instruction becomes a source operand of the fused instruction, replacing the implicit source operand; then go to step 3);

2.6) Decode normally, then go to step 3).
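
Steps 2.1) to 2.6) can be sketched as a software model of one decode cycle. This is a minimal sketch under stated assumptions, not the patented hardware: the classes Inst and Uop, the FUSIBLE set, and the rule in can_fuse (the following instruction is a multiply-accumulate with the same destination register as the prefix instruction) are all hypothetical, because the source leaves the exact fusion condition unspecified.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Inst:
    opcode: str
    rd: str
    srcs: Tuple[str, ...]
    addr: int


@dataclass
class Uop:
    opcode: str
    rd: str
    srcs: Tuple[str, ...]
    addr: int            # for a fused uop: address of the prefix instruction
    fused: bool = False  # flag bit carried down the pipeline


FUSIBLE = {"FPFMA"}  # hypothetical set of opcodes that may follow a prefix


def can_fuse(prefix: Inst, nxt: Inst) -> bool:
    """Assumed fusion condition: fusible opcode and matching destination."""
    return nxt.opcode in FUSIBLE and nxt.rd == prefix.rd


def decode_group(group: List[Inst]) -> List[Uop]:
    """Decode the instructions that reach the decode stage in one cycle."""
    uops: List[Uop] = []
    i = 0
    while i < len(group):
        inst = group[i]
        is_last = i == len(group) - 1                      # step 2.2
        if (inst.opcode == "PREFIX" and not is_last        # steps 2.1, 2.2
                and can_fuse(inst, group[i + 1])):         # step 2.3
            nxt = group[i + 1]
            # Steps 2.4 and 2.5: one fused uop whose extra source operand
            # is the prefix source, replacing the implicit source Rd.
            uops.append(Uop(nxt.opcode, nxt.rd, nxt.srcs + inst.srcs,
                            inst.addr, fused=True))
            i += 2
        else:
            # Step 2.6: normal decode.
            uops.append(Uop(inst.opcode, inst.rd, inst.srcs, inst.addr))
            i += 1
    return uops


pair = [Inst("PREFIX", "f0", ("f3",), 0x1000),
        Inst("FPFMA", "f0", ("f1", "f2"), 0x1004)]
fused_uops = decode_group(pair)

# A prefix instruction that is last in its decode group is decoded normally.
alone = decode_group([Inst("PREFIX", "f0", ("f3",), 0x1000)])
```

Recording the prefix address in the fused uop is a design choice of this sketch; it keeps the address needed for exception reporting with the instruction that will reach the commit stage.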

Optionally, the detailed steps of step 4) comprise:

4.1) Using the flag bit passed down the pipeline stage by stage, determine whether the instruction entering the commit stage is a fused instruction; if so, go to step 4.2), otherwise go to step 4.5);

4.2) Determine whether the fused instruction has raised an exception; if so, go to step 4.3), otherwise go to step 4.4);

4.3) Report the exception at the prefix instruction; the exception return address is the address of the prefix instruction;

4.4) Commit the fused instruction and advance the commit address by two instructions;

4.5) Determine whether the non-fused instruction has raised an exception; if so, go to step 4.6), otherwise go to step 4.7);

4.6) Report the exception at that instruction; the exception return address is the address of that instruction;

4.7) Commit the instruction and advance the commit address by one instruction.
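
Steps 4.1) to 4.7) can likewise be modelled in a few lines. The sketch assumes 4-byte instructions and a commit address that points at the prefix instruction when a fused instruction reaches the commit stage; the names CommitResult, commit and INST_BYTES are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

INST_BYTES = 4  # assumed fixed instruction length in bytes


@dataclass
class CommitResult:
    committed: bool
    exception_addr: Optional[int]  # where the exception is reported, if any
    next_commit_addr: int


def commit(commit_addr: int, fused: bool, has_exception: bool) -> CommitResult:
    """Model of the commit stage for one instruction.

    For a fused instruction, commit_addr is the address of the prefix
    instruction, because nothing from the pair has committed yet.
    """
    if has_exception:
        # Steps 4.3 and 4.6: no architectural state is updated; the
        # exception is reported at the current commit address, which is
        # also the exception return address.
        return CommitResult(False, commit_addr, commit_addr)
    # Steps 4.4 and 4.7: a fused instruction retires two instructions.
    step = 2 if fused else 1
    return CommitResult(True, None, commit_addr + step * INST_BYTES)


fused_ok = commit(0x1000, fused=True, has_exception=False)
fused_fault = commit(0x1000, fused=True, has_exception=True)
plain_ok = commit(0x1000, fused=False, has_exception=False)
plain_fault = commit(0x1000, fused=False, has_exception=True)
```

The only fusion-specific logic in this model is the choice of step size, which reflects the source's point that the commit stage needs nothing beyond the flag bit.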

In addition, the present invention provides a device for implementing instruction fusion, which is programmed or configured to execute the steps of the method for implementing instruction fusion.

In addition, the present invention provides a microprocessor, which is programmed or configured to execute the steps of the method for implementing instruction fusion.

Compared with the prior art, the present invention has the following advantages:

1. The design is simple. In the decode stage, the invention determines whether a prefix instruction and a prefixed instruction that satisfy the fusion condition appear consecutively and are decoded in the same cycle; if so, the prefix instruction is fused into the prefixed instruction to form a single instruction, which is marked as a fused instruction. The flag bit is passed down the pipeline stage by stage, and the commit address is updated in the commit stage according to the flag bit, so the pipeline design is simple.

2. The invention guarantees precise exceptions. When instruction fusion occurs, if the fused instruction commits normally, the commit address advances by two instructions; if the fused instruction raises an exception, the exception is reported at the prefix instruction, the prefix instruction has not updated the architectural state, and the exception return address is the address of the prefix instruction, which achieves precise exceptions.

Description of the drawings

Fig. 1 is a schematic diagram of the basic flow of the method according to an embodiment of the present invention.

Fig. 2 is a schematic diagram of the detailed implementation flow of an embodiment of the present invention.

Detailed description

As shown in Fig. 1, the implementation steps of the method for implementing instruction fusion in this embodiment comprise:

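1) In the fetch stage, fetch instructions; this step is the same as in the prior art;

2) In the decode stage, determine whether a prefix instruction and a prefixed instruction that satisfy the fusion condition appear consecutively and are decoded in the same cycle. If so, fuse the prefix instruction into the prefixed instruction to form a single instruction and mark it as a fused instruction; the source operand of the prefix instruction becomes a source operand of the fused instruction, replacing the implicit source operand. If not, decode the instructions normally;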

3) Rename, dispatch and execute the instruction; if the instruction is a fused instruction, attach a flag bit to it and pass the flag bit down the pipeline stage by stage. In this step, fused instructions and ordinary instructions are handled no differently;

4) In the commit stage, determine whether the fused instruction carrying the flag bit has raised an exception. If it has not, commit the fused instruction and advance the commit address by two instructions. If it has, the fused instruction cannot commit, neither the prefix instruction nor the prefixed instruction has updated the architectural state, and the current commit address remains at the prefix instruction; the exception is therefore reported at the prefix instruction, and the exception return address is the address of the prefix instruction. Precise floating-point exceptions are thereby achieved, in line with the architectural requirements.

In step 2) of this embodiment, fusion takes place only when a prefix instruction and a prefixed instruction that satisfy the fusion condition appear consecutively and are decoded in the same cycle. Consequently, if the prefix instruction is the last of the valid instructions decoded together, no fusion takes place: the prefix instruction is decoded normally and moves on to the rename pipeline stage without waiting for subsequent instructions. Thus, even if the instruction that immediately follows the prefix instruction in program order could be fused with it, fusion does not occur, because that instruction did not reach the decode stage at the same time as the prefix instruction.
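
The effect of the decode-group boundary can be illustrated with a small sketch. It assumes, purely for illustration, that the instruction stream is cut into fixed windows of `width` instructions per cycle and that only a PREFIX immediately followed by FPFMA is fusible; the opcode ADD and the function fused_prefix_indices are hypothetical.

```python
from typing import List


def fused_prefix_indices(opcodes: List[str], width: int) -> List[int]:
    """Indices of PREFIX instructions that get fused for a given decode width.

    Assumption: each cycle decodes one fixed window of `width` consecutive
    instructions, and fusion never crosses a window boundary.
    """
    fused: List[int] = []
    for start in range(0, len(opcodes), width):
        group = opcodes[start:start + width]
        i = 0
        while i < len(group):
            if (group[i] == "PREFIX" and i + 1 < len(group)
                    and group[i + 1] == "FPFMA"):
                fused.append(start + i)
                i += 2
            else:
                i += 1
    return fused


stream = ["ADD", "PREFIX", "FPFMA", "ADD"]

# Width 2: PREFIX is the last instruction of the first group, so it is
# decoded alone even though FPFMA follows it in program order.
narrow = fused_prefix_indices(stream, width=2)

# Width 4: both instructions are decoded in the same cycle and are fused.
wide = fused_prefix_indices(stream, width=4)
```

The same program therefore fuses or does not fuse depending on how the pair falls relative to the decode window, and the unfused case is still correct because the two instructions simply execute separately.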

In this embodiment, the detailed steps of step 2) are steps 2.1) to 2.6) as set out in the summary of the invention above.

As shown in Fig. 2, the detailed steps of step 4) are steps 4.1) to 4.7) as set out in the summary of the invention above.

In summary, the method for implementing instruction fusion in this embodiment determines in the decode stage whether a prefix instruction and a prefixed instruction that satisfy the fusion condition appear consecutively and are decoded in the same cycle. If so, the prefix instruction is fused into the prefixed instruction to form a single fused instruction, and the source operand of the prefix instruction becomes a source operand of the fused instruction, replacing the implicit source operand. In the commit stage, if the fused instruction commits normally, the commit address advances by two instructions; if the fused instruction raises an exception, the exception is reported at the prefix instruction, the prefix instruction does not update the architectural state, and the exception return address is the address of the prefix instruction. The invention is simple in design and guarantees precise exceptions.

In addition, this embodiment provides a device for implementing instruction fusion, which is programmed or configured to execute the steps of the method for implementing instruction fusion.

In addition, this embodiment provides a microprocessor, which is programmed or configured to execute the steps of the method for implementing instruction fusion.

Those skilled in the art will appreciate that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) containing computer-usable program code. The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to its embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, such that the instructions executed by that processor produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram. These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing equipment to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram. These computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on the computer or other programmable equipment to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

The above is only a preferred embodiment of the present invention. The scope of protection of the present invention is not limited to the above embodiment; all technical solutions that fall under the concept of the present invention belong to its scope of protection. It should be noted that, for those of ordinary skill in the art, improvements and refinements made without departing from the principles of the present invention shall also be regarded as falling within the scope of protection of the present invention.

Claims

1. A method for implementing instruction fusion, characterized in that the implementation steps comprise:

1) In the fetch stage, fetch instructions;

2) In the decode stage, determine whether a prefix instruction and a prefixed instruction that satisfy the fusion condition appear consecutively and are decoded in the same cycle. If so, fuse the prefix instruction into the prefixed instruction to form a single instruction and mark it as a fused instruction; the source operand of the prefix instruction becomes a source operand of the fused instruction, replacing the implicit source operand. If not, decode the instructions normally;

3) Rename, dispatch and execute the instruction; if the instruction is a fused instruction, attach a flag bit to it and pass the flag bit down the pipeline stage by stage;

4) In the commit stage, determine whether the fused instruction carrying the flag bit has raised an exception. If it has not, commit the fused instruction and advance the commit address by two instructions. If it has, the fused instruction cannot commit, neither the prefix instruction nor the prefixed instruction has updated the architectural state, and the current commit address remains at the prefix instruction; the exception is therefore reported at the prefix instruction, and the exception return address is the address of the prefix instruction;

The detailed steps of step 2) comprise:

2.1) Determine whether the instructions decoded together include a prefix instruction; if so, go to step 2.2), otherwise go to step 2.6);

2.2) Determine whether the prefix instruction is the last valid instruction; if not, go to step 2.3), otherwise go to step 2.6);

2.3) Determine whether the instruction following the prefix instruction satisfies the fusion condition and can be fused with the prefix instruction; if so, go to step 2.4), otherwise go to step 2.6);

2.4) Fuse the prefix instruction into the following instruction to form a single instruction, mark that instruction as a fused instruction, and go to step 2.5);

2.5) The source operand of the prefix instruction becomes a source operand of the fused instruction, replacing the implicit source operand; then go to step 3);

2.6) Decode normally, then go to step 3).

2. The method for implementing instruction fusion according to claim 1, wherein the detailed steps of step 4) comprise:

4.1) Using the flag bit passed down the pipeline stage by stage, determine whether the instruction entering the commit stage is a fused instruction; if so, go to step 4.2), otherwise go to step 4.5);

4.2) Determine whether the fused instruction has raised an exception; if so, go to step 4.3), otherwise go to step 4.4);

4.3) Report the exception at the prefix instruction; the exception return address is the address of the prefix instruction;

4.4) Commit the fused instruction and advance the commit address by two instructions;

4.5) Determine whether the non-fused instruction has raised an exception; if so, go to step 4.6), otherwise go to step 4.7);

4.6) Report the exception at that instruction; the exception return address is the address of that instruction;

4.7) Commit the instruction and advance the commit address by one instruction.

3. A device for implementing instruction fusion, characterized in that the device is programmed or configured to execute the steps of the method for implementing instruction fusion according to claim 1 or 2.

4. A microprocessor, characterized in that the microprocessor is programmed or configured to execute the steps of the method for implementing instruction fusion according to claim 1 or 2.