CN112181491B - Processor and return address processing method - Google Patents
- Publication number
- CN112181491B (application CN201910586325.5A)
- Authority
- CN
- China
- Prior art keywords
- return address
- conversion
- register
- processor
- converted
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/3012—Organisation of register space, e.g. banked or distributed register file
- G06F9/3013—Organisation of register space, e.g. banked or distributed register file according to data content, e.g. floating-point registers, address registers
- G06F9/3017—Runtime instruction translation, e.g. macros
- G06F9/30178—Runtime instruction translation, e.g. macros of compressed or encrypted instructions
- G06F9/46—Multiprogramming arrangements
- G06F9/468—Specific access rights for resources, e.g. using capability register
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Storage Device Security (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
Embodiments of the application provide a processor and a return address processing method. A hardware conversion circuit is provided in the processor. When a return address needs to be saved, the conversion circuit converts the return address and outputs the resulting converted return address to a memory; when the return address needs to be used, the conversion circuit converts the converted return address in the memory to recover the return address. Because an attacker cannot learn the conversion operation performed in the conversion circuit, the attacker cannot change the converted return address in the memory into the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously modifying the program control flow. In addition, because the conversion is performed by the hardware conversion circuit while the program runs, call instructions and return instructions need not be identified at the compiling stage and no additional encryption or decryption instructions need to be inserted, so the running performance of the processor is not affected.
Description
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a processor and a return address processing method.
Background
While a processor is running a program, an attacker may maliciously hijack the program control flow. Specifically, the attacker replaces the normal return address of a subroutine with a malicious return address, so that after executing the subroutine code the processor jumps to the code segment pointed to by the malicious return address. This changes the control flow of the program and destroys its control flow integrity (Control Flow Integrity, CFI).
Currently, to defend against malicious changes to a program control flow, the control flow is generally monitored while the program is running, and an alarm is issued if the control flow is changed. In the related art, at the program compiling stage, the call instruction and the return instruction corresponding to a subroutine are identified, an encryption instruction is inserted before the call instruction, and a decryption instruction is inserted before the return instruction. At the program running stage, before the processor calls the subroutine, the encryption instruction encrypts the return address of the subroutine, and the resulting encrypted address is pushed onto the stack. After the subroutine finishes executing, the decryption instruction decrypts the popped encrypted address to recover the original return address, so that the processor can continue executing from the return address.
With this defense technology, even if an attacker obtains the encrypted address from the stack, the attacker cannot change it into an encrypted malicious address, because the attacker does not know the key used by the encryption instruction. That is, the attacker cannot control where the program jumps after hijacking the program control flow, so the attacker is prevented from maliciously changing the program control flow and the integrity of the program control flow is protected.
However, the above technique requires a number of additional instructions to be inserted into the program, which lowers the running performance of the program.
Disclosure of Invention
The embodiments of the application provide a processor and a return address processing method that protect the control flow integrity of a program without reducing the running performance of the program.
In a first aspect, an embodiment of the present application provides a processor, including: a processing core and a conversion circuit;
The processing core is used for outputting a return address;
The conversion circuit is used for converting the return address output by the processing core to obtain a converted return address and outputting the converted return address to a stack in a memory;
The conversion circuit is further configured to, when the processing core needs to use the return address, perform the conversion on the converted return address in the stack to obtain the return address, and output the return address to the processing core.
In this embodiment, the conversion circuit converts the return address once before it is stored in the memory, so what the memory stores is the converted return address. Because an attacker cannot learn the conversion operation performed in the conversion circuit, the attacker cannot change the converted return address in the memory into the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously modifying the program control flow. After the converted return address is popped from the memory, the conversion circuit converts it again to recover the original return address, so that the processing core can execute subsequent instructions according to the original return address and the integrity of the program control flow is ensured. Because the conversion is performed by the hardware conversion circuit while the program runs, call instructions and return instructions need not be identified at the compiling stage and no additional encryption or decryption instructions need to be inserted, so the running performance of the processor is not affected. The risk of theft by software is also avoided.
Optionally, the processor further includes a register;
the conversion circuit is specifically configured to convert a return address output by the processing core to obtain a converted return address, and output the converted return address to the register, so that the converted return address is output to a stack in a memory through the register;
The conversion circuit is further specifically configured to, when the processing core needs to use the return address, perform the conversion on the converted return address in the stack to obtain the return address, and output the return address to the register, so that the return address is output to the processing core through the register.
In this embodiment, a hardware conversion circuit is disposed on the write path of the register and converts the return address entering the register. That is, every input to the register is first converted by the conversion circuit, and the conversion result is then written into the register. In this way, the return address is identified automatically, the control flow of the processor does not need to be changed, and the scheme is easy to implement.
Optionally, the processor further includes a register;
the register is used for registering the return address output by the processing core when the processing core outputs the return address;
The conversion circuit is specifically configured to convert the return address output by the register to obtain a converted return address, and output the converted return address to a stack in a memory;
The register is further configured to, when the processing core needs to use the return address, register the converted return address output from the stack;
The conversion circuit is further specifically configured to perform the conversion on the converted return address output by the register to obtain the return address, and output the return address to the processing core.
In this embodiment, a hardware conversion circuit is provided on the read path of the register and converts the return address output from the register. That is, every value output from the register is converted by the conversion circuit. In this way, the return address is identified automatically, the control flow of the processor does not need to be changed, and the scheme is easy to implement.
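The two register placements described above can be compared with a small behavioral sketch. It is only an illustration: the class names, the 16-bit width, and the `convert` function (a self-inverse modular multiplication) are assumptions made for the example and are not taken from the application.

```python
def convert(value):
    """Hypothetical self-inverse conversion: 0x7FFF * 0x7FFF == 1 (mod 2**16)."""
    return (value * 0x7FFF) % 0x10000


class WritePathRegister:
    """Conversion circuit on the write path: inputs are converted, then latched."""

    def __init__(self):
        self._value = 0

    def write(self, value):
        self._value = convert(value)

    def read(self):
        return self._value


class ReadPathRegister:
    """Conversion circuit on the read path: inputs are latched, outputs are converted."""

    def __init__(self):
        self._value = 0

    def write(self, value):
        self._value = value

    def read(self):
        return convert(self._value)


def round_trip(register, return_address):
    """Save a return address to the stack through the register and restore it."""
    register.write(return_address)   # the processing core outputs the return address
    stacked = register.read()        # value that is pushed onto the stack
    register.write(stacked)          # value popped from the stack on return
    restored = register.read()       # value handed back to the processing core
    return stacked, restored


for reg in (WritePathRegister(), ReadPathRegister()):
    stacked, restored = round_trip(reg, 0x0008)
    print(type(reg).__name__, hex(stacked), hex(restored))
```

In both placements of this sketch, the value that reaches the stack is the converted return address and the value returned to the processing core is the original one; only the point at which the conversion is applied differs.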
Optionally, the conversion satisfies the following condition:
B = IP(A), A = IP(B)
where A is the return address, B is the converted return address, and IP() is the conversion model used by the conversion.
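The condition above says that applying IP() twice returns the original value, that is, the conversion is its own inverse. A minimal sketch of how such a property could be checked exhaustively on a small address width follows; the 8-bit width and the constants are hypothetical and chosen only because they make the functions self-inverse.

```python
WIDTH = 8                    # hypothetical width, small enough to test every value
MODULUS = 1 << WIDTH


def ip_mod_add(a):
    """Modular addition of 2**(WIDTH-1); adding it twice wraps around to a."""
    return (a + MODULUS // 2) % MODULUS


def ip_mod_mul(a):
    """Modular multiplication by 2**(WIDTH-1) + 1, whose square is 1 mod 2**WIDTH."""
    return (a * (MODULUS // 2 + 1)) % MODULUS


def add_one(a):
    """Counterexample: adding 1 is invertible but not self-inverse."""
    return (a + 1) % MODULUS


def is_self_inverse(ip):
    """Check A == IP(IP(A)) for every WIDTH-bit value A."""
    return all(ip(ip(a)) == a for a in range(MODULUS))


print(is_self_inverse(ip_mod_add), is_self_inverse(ip_mod_mul), is_self_inverse(add_one))
```

A self-inverse conversion lets the same circuit serve both the save direction and the restore direction, which is consistent with the single conversion circuit described in the first aspect.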
Optionally, the conversion circuit is specifically configured to:
and converting at least one bit of the return address by adopting at least one conversion model to obtain the converted return address.
The at least one bit includes bits corresponding to a changed segment of the code address of the program and bits corresponding to an unchanged segment. The code address of the program includes the instruction addresses of a plurality of instructions; the unchanged segment consists of the bits that are identical across these instruction addresses, and the changed segment consists of the bits that differ across them.
Because the conversion model and the conversion parameters are stored in the hardware circuit, an attacker cannot obtain this sensitive information, which improves the reliability of defending the program control flow. Because the conversion circuit converts the changed segment and the unchanged segment together when converting at least one bit of the return address, brute-force cracking becomes harder for an attacker, which helps ensure the security of the program control flow.
Optionally, the conversion circuit is specifically configured to:
grouping at least one bit of the return address to obtain a plurality of bit groups;
converting the bits in each bit group using a conversion model to obtain a conversion result for each bit group, where at least two of the bit groups use different conversion models, or all of the bit groups use the same conversion model;
and obtaining the converted return address from the conversion results of the bit groups.
Optionally, the number of the bit groups is two, wherein one bit group includes bits corresponding to odd bits of the return address, and the other bit group includes bits corresponding to even bits of the return address.
Optionally, the types of the conversion model include: a modular multiplication conversion model and a modular addition conversion model.
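The grouping option can be sketched as follows, assuming a 16-bit return address, bit positions counted from 0 at the least significant bit, a modular multiplication model for the even-position group, and a modular addition model for the odd-position group. The width, the constants, and all function names are hypothetical; the application does not fix them in this passage.

```python
WIDTH = 16                   # hypothetical return-address width in bits
HALF = WIDTH // 2            # number of bits in each of the two groups
GROUP_MOD = 1 << HALF


def split_bits(addr):
    """Pack the even-position bits and the odd-position bits into two HALF-bit values."""
    even = odd = 0
    for i in range(HALF):
        even |= ((addr >> (2 * i)) & 1) << i
        odd |= ((addr >> (2 * i + 1)) & 1) << i
    return even, odd


def merge_bits(even, odd):
    """Inverse of split_bits: interleave the two groups back into one address."""
    addr = 0
    for i in range(HALF):
        addr |= ((even >> i) & 1) << (2 * i)
        addr |= ((odd >> i) & 1) << (2 * i + 1)
    return addr


def mod_mul(x):
    """Modular multiplication model; the multiplier squares to 1 mod GROUP_MOD."""
    return (x * (GROUP_MOD // 2 + 1)) % GROUP_MOD


def mod_add(x):
    """Modular addition model; the constant is half of GROUP_MOD."""
    return (x + GROUP_MOD // 2) % GROUP_MOD


def convert(addr):
    """Convert each bit group with its own model, then recombine the results."""
    even, odd = split_bits(addr)
    return merge_bits(mod_mul(even), mod_add(odd))


converted = convert(0x0008)
print(hex(converted), hex(convert(converted)))
```

Because each group model in this sketch is self-inverse and the split and merge steps undo each other, applying `convert` twice restores the original address.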
Optionally, the register is a register for storing a return address.
Optionally, the processor is an ARM instruction set based processor, and the register is an LR register.
Optionally, the processor is a RISC-V instruction set based processor, and the register is an RA register.
In a second aspect, an embodiment of the present application provides a return address processing method applied to a processor, where the processor includes a processing core and a conversion circuit, and the method includes:
when the processing core outputs a return address, converting the return address by the conversion circuit to obtain a converted return address, and outputting the converted return address to a stack in a memory;
when the return address needs to be used, performing the conversion on the converted return address in the stack by the conversion circuit to obtain the return address, and outputting the return address to the processing core.
Optionally, the processor further includes a register, and the converting, by the converting circuit, the return address to obtain a converted return address, and outputting the converted return address to a stack in a memory, including:
converting the return address by the converting circuit to obtain a converted return address, and outputting the converted return address to the register so that the converted return address is output to a stack in a memory via the register;
the performing, by the conversion circuit, the conversion on the converted return address in the stack to obtain the return address, and outputting the return address to the processing core, includes:
performing the conversion on the converted return address in the stack by the conversion circuit to obtain the return address, and outputting the return address to the register, so that the return address is output to the processing core via the register.
Optionally, the processor further includes a register, the return address is output to the register by the processing core, the converting the return address by the converting circuit to obtain a converted return address, and outputting the converted return address to a stack in a memory, including:
Converting the return address output by the register by the conversion circuit to obtain a converted return address, and outputting the converted return address to a stack in a memory;
the performing, by the conversion circuit, the conversion on the converted return address in the stack to obtain the return address, and outputting the return address to the processing core, includes:
performing the conversion, by the conversion circuit, on the converted return address output by the register to obtain the return address, and outputting the return address to the processing core.
Optionally, the conversion satisfies the following condition:
B = IP(A), A = IP(B)
where A is the return address, B is the converted return address, and IP() is the conversion model used by the conversion.
Optionally, the converting the return address to obtain a converted return address includes:
and converting at least one bit of the return address by adopting at least one conversion model to obtain the converted return address.
Optionally, the converting at least one bit of the return address using at least one conversion model to obtain the converted return address includes:
grouping at least one bit of the return address to obtain a plurality of bit groups;
converting the bits in each bit group using a conversion model to obtain a conversion result for each bit group, where at least two of the bit groups use different conversion models, or all of the bit groups use the same conversion model;
and obtaining the converted return address from the conversion results of the bit groups.
Optionally, the number of the bit groups is two, wherein one bit group includes bits corresponding to odd bits of the return address, and the other bit group includes bits corresponding to even bits of the return address.
Optionally, the types of the conversion model include: a modular multiplication conversion model and a modular addition conversion model.
Optionally, the register is a register for storing a return address.
Optionally, the processor is an ARM instruction set based processor, and the register is an LR register.
Optionally, the processor is a RISC-V instruction set based processor, and the register is an RA register.
In a third aspect, an embodiment of the present application provides an electronic device, including the processor according to any implementation of the first aspect.
In a fourth aspect, an embodiment of the present application provides a chip, including the processor according to any implementation of the first aspect.
The embodiments of the application provide a processor and a return address processing method. A hardware conversion circuit is provided in the processor. When a return address needs to be saved, the conversion circuit converts the return address and outputs the resulting converted return address to a memory; when the return address needs to be used, the conversion circuit converts the converted return address in the memory to recover the return address. Because an attacker cannot learn the conversion operation performed in the conversion circuit, the attacker cannot change the converted return address in the memory into the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously modifying the program control flow. In addition, because the conversion is performed by the hardware conversion circuit while the program runs, unlike the related art, call instructions and return instructions need not be identified at the compiling stage and no additional encryption or decryption instructions need to be inserted, so the running performance of the processor is not affected.
Drawings
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a processor according to an embodiment of the present application;
FIG. 3 is a schematic diagram illustrating a processing procedure of a return address according to an embodiment of the present application;
FIGS. 4A and 4B are schematic diagrams illustrating the processing of a prior art return address;
FIGS. 5A and 5B are schematic diagrams illustrating the processing of a return address according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a program running process according to an embodiment of the present application;
FIGS. 7A and 7B are schematic diagrams illustrating the processing of a return address according to an embodiment of the present application;
Fig. 8 is a flow chart of a processing method of a return address according to an embodiment of the present application.
Detailed Description
For the convenience of understanding the present application, first, the structure of an electronic device to which the processor of the present application is applied will be described with reference to fig. 1.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 1, the electronic device 10 includes a processor 100 and a memory 200. The memory 200 is used for storing computer programs and data. Memory is classified by use into main memory and auxiliary memory. Main memory, also known as "internal memory" or simply "memory", temporarily stores computer programs and data while the processor is operating. Auxiliary memory, also known as "external memory", stores computer programs and data that are temporarily not in use while the processor is operating. The processor 100 is configured to execute a computer program stored in the memory 200.
The processor is a core component of the electronic device. It typically includes at least one processing core, a cache, and an input-output interface for communicating with other devices of the electronic device. A processing core is the processing unit in the processor that performs data processing tasks, and it is the main component of the processor responsible for computation.
The processor in the present application may be a central processing unit (central processing unit, CPU), a graphics processor (graphics processing unit, GPU), or the like.
The processor may be used to execute a computer program in the electronic device. The processor is capable of recognizing and executing instructions in a computer program to cause the electronic device to perform a certain function or to obtain a certain result.
A computer program is a sequence of instructions. A computer program may also call a subroutine, which may also be referred to as a "subprocess". A subroutine consists of one or more instructions responsible for performing a particular task and is relatively independent. In general, a program that calls a subroutine is called a main program. Main program and subroutine are relative terms. For example, if program A calls program B, which in turn calls program C, then program B is a subroutine relative to program A and a main program relative to program C.
The normal execution flow of a computer program is called the program control flow, and the processor executes instructions according to it. However, while a computer program is executing, an attacker may maliciously hijack the program control flow. The purpose of such attacks is typically to alter the control flow of the program and thereby destroy its control flow integrity (Control Flow Integrity, CFI). Illustratively, one common attack that destroys the CFI of a program is a return-oriented programming (ROP) attack.
The process by which the processor executes the computer program under normal conditions and the process by which the processor executes the computer program under ROP attacks are described below in connection with a specific example.
Assume that the computer program includes instruction A, instruction B, and instruction C, where instruction B is a subroutine call instruction. The address of instruction A is 0x0000, the address of instruction B is 0x0004, and the address of instruction C is 0x0008. In normal execution, instruction A, instruction B, and instruction C are executed in sequence. The normal flow of the processor executing the program is as follows.
1) The processor executes instruction A.
Because instruction B is a subroutine call instruction, the processor jumps to the address of the subroutine corresponding to instruction B when executing it. To ensure that the processor can correctly return to the address of instruction C after executing that subroutine, the processor saves the address of instruction C before executing instruction B. That is:
2) The processor writes the address of instruction C into memory. In the embodiments of the present application, the address of instruction C is referred to as the return address.
Illustratively, the processor may write the return address to a stack in memory.
3) The processor executes instruction B, jumps to the address of the subroutine corresponding to instruction B, and executes the subroutine.
4) When the subroutine finishes executing, the processor reads the return address (0x0008) from the memory and jumps to address 0x0008 to execute instruction C.
Under an ROP attack, the attacker tampers with the return address stored in the memory. Illustratively, the attacker uses remote software to change the address of instruction C (0x0008) stored in the memory to a malicious address. After the subroutine corresponding to instruction B finishes executing, the processor reads the malicious address from the memory and jumps to it. The integrity of the program control flow is thereby destroyed.
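The normal flow and the tampered flow described above can be reproduced with a toy model. The return address 0x0008 comes from the example; the malicious address and the function name are hypothetical, and a Python list stands in for the stack in memory.

```python
RETURN_ADDR = 0x0008       # address of instruction C, from the example above
MALICIOUS_ADDR = 0x4000    # hypothetical address chosen by the attacker


def run(tamper):
    """Return the address the processor jumps to after the subroutine finishes."""
    stack = []
    stack.append(RETURN_ADDR)        # step 2: save the return address
    # Step 3: the subroutine runs. During this time an attacker who can
    # write to memory may overwrite the saved return address.
    if tamper:
        stack[-1] = MALICIOUS_ADDR
    return stack.pop()               # step 4: read the return address and jump


print(hex(run(tamper=False)))
print(hex(run(tamper=True)))
```

In the untampered run the processor resumes at instruction C; in the tampered run it resumes wherever the attacker chose, because nothing in this plain scheme distinguishes a legitimate saved address from an overwritten one.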
Currently, in order to prevent a program control flow from being maliciously changed, it is generally necessary to monitor the control flow during the running of the program, and if the program control flow is changed, an alarm is issued.
In one related art, a software approach is used to defend against malicious changes to the program control flow. Specifically, at the program compiling stage, the call instruction and the return instruction corresponding to the subroutine are first identified. Illustratively, the call instruction is a call instruction and the return instruction is a ret instruction. An encryption instruction is then inserted before the call instruction. Illustratively, the encryption instruction encrypts the return address with a preset key. Likewise, a decryption instruction is inserted before the return instruction. Illustratively, the decryption instruction decrypts the encrypted return address using the same key.
After compilation, before the processor calls the subroutine, it executes the encryption instruction to encrypt the return address of the subroutine and stores the resulting encrypted address in the memory. After the subroutine finishes executing, the decryption instruction decrypts the encrypted address read from the memory to recover the original return address, so that the processor can continue executing from the return address.
As a specific encryption and decryption scheme, a special register may be defined in the processor that is dedicated to storing the key and cannot be used for any other purpose. During encryption, the original return address is XORed with the key in the special register to obtain the encrypted return address, which is stored in the memory. During decryption, the encrypted return address read from the memory is XORed with the key in the special register again to recover the original return address. That is, both the encryption instruction and the decryption instruction perform an exclusive-OR operation.
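The exclusive-OR scheme of this related art can be sketched in a few lines. The 16-bit width and the key value are hypothetical; only the XOR operation itself is taken from the description above.

```python
MASK = 0xFFFF    # hypothetical 16-bit address width
KEY = 0x5A3C     # hypothetical key held in the special register


def xor_encrypt(addr, key=KEY):
    """Encryption instruction: XOR the return address with the key."""
    return (addr ^ key) & MASK


def xor_decrypt(encrypted, key=KEY):
    """Decryption instruction: XOR again with the same key."""
    return (encrypted ^ key) & MASK


encrypted = xor_encrypt(0x0008)
print(hex(xor_decrypt(encrypted)))    # 0x8, the original return address

# Flipping one bit of the stored value flips exactly the same bit of the
# decrypted address, without any knowledge of the key.
tampered = encrypted ^ 0x0100
print(hex(xor_decrypt(tampered)))     # 0x108
```

The second print illustrates that XOR treats every bit independently: a change to one stored bit affects only the corresponding bit of the recovered address.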
After the defense technology is adopted, even if an attacker hives the encrypted return address from the memory, the attacker cannot tamper the encrypted return address into an encrypted malicious address because the attacker does not know the key adopted by the encrypted instruction. That is, the attacker cannot control the jump position of the program after hijacking the program control flow, so that the attacker can be prevented from maliciously changing the program control flow, and the integrity of the program control flow is protected.
However, in the above-mentioned related art, there are at least the following problems: 1) In the compiling stage of the program, a calling instruction and a returning instruction need to be identified, and a plurality of additional encryption instructions and decryption instructions are inserted into the program, so that the running performance of the program is reduced. 2) Because the secret key is stored in the register and encrypted and decrypted in a software mode, the risk of software stealing exists, and the protection security is poor. 3) And each bit is independently operated by exclusive OR operation, so that the brute force cracking difficulty is low, and each time of brute force cracking, the program can jump to a code area, and potential safety hazards exist. 4) Since a special register is required to store the key, the special register cannot be used by other functions, so that the application scenario is limited.
To solve at least one of the above problems, an embodiment of the present application provides a processor in which a hardware conversion circuit is provided. When a return address needs to be saved, the conversion circuit converts the return address and outputs the resulting converted return address to a memory; when the return address needs to be used, the conversion circuit converts the converted return address in the memory to recover the return address. In the embodiments of the application, an attacker cannot know the conversion operation performed in the conversion circuit, so the attacker cannot replace the converted return address in the memory with the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously modifying the program control flow. In addition, the conversion is performed by the hardware conversion circuit while the program is running. Compared with the related art, call instructions and return instructions therefore need not be identified in the compiling stage, and no additional encryption or decryption instructions need to be inserted, so the running performance of the processor is not affected.
The following describes the technical scheme shown in the present application in detail through specific embodiments. It should be noted that the following embodiments may exist alone or in combination with each other, and for the same or similar content, the description will not be repeated in different embodiments.
Fig. 2 is a schematic structural diagram of a processor according to an embodiment of the present application. As shown in fig. 2, the processor 100 of the present embodiment includes: a processing core 110 and a translation circuit 120.
Wherein the processing core 110 is configured to output a return address.
Translation circuitry 120 is used to translate the return address output by processing core 110 to obtain a translated return address and output the translated return address to a stack in memory.
The translation circuit 120 is further configured to, when the processing core 110 needs to use the return address, perform the translation on the translated return address in the stack to obtain the return address, and output the return address to the processing core 110.
The embodiments of the application do not specifically limit the type of the processor. The processor in the embodiments of the present application may be, but is not limited to, one of the following types: an ARM instruction set based processor or a RISC-V instruction set based processor.
The embodiments of the application likewise do not specifically limit the bit width of the processor. The processor may be a 32-bit processor or a 64-bit processor; processors of other bit widths are also possible.
Wherein the number of processing cores in the processor may be one or more. A processing core refers to a processing unit in a processor that is used to perform data processing tasks. When the number of processing cores is one, the processor is a single-core processor. When the number of processing cores is plural, the processor is a multicore processor. The processing core is to output a return address. The return address is the address of the next instruction to be executed by the processing core.
In the present application, a processing core is any circuit that needs to store a return address into memory and to fetch a return address from memory. Illustratively, the processing core is a Program Counter (PC). The PC may also be referred to as an instruction counter. The PC is used for storing the address of the next instruction to be executed by the processor. Before the program starts executing, the processor loads the start address of the program, i.e. the address of the first instruction of the program, into the PC. While executing instructions, the processor automatically modifies the value in the PC: each time an instruction is executed, the value in the PC is increased by a certain amount so that the PC always points to the address of the next instruction to be executed.
In the application, the conversion circuit is arranged on a data path between the processing core and the memory. The processing of the return address by the translation circuit is described below in connection with fig. 3.
FIG. 3 is a schematic diagram illustrating a processing procedure of a return address according to an embodiment of the present application. As shown in fig. 3, when the processing core needs to store the return address into the memory, the return address output by the processing core passes through the translation circuit, the translation circuit translates the return address to obtain a translated return address, and then outputs the translated return address to the stack in the memory. When the processing core needs to use the return address, the converted return address popped from the memory passes through a conversion circuit, the conversion circuit performs the same conversion on the converted address to obtain an original return address, and then the original return address is output to the processing core.
The processing core needs to store the return address into the memory, which may mean that the processing core needs to store the return address corresponding to the call instruction into the memory before executing the call instruction.
Accordingly, the processing core needs to use the return address, which may mean that when the processing core returns from the subroutine corresponding to the call instruction, the return address corresponding to the call instruction needs to be obtained from the memory.
It will be appreciated that in embodiments of the present application, the conversion circuit may convert the return address using one or more conversion models. This embodiment does not specifically limit the conversion model; several possible conversion models are described in detail in the embodiments that follow.
In the application, the conversion circuit converts the return address once before the return address is stored in the memory, so that what is stored in the memory is the converted return address. Because the attacker cannot know the conversion operation performed in the conversion circuit, the attacker cannot replace the converted return address in the memory with the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously modifying the program control flow.
After the converted return address is popped from the memory, the conversion circuit converts the converted return address again to obtain the original return address, so that the processing core can execute subsequent instructions according to the original return address, and the integrity of the program control flow is ensured.
In the application, the conversion is performed by a hardware conversion circuit while the program is running. Compared with the related art, call instructions and return instructions need not be identified in the compiling stage and no additional encryption or decryption instructions need to be inserted, which avoids affecting the running performance of the processor. In addition, compared with the related art, no special register is needed to store a key, which avoids the risk of the key being stolen by software.
In a possible embodiment, the processor further includes a control circuit, where the control circuit is capable of identifying whether the address output by the processing core is a return address, and when the control circuit identifies that the address output by the processing core is a return address, the control circuit controls the return address to be input into the conversion circuit. The translation circuit translates the return address to obtain a translated return address and outputs the translated return address to a stack in memory. When the control circuit recognizes that the processing core needs to use the return address, the translated return address popped from the memory is input to the translation circuit under control of the control circuit. The translation circuit translates the translated return address to obtain an original return address and outputs the original return address to the processing core.
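The routing performed by such a control circuit can be sketched in Python as follows. This is only a minimal model under stated assumptions: how the control circuit identifies a return address is not specified here, so the sketch simply passes a hypothetical `is_return_address` flag; the class name `ControlledDataPath` and the example parameters are likewise illustrative.

```python
P, Q = 2**32, 2**32 - 1  # example modular multiplication parameters (p, q)


def convert(value: int) -> int:
    """Self-inverse conversion standing in for the conversion circuit."""
    return (value * Q) % P


class ControlledDataPath:
    """Models a control circuit that routes values between core and stack."""

    def __init__(self):
        self.stack = []  # models the stack in memory

    def push(self, value: int, is_return_address: bool) -> None:
        # Only values identified as return addresses go through the circuit.
        self.stack.append(convert(value) if is_return_address else value)

    def pop(self, is_return_address: bool) -> int:
        value = self.stack.pop()
        return convert(value) if is_return_address else value


path = ControlledDataPath()
path.push(0x1234, is_return_address=False)      # ordinary data, stored as-is
path.push(0x00401A2C, is_return_address=True)   # return address, stored converted
stored = path.stack[-1]
restored = path.pop(is_return_address=True)
data = path.pop(is_return_address=False)
```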
In another possible implementation, the processor is not aware of the presence of the conversion circuit. The conversion circuit is placed on the data path between the processing core and the memory, so the return address passes through the conversion circuit both when it is transferred from the processing core to the memory and when it is transferred from the memory to the processing core. This embodiment only requires a conversion circuit on the data path between the processing core and the memory; the existing control flow of the processor does not need to be changed, so the embodiment is easy to implement.
Optionally, the processor further includes a register, where the register is a register dedicated to storing the return address.
For this type of processor, the register lies on the path that every return address takes between the processing core and the memory. Illustratively, when the return address of a subroutine needs to be output to memory, the processing core first outputs the return address to the register described above, and the register then outputs the return address to memory. When the return address is needed, the return address popped from the memory is likewise output to the register first, and the register then outputs the return address to the processing core.
It will be appreciated that the registers used to store the return address may be different for processors of different instruction sets.
Illustratively, the processor is an ARM instruction set based processor, and the register is the link register (Link Register, LR).
Illustratively, the processor is a RISC-V instruction set based processor, and the register is the return address (Return Address, RA) register.
The structure of the processor and the processing of the return address are described below using a processor of the ARM instruction set as an example. Fig. 4A and 4B are schematic diagrams of a conventional return address processing procedure. Wherein fig. 4A illustrates a process diagram of return address stacking, and fig. 4B illustrates a process diagram of return address popping.
Before the processing core executes the call instruction, the return address corresponding to the call instruction needs to be stored into the memory. Illustratively, as shown in FIG. 4A, the return address output by the PC enters the LR register under control of the processing core. The return address output by the LR register is then stored in the stack in memory.
When the processing core returns from the subroutine corresponding to the call instruction, the return address needs to be read from the memory. Illustratively, as shown in FIG. 4B, the return address that is popped from memory enters the LR register. The LR register then outputs the return address to the processing core.
The conversion circuit in this embodiment may be provided before the register or after the register.
In this embodiment, the conversion circuit is disposed before the register, which means that the conversion circuit is disposed on the write path of the register, that is, the input terminal of the register. When the translation circuit is placed before the register, this means that all return addresses entering the register will pass through the translation circuit before entering the register. In this embodiment, the arrangement of the conversion circuit after the register means that the conversion circuit is arranged on the read-out path of the register, i.e. at the output of the register. When the translation circuit is placed after a register, this means that all return addresses output from that register will pass through the translation circuit.
The two embodiments described above are described below, respectively.
Fig. 5A and fig. 5B are schematic diagrams illustrating a processing procedure of a return address according to an embodiment of the present application. In this embodiment, the conversion circuit is provided on the write path of the LR register. Wherein fig. 5A illustrates a return address stacking process, and fig. 5B illustrates a return address popping process.
The conversion circuit is specifically configured to convert a return address output by the processing core to obtain a converted return address, and output the converted return address to the register, so that the converted return address is output to a stack in the memory through the register.
The translation circuit is further specifically configured to, when the processing core needs to use the return address, perform the translation on the translated return address in the stack to obtain the return address, and output the return address to the register, so that the return address is output to the processing core through the register.
As shown in fig. 5A, since the conversion circuit is disposed on the write path of the register, the return address output from the processing core passes through the conversion circuit before being output to the LR register. The conversion circuit converts the return address output by the processing core to obtain a converted return address, so that what is actually stored in the LR register is the converted return address. The LR register then outputs the converted return address to the stack in memory, i.e., what is actually pushed onto the stack is the converted return address. Because the attacker cannot know the conversion operation performed in the conversion circuit, the attacker cannot replace the converted return address in the memory with the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously modifying the program control flow.
When the processing core needs to use the return address, as shown in fig. 5B, the translated return address from the memory stack passes through the translation circuit before entering the LR register, since the translation circuit is disposed on the write path of the LR register. The translation circuit translates the translated return address to the original return address such that the original return address is actually stored in the LR register. The LR register then outputs the original return address to the processing core. Thereby ensuring the normal control flow of the program.
The following will take an example of a processor based on the ARM instruction set, and illustrate the operation of a section of actual program.
Fig. 6 is a schematic diagram of a program running process according to an embodiment of the present application. As shown in fig. 6, the program includes the following ARM instructions: SUB, STP, ADD, LDP, ADD, and RET.
When the control flow is about to enter a subroutine, i.e., when the processing core executes the STP instruction, the processing core outputs a return address to the LR (x30) register. Before the return address enters the LR register, it passes through the conversion circuit, which converts it into a converted return address, so that what actually enters the LR register is the converted return address. The LR register then outputs the converted return address to the stack in memory.
When the control flow returns at the end of the subroutine, i.e. when the processing core executes the LDP instruction, the converted return address is read from the stack in memory. Before entering the LR (x30) register, the converted return address passes through the conversion circuit, which converts it again to obtain the original return address. The original return address is stored in the LR register and is used by the processing core when executing the RET instruction.
In this embodiment, a hardware conversion circuit is disposed on the write path of the LR register, and the conversion circuit is used to convert the return address entering the LR register, that is, all the inputs required to enter the LR register will be converted by the conversion circuit, and then the conversion result will be input into the LR register. By the method, the return address can be automatically identified, the control flow of the processor is not required to be changed, and the method is easy to implement.
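The write-path arrangement can be modelled with the following Python sketch. It is only an illustration: the class name `WritePathLR`, the sample address, and the choice of a modular multiplication with example parameters are assumptions, and the stack is reduced to a Python list.

```python
P, Q = 2**32, 2**32 - 1  # example parameters (p, q) of a self-inverse conversion


def convert(value: int) -> int:
    """Stands in for the hardware conversion circuit."""
    return (value * Q) % P


class WritePathLR:
    """LR register whose write path contains the conversion circuit."""

    def __init__(self):
        self._value = 0

    def write(self, value: int) -> None:
        self._value = convert(value)  # every write is converted first

    def read(self) -> int:
        return self._value            # reads return the stored value as-is


lr = WritePathLR()
stack = []
return_address = 0x00401A2C  # hypothetical return address

# Call: core -> conversion circuit -> LR -> stack
lr.write(return_address)
stack.append(lr.read())
stored_on_stack = stack[-1]

# Return: stack -> conversion circuit -> LR -> core
lr.write(stack.pop())
restored = lr.read()
```

In this model the register needs no knowledge of whether a value is a return address: every value written to it is converted, which mirrors the point that the control flow of the processor is unchanged.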
Fig. 7A and fig. 7B are schematic diagrams illustrating a processing procedure of a return address according to an embodiment of the present application. In this embodiment, the conversion circuit is provided on the readout path of the LR register. Wherein fig. 7A illustrates a return address stacking process, and fig. 7B illustrates a return address popping process.
The register is used for holding the return address output by the processing core when the processing core outputs the return address. The conversion circuit is specifically configured to convert the return address output by the register to obtain a converted return address, and to output the converted return address to a stack in a memory.
The register is also used for holding the converted return address output from the stack when the processing core needs to use the return address. The conversion circuit is further specifically configured to perform the conversion on the converted return address output by the register to obtain the return address, and to output the return address to the processing core.
As shown in fig. 7A, the return address output by the processing core is output to the LR register. Since the conversion circuit is provided on the readout path of the LR register, the return address passes through the conversion circuit while the LR register outputs the return address to the stack in memory. The conversion circuit converts the return address output by the LR register to obtain a converted return address, so that what is actually pushed onto the stack is the converted return address. Because the attacker cannot know the conversion operation performed in the conversion circuit, the attacker cannot replace the converted return address in the memory with the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously modifying the program control flow.
When the processing core needs to use the return address, the translated return address popped from memory enters the LR register, as shown in fig. 7B. Since the conversion circuit is provided on the readout path of the LR register, the converted return address passes through the conversion circuit during the process of outputting the converted return address to the processing core by the LR register. The translation circuitry translates the translated return address to the original return address such that the actual input to the processing core is the original return address. Thereby ensuring the normal control flow of the program.
In this embodiment, a hardware conversion circuit is provided on the readout path of the LR register, and the conversion circuit is used to convert the return address output from the LR register, that is, all the values output from the LR register are first converted by the conversion circuit. By the method, the return address can be automatically identified, the control flow of the processor is not required to be changed, and the method is easy to implement.
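For comparison, the readout-path arrangement can be modelled in the same hedged way. The class name `ReadPathLR`, the sample address, and the example conversion parameters are assumptions made only for this sketch.

```python
P, Q = 2**32, 2**32 - 1  # example parameters (p, q) of a self-inverse conversion


def convert(value: int) -> int:
    """Stands in for the hardware conversion circuit."""
    return (value * Q) % P


class ReadPathLR:
    """LR register whose readout path contains the conversion circuit."""

    def __init__(self):
        self._value = 0

    def write(self, value: int) -> None:
        self._value = value           # writes are stored unchanged

    def read(self) -> int:
        return convert(self._value)   # every read is converted on the way out


lr = ReadPathLR()
stack = []
return_address = 0x00401A2C  # hypothetical return address

# Call: core -> LR -> conversion circuit -> stack
lr.write(return_address)
stack.append(lr.read())
stored_on_stack = stack[-1]

# Return: stack -> LR -> conversion circuit -> core
lr.write(stack.pop())
restored = lr.read()
```

The value on the stack is the same as in the write-path model; the difference is only whether the register itself holds the original or the converted value.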
It should be noted that, in the above embodiment, the processor based on the ARM instruction set is described as an example. The processor to which embodiments of the present application are applicable is not limited in this regard. The embodiments of the present application are equally applicable to processors of other instruction sets, as long as there are registers in the processor that are dedicated to storing return addresses.
It should be noted that in the above embodiments, for both ARM instruction set based processors and RISC-V instruction set based processors, if the compiler supports a leaf subroutine optimization option (that is, in a leaf subroutine the return address is not saved from the LR/RA register to the stack in memory but is kept in the LR/RA register throughout), this optimization needs to be turned off, so that the LR/RA register is used in a leaf subroutine in exactly the same way as in an ordinary subroutine. Otherwise, the return address is not converted back when it is read from the LR/RA register, which may prevent the subroutine from returning normally.
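The reason can be seen by counting conversions. The following Python sketch assumes the write-path arrangement and example conversion parameters; the function names are hypothetical. An ordinary subroutine converts the return address twice (once on the call and once on the reload from the stack), whereas an optimized leaf subroutine converts it only once.

```python
P, Q = 2**32, 2**32 - 1  # example parameters (p, q) of a self-inverse conversion


def convert(value: int) -> int:
    """Stands in for the hardware conversion circuit."""
    return (value * Q) % P


def ordinary_subroutine_return(return_address: int) -> int:
    in_lr = convert(return_address)  # converted when written to LR on the call
    on_stack = in_lr                 # spilled to the stack unchanged
    return convert(on_stack)         # converted again when reloaded into LR


def optimized_leaf_return(return_address: int) -> int:
    in_lr = convert(return_address)  # converted when written to LR on the call
    return in_lr                     # used directly: only one conversion


ra = 0x00401A2C  # hypothetical return address
ordinary_target = ordinary_subroutine_return(ra)
leaf_target = optimized_leaf_return(ra)
```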
The following describes a specific conversion process of the conversion circuit in each of the above embodiments.
In the embodiment of the application, the conversion of the return address by the conversion circuit meets the following conditions:
$$B = IP(A), \quad A = IP(B)$$
where A is the return address, B is the converted return address, and IP() is the conversion model used for the conversion.
It will be appreciated that there may be a variety of transformation models that meet the above conditions, including but not limited to: exclusive or conversion model, modular multiplication conversion model, modular addition conversion model.
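The condition B = IP(A), A = IP(B) means that each conversion model is its own inverse. The following Python sketch checks this property on a few sample addresses for the three kinds of model named above. The XOR key, the sample addresses, and the function names are hypothetical; the modular parameters are example values.

```python
def xor_model(a: int, key: int = 0x5A5A1234) -> int:
    return a ^ key


def modmul_model(a: int, p: int = 2**32, q: int = 2**32 - 1) -> int:
    return (a * q) % p           # requires q*q = 1 (mod p)


def modadd_model(a: int, p: int = 2**32 + 2, q: int = 2**31 + 1) -> int:
    return (a + q) % p           # requires 2*q = 0 (mod p)


def is_self_inverse(model, samples) -> bool:
    """Check that applying the model twice returns each sample unchanged."""
    return all(model(model(a)) == a for a in samples)


samples = [0x00000000, 0x00401A2C, 0x7FFFFFFC, 0xFFFFFFFC]
results = {
    "xor": is_self_inverse(xor_model, samples),
    "modmul": is_self_inverse(modmul_model, samples),
    "modadd": is_self_inverse(modadd_model, samples),
}
```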
In one possible implementation, a plurality of conversion models are stored in the conversion circuit, and each conversion model corresponds to one or more sets of selectable conversion parameters. Because the conversion models and the conversion parameters are stored in the hardware circuit, an attacker cannot obtain this sensitive information, which makes the protection of the program control flow more reliable.
When the conversion circuit converts the return address, at least one conversion model is adopted to convert at least one bit of the return address, and the converted return address is obtained.
The at least one bit comprises bits corresponding to the changed segment and bits corresponding to the unchanged segment of the code addresses of the program. The code addresses of the program comprise the instruction addresses of a plurality of instructions; the unchanged segment consists of the bit positions whose values are the same in all of these instruction addresses, and the changed segment consists of the bit positions whose values differ among them.
By way of example, taking the code region of libc, comparing the start address and the end address of the program's code addresses shows that, because the number of instructions in the program is limited, several high-order bits of the code address, possibly more than ten or even several tens of bits, do not change. These unchanged bits are referred to herein as the unchanged segment of the code address. Correspondingly, the bits of the code address that do change are referred to as the changed segment of the code address.
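The comparison of the start address and the end address can be sketched as follows. The two addresses are hypothetical and are not taken from any real libc mapping; the function name `split_segments` is likewise illustrative. The high-order bits shared by the start and end addresses are shared by every address in between, so they form the unchanged segment.

```python
def split_segments(start: int, end: int, width: int = 32):
    """Return (unchanged_bits, changed_bits) for code addresses in [start, end]."""
    diff = start ^ end
    changed_bits = diff.bit_length()       # low-order bits that may vary
    unchanged_bits = width - changed_bits  # shared high-order prefix
    return unchanged_bits, changed_bits


# Hypothetical code region boundaries.
code_start = 0xB6E00000
code_end = 0xB6F3FFFF
unchanged, changed = split_segments(code_start, code_end)
```

For these sample boundaries the sketch reports 11 unchanged high-order bits and 21 changed low-order bits.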
In the embodiment of the application, when the conversion circuit converts at least one bit of the return address, the change section and the unchanged section are converted simultaneously, so that the brute force cracking difficulty of an attacker can be improved, and the safety of a program control flow is ensured.
Illustratively, before the program runs, the conversion circuit randomly selects a conversion model and a set of conversion parameters, which are used to convert return addresses while the program runs. If an error occurs while the program is running, that is, when a return-oriented programming (ROP) attack occurs, the conversion model and/or the conversion parameters are replaced when the program is rerun. An attacker therefore cannot attack the same conversion model and conversion parameters repeatedly, which further improves the security of the program control flow.
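A minimal software sketch of this selection policy is given below. The table `MODELS`, its parameter sets, and the function `select_configuration` are hypothetical, and Python's `random` module is used only for illustration; a hardware conversion circuit would rely on its own source of randomness.

```python
import random

# Hypothetical table of conversion models and their selectable parameter sets.
MODELS = {
    "xor": [(0x5A5A1234,), (0x0F0FA5A5,)],
    "modmul": [(2**32, 2**32 - 1), (2**32, 2**31 - 1)],
    "modadd": [(2**32 + 2, 2**31 + 1)],
}


def select_configuration(rng, previous=None):
    """Pick a (model, parameters) pair, avoiding the previous pair if given."""
    choices = [(name, params) for name, sets in MODELS.items() for params in sets]
    if previous in choices and len(choices) > 1:
        choices.remove(previous)
    return rng.choice(choices)


rng = random.Random()
current = select_configuration(rng)                        # before the program runs
after_fault = select_configuration(rng, previous=current)  # after a faulting run
```

Excluding the previous pair guarantees that at least the model or the parameters differ on the rerun.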
The processing of the return address by the processor is described below in connection with several specific embodiments.
In a possible implementation, assuming that the processor is a 32-bit ARM instruction set based processor, a conversion circuit is disposed on the write path of the LR register, and a modular multiplication conversion model is used in the conversion circuit. The processing procedure of the return address by the processor is as follows:
1) The return address is stored in the LR register before the processing core executes a call instruction (e.g., invokes a func function). Since the translation circuit is disposed on the write path of the register, the return address is first translated by the translation circuit to obtain a translated return address, which is actually entered into the LR register.
For convenience of description, the return address output by the processing core (i.e., the original return address) is denoted as a[31:0] in this embodiment. The converted return address produced by the conversion circuit (i.e., the return address actually entering the LR register) is denoted as b[31:0]. The conversion circuit performs the following modular multiplication:
$$a \times q \equiv b \pmod{p}$$
Wherein p and q are conversion parameters corresponding to the modular multiplication conversion model. As can be seen from the above equation, the return address a and the parameter q in the conversion circuit are modulo-multiplied, modulo p, and output as the converted return address b. Wherein q and p satisfy the following relationship:
$$q^2 \equiv 1 \pmod{p}$$
It will be appreciated that many combinations of p and q satisfy the above relationship. For example, if $p = 2^{32}$, q may be selected to be any of:
4294967295, 2147483649, 2147483647
In practical applications, p should be larger than the largest code address, but not excessively large, in order to save the chip area of the conversion circuit. In addition, q = p-1 and q = 1 always satisfy the above relationship, but q = 1 should not be selected for security reasons, because the conversion would then leave the return address unchanged.
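The listed values of q can be checked against the relationship with a few lines of Python. The sample address below is hypothetical. The sketch also records, for each q, whether the conversion leaves that particular even address unchanged, which may be worth checking when choosing parameters for aligned return addresses.

```python
P = 2**32
CANDIDATES = [4294967295, 2147483649, 2147483647]


def is_valid_q(q: int, p: int = P) -> bool:
    """Check the relationship q*q = 1 (mod p)."""
    return (q * q) % p == 1


valid = [is_valid_q(q) for q in CANDIDATES]

# Hypothetical, 4-byte-aligned return address.
aligned_address = 0x00401A2C
leaves_unchanged = [(aligned_address * q) % P == aligned_address for q in CANDIDATES]
```

In this sketch all three candidates satisfy the relationship. For the sample address, q = 2147483649 returns the address unchanged, because 2^31 times an even number is a multiple of 2^32; the other two candidates do change it.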
In this embodiment, the modular multiplication conversion model operates on the 32-bit return address a[31:0] as a whole, that is, the changed segment and the unchanged segment of the code address are processed together. An attacker therefore cannot target only the changed segment, and cannot be sure that each brute-force attempt makes the program jump into the code area, which improves the security of the program control flow.
2) The translated return address is read from the LR register and stored in the stack of memory.
3) The function func is executed.
4) Before returning from the function func, the converted return address is read from memory, and the popped converted return address is stored in the LR register. Because the conversion circuit is on the write path of the LR register, the popped converted return address first passes through the conversion circuit. The conversion circuit converts the converted return address once more to obtain the original return address, which is then stored in the LR register. The conversion circuit performs the same modular multiplication:
$$b \times q \equiv a \pmod{p}$$
Wherein, p and q are conversion parameters corresponding to the modular multiplication conversion model, and are the same as the parameters adopted in the first conversion process. The translated return address b is the input of the translation model and the return address a is the output of the translation model.
As can be seen from the two conversion processes, the return address is converted back to the original value after the two identical conversions.
5) A return operation is performed.
If the converted return address b has not been maliciously tampered with in the stack in memory, the return address a obtained after conversion by the conversion circuit is the correct return address, and the program executes normally. If the converted return address b has been maliciously tampered with in the stack in memory, the attacker, not knowing the sensitive information p and q, cannot construct a valid malicious converted return address. After conversion by the conversion circuit, the obtained return address a is therefore a garbage value; when the return operation is performed, the program attempts to execute code at this garbage address, which will most likely cause a memory error. Thus, the attacker cannot achieve the goal of changing the program control flow.
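The two cases above can be sketched in Python. The parameters p and q are example values, and the return address and the attacker's target address are hypothetical. The attacker is assumed to overwrite the stack slot with the raw address of the code it wants to reach, since it does not know p and q.

```python
P, Q = 2**32, 2**32 - 1  # example parameters, unknown to the attacker


def convert(value: int) -> int:
    """Modular multiplication performed by the conversion circuit."""
    return (value * Q) % P


return_address = 0x00401A2C          # hypothetical original return address a
on_stack = convert(return_address)   # converted return address b on the stack

# Case 1: the stack slot is untouched, so the second conversion restores a.
restored = convert(on_stack)

# Case 2: the attacker overwrites the slot with a raw target address.
attacker_target = 0x00405000         # hypothetical address the attacker wants
actual_jump = convert(attacker_target)  # what the processing core receives
```

In the second case the processing core receives an address other than the one the attacker intended.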
In another possible implementation, assuming the processor is a 32-bit ARM instruction set based processor, a conversion circuit is disposed on the readout path of the LR register, and a modulo addition conversion model is employed in the conversion circuit. The processing procedure of the return address by the processor is as follows:
1) The return address is stored in the LR register before the processing core executes a call instruction (e.g., invokes a func function).
2) The return address is read from the LR register and stored in the stack of memory.
Because the conversion circuit is arranged on the readout path of the LR register, the return address output by the LR register is converted by the conversion circuit to obtain a converted return address, so that what is actually pushed onto the stack is the converted return address.
For convenience of description, the return address output by the LR register (i.e., the original return address) is denoted as a[31:0] in this embodiment. The converted return address produced by the conversion circuit (i.e., the return address actually pushed onto the stack) is denoted as b[31:0]. The conversion circuit performs the following modulo addition:
$$a + q \equiv b \pmod{p}$$
where p and q are the conversion parameters corresponding to the modulo addition conversion model. As can be seen from the above equation, the return address a and the parameter q in the conversion circuit are added modulo p, and the result is output as the converted return address b. Here q and p satisfy the following relationship:
$$q \equiv p/2$$
It will be appreciated that many combinations of p and q satisfy the above relationship. For example, if $p = 2^{32} + 2$, then $q = 2^{31} + 1$.
In practical applications, p should be larger than the largest code address but should not be too large to save the chip area of the conversion circuit.
In this embodiment, the modulo addition conversion model operates on the 32-bit return address a[31:0] as a whole, that is, the changed segment and the unchanged segment of the code address are processed together. An attacker therefore cannot target only the changed segment, and cannot be sure that each brute-force attempt makes the program jump into the code area, which improves the security of the program control flow.
3) The function func is executed.
4) Before returning from the function func, the translated return address is read from memory and the popped translated return address is stored in the LR register.
5) The translated return address is read from the LR register and output to the processing core for execution of the return operation.
Because the conversion circuit is on the readout path of the LR register, the popped converted return address passes through the conversion circuit when it is read out of the LR register. The conversion circuit converts the converted return address once more to obtain the original return address, which is then output to the processing core. The conversion circuit performs the same modulo addition:
$$b + q \equiv a \pmod{p}$$
where p and q are the conversion parameters corresponding to the modulo addition conversion model and are the same as the parameters used in the first conversion. The converted return address b is the input of the conversion model, and the return address a is the output of the conversion model.
As can be seen from the two conversion processes, the return address is converted back to the original value after the two identical conversions.
If the converted return address b has not been maliciously tampered with in the stack in memory, the return address a obtained after conversion by the conversion circuit is the correct return address, and the program executes normally. If the converted return address b has been maliciously tampered with in the stack in memory, the attacker, not knowing the sensitive information p and q, cannot construct a valid malicious converted return address. After conversion by the conversion circuit, the obtained return address a is therefore a garbage value; when the return operation is performed, the program attempts to execute code at this garbage address, which will most likely cause a memory error. Thus, the attacker cannot achieve the goal of changing the program control flow.
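The arithmetic of the modulo addition model can be worked through with the example parameters given above, p = 2^32 + 2 and q = 2^31 + 1. The return address in this Python sketch is hypothetical.

```python
P = 2**32 + 2   # example parameter p
Q = 2**31 + 1   # example parameter q, equal to p / 2


def convert(value: int) -> int:
    """Modulo addition performed by the conversion circuit."""
    return (value + Q) % P


return_address = 0x00401A2C         # hypothetical original return address a
on_stack = convert(return_address)  # converted return address b
restored = convert(on_stack)        # second conversion restores a

# Adding q twice adds p in total, which vanishes modulo p.
double_q_is_zero = (2 * Q) % P == 0
```

For this sample address the converted value is 0x80401A2D, and the second conversion yields 0x00401A2C again.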
In a further possible embodiment, the conversion circuit divides the bits of the return address into a plurality of bit groups; converts the bits in each bit group using a conversion model to obtain a conversion result corresponding to each bit group, wherein at least two of the bit groups use different conversion models, or all of the bit groups use the same conversion model; and then obtains the converted return address from the conversion results corresponding to the bit groups.
It will be appreciated that there may be a variety of ways to group at least one bit of the return address, and this embodiment is not particularly limited. Taking a 32-bit processor system as an example, the 32 bits of the return address may be divided into two bit groups, or may be divided into three bit groups, or may be divided into more bit groups, of course.
By way of example, when divided into two groups of bits, the first 16 bits may be grouped together and the last 16 bits may be grouped together; the first 8 bits may be used as a group, and the last 24 bits may be used as a group; it is also possible to group odd bits into one group and even bits into one group. It will be appreciated that other groupings exist and are not listed here. When divided into two bit groups, the conversion models employed by the two bit groups may be the same or different. For example: both bit groups adopt a modular multiplication conversion model, or both bit groups adopt a modular addition conversion model, or one bit group adopts a modular multiplication conversion model, and one bit group adopts a modular addition conversion model.
By way of example, when divided into three bit groups, the first 8 bits may form one group, the middle 16 bits one group, and the last 8 bits one group; or the first 10 bits may form one group, the middle 8 bits one group, and the last 14 bits one group; it is also possible to form one group from bits 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, a second group from bits 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, and a third group from bits 0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30. It will be appreciated that other groupings exist and are not listed here. When divided into three bit groups, the conversion models employed by the three bit groups may be the same or different.
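The grouping options above can be sketched in Python as a generic split and merge over lists of bit positions. The helper names `split_bits` and `merge_bits`, the packing order inside a group (first listed position becomes bit 0), and the sample address are assumptions made for this illustration; the description does not prescribe them.

```python
from typing import List, Sequence

def split_bits(value: int, groups: Sequence[Sequence[int]]) -> List[int]:
    """Collect the bits of `value` at the positions listed in each group.

    Within a group, the first listed position becomes bit 0 of the packed result.
    """
    parts = []
    for positions in groups:
        packed = 0
        for i, pos in enumerate(positions):
            packed |= ((value >> pos) & 1) << i
        parts.append(packed)
    return parts

def merge_bits(parts: Sequence[int], groups: Sequence[Sequence[int]]) -> int:
    """Inverse of split_bits: scatter each group's bits back to their positions."""
    value = 0
    for packed, positions in zip(parts, groups):
        for i, pos in enumerate(positions):
            value |= ((packed >> i) & 1) << pos
    return value

# Two contiguous groups: low 16 bits and high 16 bits.
GROUPS_HALVES = [list(range(0, 16)), list(range(16, 32))]

# Three interleaved groups: bits 0,3,...,30; bits 1,4,...,31; bits 2,5,...,29.
GROUPS_MOD3 = [list(range(start, 32, 3)) for start in range(3)]

address = 0xDEADBEEF                      # sample 32-bit value (assumed)
halves = split_bits(address, GROUPS_HALVES)
thirds = split_bits(address, GROUPS_MOD3)
```

Each group could then be passed through its own conversion model before the results are combined.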
The following is described in connection with examples.
Assume the processor is a 32-bit ARM instruction set based processor and the conversion circuit is placed on the write path of the LR register. The conversion circuit divides the 32 bits of the return address into two bit groups, wherein one bit group comprises the bits at the odd bit positions of the return address, and the other bit group comprises the bits at the even bit positions of the return address. One of the bit groups uses a modular multiplication conversion model and the other bit group uses a modular addition conversion model. The processor handles the return address as follows:
1) The return address is stored in the LR register before the processing core executes a call instruction (e.g., invokes a func function). Since the translation circuit is disposed on the write path of the register, the return address is first translated by the translation circuit to obtain a translated return address, which is actually entered into the LR register.
For convenience of description, the return address output by the processing core (i.e., the original return address) is denoted as a[31:0] in this embodiment. a[31:0] is divided into two bit groups, a1[15:0] and a2[15:0], respectively, wherein,
a1[15:0]≡{a[31],a[29],a[27],…,a[1]}
a2[15:0]≡{a[30],a[28],a[26],…,a[0]}
Accordingly, the converted return address produced by the conversion circuit (i.e., the return address actually entering the LR register) is denoted b[31:0]. b[31:0] is divided into two bit groups, b1[15:0] and b2[15:0], respectively, wherein,
b1[15:0]≡{b[31],b[29],b[27],…,b[1]}
b2[15:0]≡{b[30],b[28],b[26],…,b[0]}
When converting the return address, the conversion circuit applies a modular multiplication conversion model to a1[15:0] and a modular addition conversion model to a2[15:0], as follows:
a1 × q1 ≡ b1 mod p1
a2 + q2 ≡ b2 mod p2
Wherein p1 and q1 are the conversion parameters corresponding to the modular multiplication conversion model, and p2 and q2 are the conversion parameters corresponding to the modular addition conversion model. As can be seen from the above equations, a1 is modularly multiplied by the parameter q1 in the conversion circuit with modulus p1, and the output is b1; a2 is modularly added to the parameter q2 in the conversion circuit with modulus p2, and the output is b2.
Wherein p1 and q1 satisfy the following relationship:
q1^2 ≡ 1 mod p1
It will be appreciated that there may be a variety of combinations of p1 and q1 which satisfy the above relationship. For example: if p1 = 2^16 + 8, then q1 may be selected as any one of:
65543,60083,54619,49159,…,5461
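As a quick check of the example above, the following Python sketch enumerates every value whose square is congruent to 1 modulo p1 = 2^16 + 8 and confirms that the values quoted in the text are among them. The variable names are illustrative only; the brute-force search is simply one convenient way to find such values and is not taken from the description.

```python
P1 = 2**16 + 8   # modulus from the example above (65544)

# Every q in [1, P1) whose square is congruent to 1 modulo P1.
candidates = [q for q in range(1, P1) if (q * q) % P1 == 1]

# Values quoted in the text (the elided middle of the list is not filled in).
listed = [65543, 60083, 54619, 49159, 5461]
all_listed_valid = all(q in candidates for q in listed)

# With such a q1, multiplying twice restores any value below P1.
q1 = 5461
sample = 0x1234
round_trip = (((sample * q1) % P1) * q1) % P1
```

Note that q = 1 also satisfies the relation but would leave the group unchanged, so a practical choice would presumably exclude it.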
p2 and q2 satisfy the following relationship:
q2 ≡ p2/2
It will be appreciated that there may be a variety of combinations of p2 and q2 which satisfy the above relationship. For example: if p2 = 2^32 + 6, then q2 = 2^31 + 3.
In practice, p1 and p2 should be larger than the maximum code address, but should not be too large, so as to save chip area for the conversion circuit.
In this embodiment, the 32 bits of the return address a are divided into two groups according to odd and even bit positions; one group uses a modular multiplication conversion model, and the other group uses a modular addition conversion model. The variable section and the invariable section of the code address are therefore operated on simultaneously, so an attacker cannot mount a targeted attack on the variable section alone and cannot guarantee that each brute-force attempt will jump into the code area, which improves the security of the program control flow.
2) The translated return address is read from the LR register and stored in the stack of memory.
3) The function func is executed.
4) Before returning from the function func, the converted return address is read from memory and the popped converted return address is stored in the LR register. Because the LR register has a conversion circuit on its write path, the popped converted return address first passes through the conversion circuit. The conversion circuit converts it once more to obtain the original return address, which is then stored in the LR register.
When converting the converted return address, the conversion circuit applies a modular multiplication conversion model to b1[15:0] and a modular addition conversion model to b2[15:0], as follows:
b1 × q1 ≡ a1 mod p1
b2 + q2 ≡ a2 mod p2
Wherein p1 and q1 are the conversion parameters corresponding to the modular multiplication conversion model, and are the same as the parameters used in the first conversion process. p2 and q2 are the conversion parameters corresponding to the modular addition conversion model, and are the same as the parameters used in the first conversion process.
As can be seen from the two conversion processes, the return address is converted back to the original value after the two identical conversions.
5) A return operation is performed.
If the converted return address b has not been maliciously tampered with in the stack in memory, the return address a obtained after conversion by the conversion circuit is the correct return address, and the program executes normally. If the converted return address b has been maliciously tampered with in the stack, the attacker, not knowing the sensitive information p1, q1, p2 and q2, cannot construct a valid malicious converted return address. After conversion by the conversion circuit, the obtained return address a is therefore a garbage value; when the return operation is executed, the program jumps to that garbage address, which will cause a memory error with high probability. Thus, the attacker cannot achieve the goal of changing the program control flow.
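The two-group embodiment can be sketched end to end in Python under a few stated assumptions. The parameters are the example values from the text (p1 = 2^16 + 8 with q1 = 5461, and p2 = 2^32 + 6 with q2 = 2^31 + 3); the helper names and sample addresses are hypothetical. Because a converted group can be wider than 16 bits with these example moduli, the sketch keeps the two converted groups as a pair instead of packing them back into a 32-bit word; how the hardware packs them is not specified here.

```python
from typing import Tuple

# Example parameters from the text; in hardware they would be kept secret.
P1, Q1 = 2**16 + 8, 5461          # modular multiplication: Q1 * Q1 ≡ 1 (mod P1)
P2, Q2 = 2**32 + 6, 2**31 + 3     # modular addition: 2 * Q2 ≡ 0 (mod P2)

def split_odd_even(a: int) -> Tuple[int, int]:
    """Return (a1, a2): a1 packs bits 31,29,...,1 and a2 packs bits 30,28,...,0."""
    a1 = a2 = 0
    for k in range(16):
        a2 |= ((a >> (2 * k)) & 1) << k        # even position 2k   -> a2[k]
        a1 |= ((a >> (2 * k + 1)) & 1) << k    # odd position 2k+1  -> a1[k]
    return a1, a2

def convert(groups: Tuple[int, int]) -> Tuple[int, int]:
    """Apply the same conversion that is used on both the call and the return path."""
    g1, g2 = groups
    return (g1 * Q1) % P1, (g2 + Q2) % P2

a_groups = split_odd_even(0x08001235)      # sample return address (assumed)
b_groups = convert(a_groups)               # what is saved on the call path
restored = convert(b_groups)               # what comes back on the return path

# An attacker writes the groups of a raw target address without knowing the parameters.
forged = split_odd_even(0x08005679)
after_forgery = convert(forged)
```

`restored` equals `a_groups`, while `after_forgery` differs from `forged`, so a raw address written by the attacker is not what reaches the processing core.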
Fig. 8 is a flow chart of a processing method of a return address according to an embodiment of the present application. The method of the present embodiment is performed by a processor, wherein the processor comprises: a processing core and a conversion circuit. As shown in fig. 8, the method of the present embodiment includes:
S801: when the processing core outputs the return address, the return address is translated by the translation circuit to obtain a translated return address, and the translated return address is output to a stack in memory.
S802: when the return address needs to be used, the converted return address in the stack is converted by the conversion circuit to obtain the return address, and the return address is output to the processing core.
In a possible implementation, the processor further includes a register. Converting the return address by the conversion circuit to obtain a converted return address, and outputting the converted return address to a stack in memory, includes:
converting the return address by the conversion circuit to obtain a converted return address, and outputting the converted return address to the register, so that the converted return address is output to the stack in memory via the register.
Converting the converted return address in the stack by the conversion circuit to obtain the return address, and outputting the return address to the processing core, includes:
converting the converted return address in the stack by the conversion circuit to obtain the return address, and outputting the return address to the register, so that the return address is output to the processing core via the register.
In a possible implementation, the processor further includes a register, and the return address is output to the register by the processing core. Converting the return address by the conversion circuit to obtain a converted return address, and outputting the converted return address to a stack in memory, includes:
converting, by the conversion circuit, the return address output by the register to obtain a converted return address, and outputting the converted return address to the stack in memory.
Converting the converted return address in the stack by the conversion circuit to obtain the return address, and outputting the return address to the processing core, includes:
converting, by the conversion circuit, the converted return address output by the register to obtain the return address, and outputting the return address to the processing core.
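The difference between the two register arrangements can be shown with a small Python model. This is only a sketch under stated assumptions: `convert` stands in for the conversion circuit and uses a hypothetical modular addition model with Q = P/2; the class names, the `lr` and `stack` attributes, and the simplification that every call immediately saves the register to the stack are all illustrative and not taken from the description.

```python
# Hypothetical conversion parameters (modular addition model with Q = P/2).
P = 2**32 + 6
Q = P // 2

def convert(value: int) -> int:
    """Stand-in for the conversion circuit; applying it twice restores the input."""
    return (value + Q) % P

class WritePathProcessor:
    """Conversion circuit on the register's write path (first implementation)."""

    def __init__(self) -> None:
        self.lr = 0          # models the return-address register
        self.stack = []      # models the stack in memory

    def call(self, return_address: int) -> None:
        self.lr = convert(return_address)    # converted before entering the register
        self.stack.append(self.lr)           # register content is pushed unchanged

    def ret(self) -> int:
        self.lr = convert(self.stack.pop())  # converted again on the way into the register
        return self.lr                       # processing core receives the original

class ReadPathProcessor:
    """Conversion circuit on the register's read path (second implementation)."""

    def __init__(self) -> None:
        self.lr = 0
        self.stack = []

    def call(self, return_address: int) -> None:
        self.lr = return_address             # register holds the original address
        self.stack.append(convert(self.lr))  # converted on the way out to the stack

    def ret(self) -> int:
        self.lr = self.stack.pop()           # register holds the converted address
        return convert(self.lr)              # converted on the way out to the core
```

In both models the stack only ever holds converted values and `ret()` yields the original address; they differ only in whether the register itself holds the original or the converted value between conversions.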
In a possible embodiment, the conversion satisfies the following condition:
B=IP(A),A=IP(B)
wherein A is the return address, B is the translated return address, and IP() is the translation model employed by the translation.
In a possible implementation manner, the converting the return address to obtain a converted return address includes:
and converting at least one bit of the return address by adopting at least one conversion model to obtain the converted return address.
In a possible implementation manner, the converting at least one bit of the return address by using at least one conversion model to obtain the converted return address includes:
grouping at least one bit of the return address to obtain a plurality of bit groups;
Converting the bits in each bit group by adopting the conversion model to obtain a conversion result corresponding to each bit group, wherein the conversion models adopted by at least two bit groups in the plurality of bit groups are different or the conversion models adopted by the plurality of bit groups are the same;
and obtaining the conversion return address according to the conversion result corresponding to each bit group.
In a possible implementation manner, the number of the bit groups is two, wherein one bit group includes bits corresponding to odd bits of the return address, and the other bit group includes bits corresponding to even bits of the return address.
In a possible implementation manner, the types of the conversion models include: a modular multiplication conversion model and a modular addition conversion model.
In a possible implementation, the register is a register for storing a return address.
In a possible implementation, the processor is an ARM instruction set based processor, and the register is an LR register.
In a possible implementation, the processor is a RISC-V instruction set based processor, and the register is an RA register.
The processing method of the return address provided in this embodiment may be applied to the processor described in any of the foregoing embodiments, and its implementation principle and technical effects are similar, and will not be repeated here.
An embodiment of the application also provides an electronic device comprising a processor. The processor may adopt the structure of the processor in any of the above embodiments; its implementation principle and technical effects are similar and are not repeated here.
An embodiment of the application also provides a chip comprising a processor. The processor may adopt the structure of the processor in any of the above embodiments; its implementation principle and technical effects are similar and are not repeated here.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple modules may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present application may be integrated in one processing unit, or each module may exist alone physically, or two or more modules may be integrated in one unit. The units formed by the modules can be realized in a form of hardware or a form of hardware and software functional units.
The integrated modules, when implemented in the form of software functional modules, may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods according to the embodiments of the application.
It should be understood that the above processor may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in connection with the present application may be executed directly by a hardware processor, or by a combination of hardware and software modules in a processor.
The memory may comprise high-speed RAM, and may further comprise non-volatile memory (NVM), such as at least one magnetic disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, an optical disk, or the like.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, among others. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or to one type of bus.
The storage medium may be implemented by any type of volatile or nonvolatile memory device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). It is also possible that the processor and the storage medium reside as discrete components in an electronic device or a master device.
Claims (37)
1. A processor, characterized by comprising: a processing core, a conversion circuit, and a register;
The processing core is used for outputting a return address;
The conversion circuit is used for converting the return address output by the processing core to obtain a converted return address and outputting the converted return address to a stack in a memory;
The conversion circuit is further configured to, when the processing core needs to use the return address, perform the conversion on the converted return address in the stack to obtain the return address, and output the return address to the processing core;
the conversion circuit is specifically configured to convert a return address output by the processing core to obtain a converted return address, and output the converted return address to the register, so that the converted return address is output to a stack in a memory through the register;
The translation circuit is further specifically configured to, when the processing core needs to use the return address, perform the translation on the translated return address in the stack to obtain the return address, and output the return address to the register, so that the return address is output to the processing core through the register.
2. The processor of claim 1, wherein the conversion satisfies the following condition:
B=IP(A),A=IP(B)
wherein A is the return address, B is the translated return address, and IP() is the translation model employed by the translation.
3. The processor of claim 2, wherein the conversion circuit is specifically configured to:
and converting at least one bit of the return address by adopting at least one conversion model to obtain the converted return address.
4. A processor according to claim 3, wherein the conversion circuit is specifically configured to:
grouping at least one bit of the return address to obtain a plurality of bit groups;
Converting the bits in each bit group by adopting the conversion model to obtain a conversion result corresponding to each bit group, wherein the conversion models adopted by at least two bit groups in the plurality of bit groups are different or the conversion models adopted by the plurality of bit groups are the same;
and obtaining the conversion return address according to the conversion result corresponding to each bit group.
5. The processor of claim 4, wherein the number of bit groups is two, wherein one bit group includes bits corresponding to odd bits of the return address and the other bit group includes bits corresponding to even bits of the return address.
6. The processor according to any one of claims 2 to 5, wherein the kinds of conversion models include: a modular multiplication conversion model and a modular addition conversion model.
7. A processor according to any one of claims 2 to 5, wherein the register is a register for storing a return address.
8. The processor of claim 7, wherein the processor is an ARM instruction set based processor and the register is an LR register.
9. The processor of claim 7, wherein the processor is a RISC-V instruction set based processor and the register is an RA register.
10. A processor, characterized by comprising: a processing core, a conversion circuit, and a register;
The processing core is used for outputting a return address;
The conversion circuit is used for converting the return address output by the processing core to obtain a converted return address and outputting the converted return address to a stack in a memory;
The conversion circuit is further configured to, when the processing core needs to use the return address, perform the conversion on the converted return address in the stack to obtain the return address, and output the return address to the processing core;
the register is used for registering the return address output by the processing core when the processing core outputs the return address;
The conversion circuit is specifically configured to convert the return address output by the register to obtain a converted return address, and output the converted return address to a stack in a memory;
The register is further configured to register the translated return address of the stack output when the processing core needs to use the return address;
The conversion circuit is further specifically configured to perform the conversion on the converted return address output by the register, obtain the return address, and output the return address to the processing core.
11. The processor of claim 10, wherein the conversion satisfies the following condition:
B=IP(A),A=IP(B)
wherein A is the return address, B is the translated return address, and IP() is the translation model employed by the translation.
12. The processor of claim 11, wherein the conversion circuit is specifically configured to:
and converting at least one bit of the return address by adopting at least one conversion model to obtain the converted return address.
13. The processor of claim 12, wherein the conversion circuit is specifically configured to:
grouping at least one bit of the return address to obtain a plurality of bit groups;
Converting the bits in each bit group by adopting the conversion model to obtain a conversion result corresponding to each bit group, wherein the conversion models adopted by at least two bit groups in the plurality of bit groups are different or the conversion models adopted by the plurality of bit groups are the same;
and obtaining the conversion return address according to the conversion result corresponding to each bit group.
14. The processor of claim 13, wherein the number of bit groups is two, wherein one bit group includes bits corresponding to odd bits of the return address and the other bit group includes bits corresponding to even bits of the return address.
15. The processor according to any one of claims 11 to 13, wherein the kinds of conversion models include: a modular multiplication conversion model and a modular addition conversion model.
16. A processor according to any one of claims 11 to 13, wherein the register is a register for storing a return address.
17. The processor of claim 16, wherein the processor is an ARM instruction set based processor and the register is an LR register.
18. The processor of claim 16, wherein the processor is a RISC-V instruction set based processor and the register is an RA register.
19. A method of processing a return address, applied to a processor, the processor comprising: a processing core and a translation circuit, the method comprising:
Converting the return address by the conversion circuit to obtain a converted return address and outputting the converted return address to a stack in a memory when the processing core outputs the return address;
when the return address is required to be used, the conversion circuit is used for carrying out the conversion on the converted return address in the stack to obtain the return address, and the return address is output to the processing core;
the processor further includes a register, and said converting the return address by the conversion circuit to obtain a converted return address and outputting the converted return address to a stack in memory comprises:
converting the return address by the converting circuit to obtain a converted return address, and outputting the converted return address to the register so that the converted return address is output to a stack in a memory via the register;
said converting, by said converting circuit, said translated return address in said stack to said return address and outputting said return address to said processing core, comprising:
The translation is performed on the translated return address in the stack by the translation circuit to obtain the return address, and the return address is output to the register to cause the return address to be output to the processing core via the register.
20. The method of claim 19, wherein the conversion satisfies the following condition:
B=IP(A),A=IP(B)
wherein A is the return address, B is the translated return address, and IP() is the translation model employed by the translation.
21. The method of claim 20, wherein translating the return address to obtain a translated return address comprises:
and converting at least one bit of the return address by adopting at least one conversion model to obtain the converted return address.
22. The method of claim 21, wherein said translating at least one bit of said return address using at least one translation model to obtain said translated return address comprises:
grouping at least one bit of the return address to obtain a plurality of bit groups;
Converting the bits in each bit group by adopting the conversion model to obtain a conversion result corresponding to each bit group, wherein the conversion models adopted by at least two bit groups in the plurality of bit groups are different or the conversion models adopted by the plurality of bit groups are the same;
and obtaining the conversion return address according to the conversion result corresponding to each bit group.
23. The method of claim 22, wherein the number of bit groups is two, wherein one bit group includes bits corresponding to odd bits of the return address and the other bit group includes bits corresponding to even bits of the return address.
24. The method according to any one of claims 20 to 23, wherein the kinds of conversion models include: a modular multiplication conversion model and a modular addition conversion model.
25. A method according to any one of claims 20 to 23, wherein the register is a register for storing a return address.
26. The method of claim 25, wherein the processor is an ARM instruction set based processor and the register is an LR register.
27. The method of claim 25, wherein the processor is a RISC-V instruction set based processor and the register is an RA register.
28. A method of processing a return address, applied to a processor, the processor comprising: a processing core and a translation circuit, the method comprising:
Converting the return address by the conversion circuit to obtain a converted return address and outputting the converted return address to a stack in a memory when the processing core outputs the return address;
when the return address is required to be used, the conversion circuit is used for carrying out the conversion on the converted return address in the stack to obtain the return address, and the return address is output to the processing core;
The processor further includes a register to which the return address is output by the processing core, the translating the return address by the translation circuitry to obtain a translated return address, and outputting the translated return address to a stack in memory, comprising:
Converting the return address output by the register by the conversion circuit to obtain a converted return address, and outputting the converted return address to a stack in a memory;
said converting, by said converting circuit, said translated return address in said stack to said return address and outputting said return address to said processing core, comprising:
And converting the converted return address output by the register through the conversion circuit to obtain the return address, and outputting the return address to the processing core.
29. The method of claim 28, wherein the converting satisfies the condition:
B=IP(A),A=IP(B)
wherein A is the return address, B is the translated return address, and IP() is the translation model employed by the translation.
30. The method of claim 29, wherein translating the return address to obtain a translated return address comprises:
and converting at least one bit of the return address by adopting at least one conversion model to obtain the converted return address.
31. The method of claim 30, wherein said converting at least one bit of said return address using at least one conversion model to obtain said converted return address comprises:
grouping at least one bit of the return address to obtain a plurality of bit groups;
Converting the bits in each bit group by adopting the conversion model to obtain a conversion result corresponding to each bit group, wherein the conversion models adopted by at least two bit groups in the plurality of bit groups are different or the conversion models adopted by the plurality of bit groups are the same;
and obtaining the conversion return address according to the conversion result corresponding to each bit group.
32. The method of claim 31, wherein the number of bit groups is two, wherein one bit group includes bits corresponding to odd bits of the return address and the other bit group includes bits corresponding to even bits of the return address.
33. The method according to any one of claims 29 to 32, wherein the kinds of conversion models include: a modular multiplication conversion model and a modular addition conversion model.
34. A method according to any one of claims 29 to 32, wherein the register is a register for storing a return address.
35. The method of claim 34, wherein the processor is an ARM instruction set based processor and the register is an LR register.
36. The method of claim 34, wherein the processor is a RISC-V instruction set based processor and the register is an RA register.
37. An electronic device comprising a processor as claimed in any one of claims 1 to 9 or 10 to 18.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910586325.5A CN112181491B (en) | 2019-07-01 | 2019-07-01 | Processor and return address processing method |
PCT/CN2020/099168 WO2021000847A1 (en) | 2019-07-01 | 2020-06-30 | Processor and return address processing method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910586325.5A CN112181491B (en) | 2019-07-01 | 2019-07-01 | Processor and return address processing method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112181491A (en) | 2021-01-05 |
CN112181491B (en) | 2024-09-24 |
Family
ID=73915579
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910586325.5A Active CN112181491B (en) | 2019-07-01 | 2019-07-01 | Processor and return address processing method |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112181491B (en) |
WO (1) | WO2021000847A1 (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107710151A (en) * | 2015-06-24 | 2018-02-16 | 英特尔公司 | Technologies for shadow stack manipulation for binary translation systems |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7086088B2 (en) * | 2002-05-15 | 2006-08-01 | Nokia, Inc. | Preventing stack buffer overflow attacks |
US20140173290A1 (en) * | 2012-12-17 | 2014-06-19 | Advanced Micro Devices, Inc. | Return address tracking mechanism |
US9037872B2 (en) * | 2012-12-17 | 2015-05-19 | Advanced Micro Devices, Inc. | Hardware based return pointer encryption |
US9665374B2 (en) * | 2014-12-18 | 2017-05-30 | Intel Corporation | Binary translation mechanism |
DE102015113468A1 (en) * | 2015-08-14 | 2017-02-16 | Infineon Technologies Ag | DATA PROCESSING DEVICE AND METHOD FOR SECURING A DATA PROCESSING AGAINST ATTACKS |
US20170090927A1 (en) * | 2015-09-30 | 2017-03-30 | Paul Caprioli | Control transfer instructions indicating intent to call or return |
US10289842B2 (en) * | 2015-11-12 | 2019-05-14 | Samsung Electronics Co., Ltd. | Method and apparatus for protecting kernel control-flow integrity using static binary instrumentation |
CN106022166B (en) * | 2016-06-02 | 2018-10-23 | 东北大学 | Code reuse attack defense system and method |
CN109409085A (en) * | 2018-09-21 | 2019-03-01 | 中国科学院信息工程研究所 | Method and device for handling tampering of a return address in the stack |
CN109361507B (en) * | 2018-10-11 | 2021-11-02 | 杭州华澜微电子股份有限公司 | Data encryption method and encryption equipment |
CN109858253B (en) * | 2019-01-08 | 2021-04-20 | 中国人民解放军战略支援部队信息工程大学 | LBR-based stack buffer overflow attack defense method |
2019
- 2019-07-01 CN CN201910586325.5A patent/CN112181491B/en active Active
2020
- 2020-06-30 WO PCT/CN2020/099168 patent/WO2021000847A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN112181491A (en) | 2021-01-05 |
WO2021000847A1 (en) | 2021-01-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11784786B2 (en) | Mitigating security vulnerabilities with memory allocation markers in cryptographic computing systems | |
US9165138B2 (en) | Mitigation of function pointer overwrite attacks | |
EP3326105B1 (en) | Technologies for secure programming of a cryptographic engine for secure i/o | |
CN110945509B (en) | Apparatus and method for controlling access to data in a protected memory region | |
US8583880B2 (en) | Method for secure data reading and data handling system | |
US10678707B2 (en) | Data processing device and method for cryptographic processing of data | |
US11232194B2 (en) | Method for executing a binary code of a secure function with a microprocessor | |
EP2310976A1 (en) | Secure memory management system and method | |
CN113673002B (en) | Memory overflow defense method based on pointer encryption mechanism and RISC-V coprocessor | |
US10572666B2 (en) | Return-oriented programming mitigation | |
US7774587B2 (en) | Dynamic redundancy checker against fault injection | |
EP3454216B1 (en) | Method for protecting unauthorized data access from a memory | |
US20070083770A1 (en) | System and method for foiling code-injection attacks in a computing device | |
US20080133858A1 (en) | Secure Bit | |
US20210342486A1 (en) | Encrypted data processing | |
CN112181491B (en) | Processor and return address processing method | |
US20230281305A1 (en) | Method for protecting against side-channel attacks | |
JP2023065323A (en) | Computer-implemented method, system and computer program | |
US11677541B2 (en) | Method and device for secure code execution from external memory | |
US12174939B2 (en) | Method for the execution of a binary code of a computer program by a microprocessor | |
US12088722B2 (en) | Method for executing a computer program by means of an electronic apparatus | |
CN117216813B (en) | Method, device and security chip for reading and writing data | |
US11651086B2 (en) | Method for executing a computer program by means of an electronic apparatus | |
CN117786699A (en) | Chip initialization method, device, module, electronic equipment and storage medium | |
CN114600418A (en) | Method for preventing bit flipping attack of system on chip |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||