
CN112181491A - Processor and return address processing method - Google Patents


Info

Publication number
CN112181491A
Authority
CN
China
Prior art keywords
return address
conversion
register
processor
processing core
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910586325.5A
Other languages
Chinese (zh)
Other versions
CN112181491B (en)
Inventor
钱雅超
章庆隆
汤倩莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201910586325.5A priority Critical patent/CN112181491B/en
Priority to PCT/CN2020/099168 priority patent/WO2021000847A1/en
Publication of CN112181491A publication Critical patent/CN112181491A/en
Application granted granted Critical
Publication of CN112181491B publication Critical patent/CN112181491B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/3012Organisation of register space, e.g. banked or distributed register file
    • G06F9/3013Organisation of register space, e.g. banked or distributed register file according to data content, e.g. floating-point registers, address registers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/3017Runtime instruction translation, e.g. macros
    • G06F9/30178Runtime instruction translation, e.g. macros of compressed or encrypted instructions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/468Specific access rights for resources, e.g. using capability register

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Storage Device Security (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Embodiments of the application provide a processor and a return address processing method. A hardware conversion circuit is arranged in the processor. When a return address needs to be stored, the conversion circuit converts the return address and outputs the resulting converted return address to a memory; when the return address is needed, the conversion circuit converts the converted return address in the memory to recover the return address. Because an attacker cannot know the conversion operation performed in the conversion circuit, the attacker cannot modify the converted return address in the memory into the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously changing the program control flow. In addition, because the conversion is performed by a hardware conversion circuit while the program runs, call instructions and return instructions do not need to be identified at the compiling stage, and no extra encryption or decryption instructions need to be inserted, which avoids affecting the running performance of the processor.

Description

Processor and return address processing method
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a processor and a return address processing method.
Background
While a processor runs a program, an attacker can maliciously hijack the control flow of the program. Specifically, the attacker overwrites the normal return address of a subroutine with a malicious return address, so that after executing the subroutine code the processor jumps to the code segment pointed to by the malicious return address. This changes the program control flow and destroys the control flow integrity (CFI) of the program.
At present, to protect the program control flow from being maliciously changed, the control flow generally needs to be monitored while the program runs, and an alarm is raised if the control flow is changed. In a related technology, at the program compiling stage, the call instruction and the return instruction corresponding to a subroutine are identified, an encryption instruction is inserted before the call instruction, and a decryption instruction is inserted before the return instruction. Then, at the program running stage, before the processor calls the subroutine, the return address of the subroutine is encrypted by the encryption instruction, and the resulting encrypted address is pushed onto the stack. After the subroutine is executed, the encrypted address popped from the stack is decrypted by the decryption instruction to obtain the original return address, so that the processor can continue execution from the return address.
With this defense technology, even if an attacker obtains the encrypted address from the stack, the attacker cannot replace it with a correctly encrypted malicious address, because the attacker does not know the key used by the encryption instruction. That is, even after hijacking the program control flow, the attacker still cannot control where the program jumps, so the attacker is prevented from maliciously changing the program control flow and the integrity of the program control flow is protected.
However, in the above-described technique, a plurality of additional instructions need to be inserted into the program, so that the execution performance of the program is degraded.
Disclosure of Invention
The embodiment of the application provides a processor and a processing method of a return address, which can protect the control flow integrity of a program on the basis of not reducing the running performance of the program.
In a first aspect, an embodiment of the present application provides a processor, including: a processing core and a conversion circuit;
the processing core is used for outputting a return address;
the conversion circuit is used for converting the return address output by the processing core to obtain a converted return address and outputting the converted return address to a stack in the memory;
the conversion circuit is further configured to, when the processing core needs to use the return address, perform the conversion on the converted return address in the stack to obtain the return address, and output the return address to the processing core.
In this embodiment, the conversion circuit converts the return address before the return address is stored in the memory, so that the memory holds the converted return address. Because an attacker cannot know the conversion operation performed in the conversion circuit, the attacker cannot modify the converted return address in the memory into the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously changing the program control flow. After the converted return address is popped from the memory, the conversion circuit converts it again to obtain the original return address, so that the processing core can execute subsequent instructions according to the original return address and the integrity of the program control flow is ensured. Because the conversion is performed by a hardware conversion circuit while the program runs, call instructions and return instructions do not need to be identified at the compiling stage, and no extra encryption or decryption instructions need to be inserted, which avoids affecting the running performance of the processor. In addition, because the conversion is performed in hardware rather than in software, the risk of theft by software is avoided.
Optionally, the processor further comprises a register;
the conversion circuit is specifically configured to convert the return address output by the processing core to obtain a converted return address, and output the converted return address to the register, so that the converted return address is output to a stack in the memory via the register;
the conversion circuit is further specifically configured to, when the processing core needs to use the return address, perform the conversion on the converted return address in the stack to obtain the return address, and output the return address to the register, so that the return address is output to the processing core via the register.
In this embodiment, a hardware conversion circuit is disposed on a write path of the register, and the conversion circuit is configured to convert a return address entering the register, that is, all inputs that need to enter the register are converted by the conversion circuit first, and then a conversion result is input into the register. By the method, the return address can be automatically identified, the control flow of the processor does not need to be changed, and the method is easy to implement.
Optionally, the processor further comprises a register;
when the processing core outputs the return address, the register is used for registering the return address output by the processing core;
the conversion circuit is specifically configured to convert the return address output by the register to obtain a converted return address, and output the converted return address to a stack in a memory;
when the processing core needs to use the return address, the register is further used for registering the converted return address output by the stack;
the conversion circuit is further specifically configured to perform the conversion on the converted return address output by the register to obtain the return address, and output the return address to the processing core.
In this embodiment, a hardware conversion circuit is provided in the read path of the register, and the conversion circuit is used to convert the return address output from the register, that is, all the values output from the register are converted by the conversion circuit first. By the method, the return address can be automatically identified, the control flow of the processor does not need to be changed, and the method is easy to implement.
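The two register placements described above can be compared with a small software model. This is only a sketch under stated assumptions: the class names, the 32-bit width, and the self-inverse multiplier `0x7FFFFFFF` are hypothetical and are not taken from the application; the real circuit is hardware, not Python.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit address width


def convert(value):
    """Hypothetical self-inverse conversion: 0x7FFFFFFF squared is 1 modulo 2**32."""
    return (value * 0x7FFFFFFF) & MASK


class WritePathRegister:
    """Model of the first option: every value is converted on its way into the register."""

    def __init__(self):
        self.value = 0

    def write(self, value):
        self.value = convert(value)

    def read(self):
        return self.value


class ReadPathRegister:
    """Model of the second option: every value is converted on its way out of the register."""

    def __init__(self):
        self.value = 0

    def write(self, value):
        self.value = value

    def read(self):
        return convert(self.value)


def call_and_return(register, return_address):
    """Save a return address through the register, then restore it."""
    stack = []
    register.write(return_address)    # processing core outputs the return address
    stack.append(register.read())     # register contents go to the stack
    on_stack = stack[-1]
    register.write(stack.pop())       # stacked value is loaded back into the register
    return on_stack, register.read()  # value handed back to the processing core


write_path_result = call_and_return(WritePathRegister(), 0x00000008)
read_path_result = call_and_return(ReadPathRegister(), 0x00000008)
```

In this model both placements leave the converted value on the stack and hand the original value back to the core; they differ only in whether the register itself holds the original or the converted value.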
Optionally, the conversion satisfies the following condition:
B=IP(A),A=IP(B)
wherein A is the return address, B is the converted return address, and IP() is the conversion model employed by the conversion.
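The condition B=IP(A), A=IP(B) says that the conversion is its own inverse, so the same circuit can serve both directions. A minimal sketch of one function with this property follows. The multiplier `K` and the 32-bit width are assumptions chosen for illustration; the application does not disclose its parameters here.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit address width

# Hypothetical multiplier with K * K == 1 (mod 2**32),
# so applying the multiplication twice is the identity.
K = 0x7FFFFFFF


def ip(value):
    """Illustrative self-inverse modular multiplication."""
    return (value * K) & MASK32


return_address = 0x00000008     # A
converted = ip(return_address)  # B = IP(A)
restored = ip(converted)        # A = IP(B)
```

Any multiplier whose square is 1 modulo the address-space size would satisfy the same condition; this particular choice is not meant to be cryptographically strong.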
Optionally, the conversion circuit is specifically configured to:
and converting at least one bit of the return address by adopting at least one conversion model to obtain the converted return address.
The at least one bit includes bits corresponding to a variable section and bits corresponding to an invariable section of the code addresses of the program. The code addresses of the program include the instruction addresses of a plurality of instructions; the invariable section consists of the bits that have the same value in all of the instruction addresses, and the variable section consists of the bits whose values differ among the instruction addresses.
Because the conversion model and the conversion parameters are stored in the hardware circuit, an attacker cannot obtain this sensitive information, which improves the reliability of the program control flow defense. When the conversion circuit converts at least one bit of the return address, it converts the variable section and the invariable section together, which makes brute-force cracking harder for an attacker and helps ensure the security of the program control flow.
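To make the two sections concrete, the following sketch computes which bit positions are identical across a set of instruction addresses and which differ. The helper name `split_sections` and the sample addresses are hypothetical.

```python
from functools import reduce


def split_sections(addresses, width=32):
    """Return (invariable_mask, variable_mask) for a list of instruction addresses."""
    full = (1 << width) - 1
    all_ones = reduce(lambda a, b: a & b, addresses)  # bits set in every address
    any_ones = reduce(lambda a, b: a | b, addresses)  # bits set in at least one address
    variable = (all_ones ^ any_ones) & full           # bits that differ somewhere
    invariable = ~variable & full                     # bits that never change
    return invariable, variable


# Hypothetical code region: four consecutive 4-byte instructions.
code_addresses = [0x08000000, 0x08000004, 0x08000008, 0x0800000C]
invariable_mask, variable_mask = split_sections(code_addresses)
```

For these sample addresses only bits 2 and 3 vary. A conversion that touched only those bits would leave most of the address predictable, which is why the text converts both sections together.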
Optionally, the conversion circuit is specifically configured to:
grouping at least one bit of the return address to obtain a plurality of bit groups;
converting the bits in each bit group by using a conversion model to obtain a conversion result corresponding to each bit group, wherein at least two of the bit groups use different conversion models, or all of the bit groups use the same conversion model;
and obtaining the converted return address according to the conversion result corresponding to each bit group.
Optionally, the number of the bit groups is two, where one bit group includes the odd-numbered bits of the return address, and the other bit group includes the even-numbered bits of the return address.
Optionally, the types of the conversion model include: a modular multiplication conversion model and a modular addition conversion model.
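The grouping scheme can be sketched as follows: split the address into even-position and odd-position bits, convert each group with its own model, and interleave the results. The bit numbering (starting at 0), the 32-bit width, and the constants `0x7FFF` and `0x8000` are assumptions chosen so that each model is its own inverse; they are not the application's parameters.

```python
WIDTH = 32                      # assumed address width
GROUP_BITS = WIDTH // 2
GROUP_MASK = (1 << GROUP_BITS) - 1


def gather(value, start):
    """Pack bits start, start+2, start+4, ... of value into a compact integer."""
    out = 0
    for i in range(GROUP_BITS):
        out |= ((value >> (start + 2 * i)) & 1) << i
    return out


def scatter(group, start):
    """Inverse of gather: spread compact bits back to positions start, start+2, ..."""
    out = 0
    for i in range(GROUP_BITS):
        out |= ((group >> i) & 1) << (start + 2 * i)
    return out


def mod_mul(group):
    """Modular multiplication model; 0x7FFF squared is 1 modulo 2**16."""
    return (group * 0x7FFF) & GROUP_MASK


def mod_add(group):
    """Modular addition model; adding 2**15 twice wraps back to the input."""
    return (group + 0x8000) & GROUP_MASK


def convert(address):
    """Convert even-position bits with mod_mul and odd-position bits with mod_add."""
    even = mod_mul(gather(address, 0))
    odd = mod_add(gather(address, 1))
    return scatter(even, 0) | scatter(odd, 1)


converted = convert(0x08000008)
restored = convert(converted)
```

Because each group's model is self-inverse and the groups do not overlap, applying `convert` twice returns the original address.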
Optionally, the register is a register for storing a return address.
Optionally, the processor is based on an ARM instruction set, and the register is an LR register.
Optionally, the processor is a RISC-V instruction set-based processor, and the register is an RA register.
In a second aspect, an embodiment of the present application provides a method for processing a return address, which is applied to a processor, where the processor includes a processing core and a conversion circuit, the method comprising:
when the processing core outputs the return address, converting the return address through the conversion circuit to obtain a converted return address, and outputting the converted return address to a stack in a memory;
when the return address needs to be used, performing the conversion on the converted return address in the stack through the conversion circuit to obtain the return address, and outputting the return address to the processing core.
Optionally, the processor further includes a register, and the converting the return address through the conversion circuit to obtain a converted return address and outputting the converted return address to a stack in a memory includes:
converting the return address through the conversion circuit to obtain the converted return address, and outputting the converted return address to the register, so that the converted return address is output to the stack in the memory through the register;
the performing the conversion on the converted return address in the stack through the conversion circuit to obtain the return address and outputting the return address to the processing core includes:
performing the conversion on the converted return address in the stack through the conversion circuit to obtain the return address, and outputting the return address to the register, so that the return address is output to the processing core through the register.
Optionally, the processor further includes a register, the return address is output to the register by the processing core, and the converting the return address through the conversion circuit to obtain a converted return address and outputting the converted return address to a stack in a memory includes:
converting the return address output by the register through the conversion circuit to obtain the converted return address, and outputting the converted return address to the stack in the memory;
the performing the conversion on the converted return address in the stack through the conversion circuit to obtain the return address and outputting the return address to the processing core includes:
performing the conversion on the converted return address output by the register through the conversion circuit to obtain the return address, and outputting the return address to the processing core.
Optionally, the conversion satisfies the following condition:
B=IP(A),A=IP(B)
wherein A is the return address, B is the converted return address, and IP() is the conversion model employed by the conversion.
Optionally, the converting the return address to obtain a converted return address includes:
and converting at least one bit of the return address by adopting at least one conversion model to obtain the converted return address.
Optionally, the converting at least one bit of the return address by using at least one conversion model to obtain the converted return address includes:
grouping at least one bit of the return address to obtain a plurality of bit groups;
converting the bits in each bit group by using a conversion model to obtain a conversion result corresponding to each bit group, wherein at least two of the bit groups use different conversion models, or all of the bit groups use the same conversion model;
and obtaining the converted return address according to the conversion result corresponding to each bit group.
Optionally, the number of the bit groups is two, where one bit group includes the odd-numbered bits of the return address, and the other bit group includes the even-numbered bits of the return address.
Optionally, the types of the conversion model include: a modular multiplication conversion model and a modular addition conversion model.
Optionally, the register is a register for storing a return address.
Optionally, the processor is based on an ARM instruction set, and the register is an LR register.
Optionally, the processor is a RISC-V instruction set-based processor, and the register is an RA register.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor as claimed in any one of the first aspect.
In a fourth aspect, an embodiment of the present application provides a chip, including: a processor as claimed in any one of the first aspect.
In the processor and the return address processing method provided by the embodiments of the application, a hardware conversion circuit is arranged in the processor. When a return address needs to be stored, the conversion circuit converts the return address and outputs the resulting converted return address to a memory; when the return address is needed, the conversion circuit converts the converted return address in the memory to recover the return address. Because an attacker cannot know the conversion operation performed in the conversion circuit, the attacker cannot modify the converted return address in the memory into the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously changing the program control flow. Moreover, because the conversion is performed by a hardware conversion circuit while the program runs, then, compared with the related technology, call instructions and return instructions do not need to be identified at the compiling stage, and no extra encryption or decryption instructions need to be inserted, which avoids affecting the running performance of the processor.
Drawings
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;
Fig. 2 is a schematic structural diagram of a processor according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a processing procedure of a return address according to an embodiment of the present application;
Fig. 4A and Fig. 4B are schematic diagrams illustrating a conventional return address processing procedure;
Fig. 5A and Fig. 5B are schematic diagrams illustrating a processing procedure of a return address according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a program running process provided in an embodiment of the present application;
Fig. 7A and Fig. 7B are schematic diagrams of a return address processing procedure according to an embodiment of the present application;
Fig. 8 is a flowchart illustrating a return address processing method according to an embodiment of the present application.
Detailed Description
To facilitate understanding of the present application, first, a structure of an electronic device to which a processor of the present application is applied will be described with reference to fig. 1.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 1, the electronic device 10 includes a processor 100 and a memory 200. The memory 200 is used for storing computer programs and data. Memory is broadly classified by use into main memory and auxiliary memory. Main memory, also known as internal memory, temporarily stores computer programs and data during operation of the processor. Auxiliary memory, also known as external memory, stores computer programs and data that are not currently in use during operation of the processor. The processor 100 is configured to execute a computer program stored in the memory 200.
The processor is a core device of the electronic equipment. A processor typically includes at least one processing core, cache memory, and an input-output interface to communicate with other devices of the electronic device. A processing core refers to a processing unit in a processor for performing data processing tasks. The processing core is the main device in the processor responsible for the operation.
The processor in the present application may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or the like.
The processor may be adapted to execute a computer program in the electronic device. The processor is capable of recognizing and executing instructions in the computer program, so that the electronic device performs a certain function or obtains a certain result.
A computer program is a sequence of instructions. A computer program may also call subroutines. A subroutine may also be referred to as a "subprocess" or a "subfunction". A subroutine consists of one or more instructions, is responsible for completing a specific task, and is relatively independent. Generally, a program that calls a subroutine is referred to as a main program. The two terms are relative. For example, if program A calls program B, which in turn calls program C, then program B is a subroutine with respect to program A and a main program with respect to program C.
The normal flow of execution of a computer program is called the program control flow. During execution of the computer program, the processor executes instructions according to the program control flow. However, an attacker may maliciously hijack the program control flow while the processor is executing the program. The purpose of such attacks is typically to change the control flow of the program, thereby destroying the control flow integrity (CFI) of the program. One common attack that destroys the CFI of a program is the return-oriented programming (ROP) attack.
The following describes, with reference to a specific example, a process in which the processor executes the computer program in a normal case, and a process in which the processor executes the computer program in a ROP attack case.
Assume that the computer program includes instruction A, instruction B, and instruction C, where instruction B is a subroutine call instruction. The address of instruction A is 0x0000, the address of instruction B is 0x0004, and the address of instruction C is 0x0008. The normal program execution flow is to execute instruction A, instruction B, and instruction C in sequence. The processor executes this normal flow as follows.
1) The processor executes instruction A.
Because instruction B is a subroutine call instruction, the processor jumps to the address of the subroutine corresponding to instruction B when executing instruction B. To ensure that the processor can correctly return to the address of instruction C after executing that subroutine, the processor saves the address of instruction C before executing instruction B. That is:
2) The processor writes the address of instruction C into the memory. In the embodiments of the present application, the address of instruction C is referred to as the return address.
For example, the processor may write the return address to a stack in the memory.
3) The processor executes instruction B, jumps to the address of the subroutine corresponding to instruction B, and executes the subroutine.
4) After the subroutine is executed, the processor reads the return address (0x0008) from the memory, jumps to address 0x0008, and executes instruction C.
When a ROP attack exists, the attacker tampers with the return address stored in the memory. For example, the attacker uses remote software to modify the address of instruction C (0x0008) stored in the memory to a malicious address. After executing the subroutine corresponding to instruction B, the processor reads the malicious address from the memory and jumps to it for execution, which destroys the integrity of the program control flow.
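The normal flow and the tampered flow can be traced with a toy simulation. The addresses of instructions A, B, and C come from the example above; the malicious address `0x0100` and the function name `run` are hypothetical.

```python
# Toy program memory: address -> instruction label.
program = {
    0x0000: "A",
    0x0004: "B (call)",
    0x0008: "C",
    0x0100: "malicious",  # hypothetical attacker-chosen target
}


def run(tamper_with=None):
    """Execute A, call the subroutine via B, then return through the saved address."""
    stack = []
    trace = [program[0x0000]]      # 1) execute instruction A
    stack.append(0x0008)           # 2) save the return address (address of C)
    trace.append(program[0x0004])  # 3) execute B and run the subroutine
    if tamper_with is not None:
        stack[-1] = tamper_with    # attacker overwrites the saved return address
    target = stack.pop()           # 4) read the return address and jump to it
    trace.append(program[target])
    return trace


normal_trace = run()
attacked_trace = run(tamper_with=0x0100)
```

With an unprotected stack, the only difference between the two runs is the value the attacker wrote, and the processor follows it without question.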
At present, in order to prevent the program control flow from being maliciously changed, the control flow during the operation of the program generally needs to be monitored, and if the program control flow is changed, an alarm is given.
In a related technology, malicious changes to the program control flow are defended against in software. Specifically, at the program compiling stage, the calling instruction and the returning instruction corresponding to each subroutine are first identified. For example, the calling instruction is the call instruction and the returning instruction is the ret instruction. An encryption instruction is then inserted before the calling instruction; for example, the encryption instruction encrypts the return address using a preset key. Likewise, a decryption instruction is inserted before the returning instruction; for example, the decryption instruction decrypts the encrypted return address using the same key.
After compiling, in the program running stage, before the processor calls the subprogram, the encryption instruction is executed to encrypt the return address of the subprogram, and the obtained encrypted address is stored in the memory. After the subprogram is executed, the decryption instruction is used for decrypting the encrypted address read from the memory to obtain an original return address, so that the processor can continue to execute from the return address.
One specific encryption and decryption method is to define a special register in the processor that is dedicated to storing the key and cannot be used for other purposes. During encryption, an XOR operation is performed on the original return address and the key in the special register to obtain the encrypted return address, which is stored in the memory. During decryption, an XOR operation is performed again on the encrypted return address read from the memory and the key in the special register to obtain the original return address. That is, both the encryption instruction and the decryption instruction perform an XOR operation.
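The XOR scheme of this related technology fits in a few lines. The key value and function names below are hypothetical; in the described scheme the key lives in a dedicated processor register.

```python
KEY = 0x5A5AA5A5  # hypothetical key held in the dedicated register


def encrypt(return_address):
    """Inserted before the call: XOR the return address with the key."""
    return return_address ^ KEY


def decrypt(encrypted_address):
    """Inserted before the return: XOR again with the same key."""
    return encrypted_address ^ KEY


stored = encrypt(0x00000008)  # value written to the stack
recovered = decrypt(stored)   # value used for the return jump
```

Encryption and decryption are the same operation because XOR with a fixed key is its own inverse.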
With the above-described defense technique, even if an attacker hijacks the encrypted return address from the memory, the attacker cannot tamper the encrypted return address with the encrypted malicious address because the attacker does not know the key used by the encryption instruction. That is to say, after hijacking the program control flow, the attacker still cannot control the jump position of the program, so that the attacker can be prevented from maliciously changing the program control flow, and the integrity of the program control flow is protected.
However, the above related technology has at least the following problems: 1) At the program compiling stage, call instructions and return instructions need to be identified, and multiple additional encryption and decryption instructions need to be inserted into the program, which reduces the running performance of the program. 2) Because the key is stored in a register and encryption and decryption are performed in software, there is a risk that the key is stolen by software, and the protection is weak. 3) The XOR operation processes each bit independently, so brute-force cracking is relatively easy, and each brute-force attempt can make the program jump into the code area, which is a potential security hazard. 4) A special register is required to store the key and cannot be used for other functions, which limits the application scenarios.
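One way to read problem 3) is that XOR lets an attacker change chosen bits of the decrypted address without knowing the key. The sketch below illustrates that property only; the key, the addresses, and the flipped bit pattern are hypothetical.

```python
KEY = 0x5A5AA5A5                 # hypothetical key, unknown to the attacker

original = 0x08000008            # hypothetical return address inside the code area
stored = original ^ KEY          # encrypted value on the stack

# The attacker flips bits 4..7 of the stored value without knowing KEY.
tampered = stored ^ 0x000000F0

# The processor decrypts as usual and lands on a different address.
redirected = tampered ^ KEY
```

Here `redirected` differs from `original` in exactly the flipped bits, so the upper bits that select the code area are untouched and every guess still lands inside it.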
To solve at least one of the above problems, an embodiment of the present application provides a processor in which a hardware conversion circuit is arranged. When a return address needs to be stored, the conversion circuit converts the return address and outputs the resulting converted return address to a memory; when the return address is needed, the conversion circuit converts the converted return address in the memory to recover the return address. Because an attacker cannot know the conversion operation performed in the conversion circuit, the attacker cannot modify the converted return address in the memory into the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously changing the program control flow. Moreover, because the conversion is performed by a hardware conversion circuit while the program runs, then, compared with the related technology, call instructions and return instructions do not need to be identified at the compiling stage, and no extra encryption or decryption instructions need to be inserted, which avoids affecting the running performance of the processor.
The technical means shown in the present application will be described in detail below with reference to specific examples. It should be noted that the following embodiments may exist alone or in combination with each other, and description of the same or similar contents is not repeated in different embodiments.
Fig. 2 is a schematic structural diagram of a processor according to an embodiment of the present application. As shown in fig. 2, the processor 100 of the present embodiment includes: processing core 110 and translation circuitry 120.
Wherein the processing core 110 is configured to output a return address.
The translation circuit 120 is configured to translate the return address output by the processing core 110 to obtain a translated return address, and output the translated return address to a stack in the memory.
The translation circuit 120 is further configured to perform the translation on the translation return address in the stack to obtain the return address when the processing core 110 needs to use the return address, and output the return address to the processing core 110.
The embodiment of the present application does not specifically limit the type of the processor. The processor may be, but is not limited to, a processor based on the ARM instruction set or a processor based on the RISC-V instruction set.
The embodiment of the present application does not specifically limit the bit width of the processor. The processor may be a 32-bit processor, or may be a 64-bit processor, or may be a processor having another bit width.
The number of processing cores in the processor may be one or more. A processing core refers to a processing unit in a processor for performing data processing tasks. When the number of processing cores is one, the processor is a single-core processor. When the number of processing cores is plural, the processor is a multi-core processor. The processing core is used for outputting a return address. The return address is the address of the next instruction to be executed by the processing core.
In this application, a processing core is any circuit that needs to store a return address into a memory and needs to obtain a return address from the memory. Illustratively, the processing core is a Program Counter (PC). The PC may also be referred to as an instruction counter. The PC is used for storing the address of the next instruction to be executed by the processor. Before the program starts to execute, the processor sends the start address of the program, i.e. the address of the first instruction of the program, to the PC. When executing an instruction, the processor automatically modifies the value in the PC; that is, every time an instruction is executed, the value in the PC is increased so that it always points to the address of the next instruction to be executed.
In the present application, the translation circuit is disposed on the data path between the processing core and the memory. The processing of the return address by the translation circuit is described below in conjunction with fig. 3.
Fig. 3 is a schematic diagram of a processing procedure of a return address according to an embodiment of the present application. As shown in fig. 3, when the processing core needs to store the return address into the memory, the return address output by the processing core passes through the conversion circuit, the conversion circuit converts the return address to obtain a converted return address, and then the converted return address is output to the stack in the memory. When the processing core needs to use the return address, the converted return address popped from the memory passes through the conversion circuit, the conversion circuit performs the same conversion on the converted address to obtain an original return address, and then the original return address is output to the processing core.
The processing core needs to store the return address into the memory, which may mean that the processing core needs to store the return address corresponding to the call instruction into the memory before executing the call instruction.
Correspondingly, the processing core needs to use a return address, which may mean that the processing core needs to obtain the return address corresponding to the call instruction from the memory when returning from the subroutine corresponding to the call instruction.
It is understood that in the embodiments of the present application, the translation circuit may employ one or more translation models to translate the return address. This embodiment is not particularly limited, and several possible conversion manners may be referred to in the detailed description of the following embodiments.
In the present application, the conversion circuit converts the return address once before it is stored in the memory, so that what the memory holds is the converted return address. Because an attacker cannot know the conversion operation performed in the conversion circuit, the attacker cannot replace the converted return address in the memory with the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously changing the program control flow.
After the converted return address is popped from the memory, the conversion circuit converts the converted return address again to obtain an original return address, so that the processing core can execute subsequent instructions according to the original return address, and the integrity of the program control flow is ensured.
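The round trip described above can be sketched in software. The following Python model is only an illustration, not the hardware itself: the names `convert`, `push_return_address`, `pop_return_address` and `SECRET_Q` are hypothetical, and the modular multiplication by p - 1 is just one assumed conversion that restores its input when applied twice.

```python
# Software model of a conversion circuit on the data path between the
# processing core and the stack. Names and parameter values are assumptions.
P = 2**32            # assumed modulus: 32-bit address space
SECRET_Q = P - 1     # assumed multiplier with q*q ≡ 1 (mod p), hidden in hardware

def convert(value):
    """Self-inverse conversion: convert(convert(x)) == x for 0 <= x < P."""
    return (value * SECRET_Q) % P

stack = []  # stands in for the stack in memory

def push_return_address(return_address):
    # The core outputs the plain address; memory only ever sees the converted one.
    stack.append(convert(return_address))

def pop_return_address():
    # The popped value is converted again before it reaches the core.
    return convert(stack.pop())

original = 0x00010F34          # made-up return address
push_return_address(original)
stored = stack[-1]             # what an attacker could observe in memory
restored = pop_return_address()
```

In this sketch `stored` differs from `original`, while `restored` equals it, which mirrors the two conversions performed on the way to and from the memory.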
In the application, because the conversion process is realized by the hardware conversion circuit in the program running process, compared with the related technology, the calling instruction and the returning instruction do not need to be identified in the compiling stage, and extra encryption instructions and decryption instructions do not need to be inserted, so that the influence on the running performance of the processor is avoided. Meanwhile, compared with the related technology, a special register is not needed for storing the key, and the software stealing risk is avoided.
In one possible implementation, the processor further includes a control circuit, and the control circuit is capable of identifying whether the address output by the processing core is a return address, and controlling the return address to be input into the conversion circuit when the control circuit identifies that the address output by the processing core is the return address. The conversion circuit converts the return address to obtain a conversion return address, and outputs the conversion return address to a stack in the memory. When the control circuitry identifies that the processing core needs to use the return address, the translated return address, which is popped from memory, is input to the translation circuitry under the control of the control circuitry. The conversion circuit converts the converted return address to obtain an original return address, and outputs the original return address to the processing core.
In another possible implementation, the processor does not sense the presence of the conversion circuit. The translation circuit is arranged on a data path between the processing core and the memory, i.e. when the return address is transferred from the processing core to the memory and from the memory to the processing core, the return address passes through the translation circuit. The implementation method only needs to arrange the conversion circuit on a data path between the processing core and the memory, does not need to change the existing control flow of the processor, and is easy to implement.
Optionally, the processor further includes a register, and the register is a register dedicated to storing the return address.
For this type of processor, the register lies on the path that every return address must take between the processing core and the memory. For example, when the return address of the subroutine needs to be output to the memory, the processing core first outputs the return address to the register, and then the register outputs the return address to the memory. When the return address needs to be used, the return address popped from the memory is also output to the register, and then the register outputs the return address to the processing core.
It will be appreciated that the register used to store the return address may be different for processors of different instruction sets.
Illustratively, the processor is an ARM instruction set based processor and the register is a Link Register (LR).
Illustratively, the processor is a RISC-V instruction set based processor and the register is a Return Address (RA) register.
The structure of the processor and the processing procedure of the return address are described below by taking the processor of the ARM instruction set as an example. Fig. 4A and 4B are schematic diagrams illustrating a conventional return address processing procedure. Fig. 4A illustrates the procedure of pushing a return address, and fig. 4B illustrates the procedure of popping a return address.
Before the processing core executes the call instruction, a return address corresponding to the call instruction needs to be stored in the memory. Illustratively, as shown in FIG. 4A, under control of the processing core, the return address output by the PC enters the LR register. The return address output by the LR register is then stored on the stack in memory.
When the processing core returns from the subroutine corresponding to the call instruction, the return address needs to be read from the memory. Illustratively, as shown in FIG. 4B, the LR register is entered with a return address that is popped off the stack from memory. The LR register then outputs the return address to the processing core.
The conversion circuit in this embodiment may be disposed before the register or disposed after the register.
It should be noted that, in this embodiment, the conversion circuit is disposed before the register, which means that the conversion circuit is disposed on a write path of the register, that is, disposed at an input end of the register. When the translation circuitry is placed before the register, it means that all return addresses entering the register will first go through the translation circuitry and then enter the register. In the present embodiment, the conversion circuit is disposed behind the register, which means that the conversion circuit is disposed on the readout path of the register, i.e., at the output terminal of the register. When the translation circuit is placed after the register, it means that all return addresses output from the register pass through the translation circuit.
The above two embodiments are described below separately.
Fig. 5A and 5B are schematic diagrams of a return address processing procedure according to an embodiment of the present application. In this embodiment, the conversion circuit is provided on the write path of the LR register. Fig. 5A illustrates the procedure of pushing a return address, and fig. 5B illustrates the procedure of popping a return address.
The translation circuit is specifically configured to translate a return address output by the processing core to obtain a translated return address, and output the translated return address to the register, so that the translated return address is output to a stack in the memory via the register.
The translation circuit is further specifically configured to, when the processing core needs to use the return address, perform the translation on the translation return address in the stack to obtain the return address, and output the return address to the register, so that the return address is output to the processing core via the register.
As shown in fig. 5A, since the conversion circuit is provided on the write path of the register, the return address output by the processing core first passes through the conversion circuit on its way to the LR register. The conversion circuit converts the return address output by the processing core to obtain a converted return address, so that what is actually stored in the LR register is the converted return address. The LR register then outputs the converted return address to the stack in the memory, i.e., it is the converted return address that is actually pushed onto the stack. Because an attacker cannot know the conversion operation performed in the conversion circuit, the attacker cannot replace the converted return address in the memory with the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously changing the program control flow.
As shown in fig. 5B, when the processing core needs to use the return address, since the translation circuit is disposed on the write path of the LR register, the translated return address popped from the memory passes through the translation circuit before entering the LR register. The translation circuit translates the translated return address to obtain the original return address, such that what is actually stored in the LR register is the original return address. The LR register then outputs the original return address to the processing core. Thereby ensuring normal control flow of the program.
In the following, a processor based on an ARM instruction set is taken as an example, and an operation process of an actual program is combined for illustration.
Fig. 6 is a schematic diagram of a program running process provided in an embodiment of the present application. As shown in fig. 6, the program includes the following ARM instructions: SUB, STP, ADD, LDP, ADD, and RET.
The control flow is ready to enter the subroutine, i.e., the processing core outputs a return address to the LR (0x30) register when executing STP instructions. Before the return address enters the LR register, the return address passes through the conversion circuit, and the conversion circuit converts the return address to obtain a converted return address, so that the converted return address actually enters the LR register. The LR register then outputs the translated return address to the stack in memory.
When the control flow returns from the end of the subroutine, i.e., when the processing core executes the LDP instruction, the translated return address is read from the stack of the memory. Before entering the LR (0x30) register, the translated return address passes through the translation circuit, which translates the translated return address again to obtain the original return address. This original return address is stored in the LR register and used by the processing core when executing the RET instruction.
In this embodiment, a hardware conversion circuit is provided on the write path of the LR register, and the conversion circuit is configured to convert the return address entering the LR register, that is, all inputs that need to enter the LR register are converted by the conversion circuit, and then the conversion result is input into the LR register. By the method, the return address can be automatically identified, the control flow of the processor does not need to be changed, and the method is easy to implement.
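The write-path arrangement can be modeled as a register whose input always passes through the conversion. The sketch below is a hypothetical software analogy: the class `WritePathLinkRegister`, the helper `negate_mod_2_32` and the example address are assumptions made for illustration, and the conversion shown is only one possible self-inverse operation.

```python
class WritePathLinkRegister:
    """Model of an LR register with a conversion circuit on its write path."""

    def __init__(self, convert):
        self._convert = convert
        self._value = 0

    def write(self, value):
        # Every value entering the register is converted first.
        self._value = self._convert(value)

    def read(self):
        # The read path is untouched: the stored value is output as-is.
        return self._value


def negate_mod_2_32(x):
    """Assumed self-inverse conversion (multiplication by p - 1 with p = 2**32)."""
    return (-x) % 2**32


lr = WritePathLinkRegister(negate_mod_2_32)
stack = []
return_address = 0x00400A10          # made-up address

lr.write(return_address)             # core -> conversion circuit -> LR
stack.append(lr.read())              # LR -> stack (STP side): converted value is pushed
pushed = stack[-1]

lr.write(stack.pop())                # stack -> conversion circuit -> LR (LDP side)
value_used_by_ret = lr.read()        # LR -> core: original address again
```

Here the stack only ever holds the converted value, and the second pass through the write path leaves the original return address in the register for the RET instruction.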
Fig. 7A and 7B are schematic diagrams of a return address processing procedure according to an embodiment of the present application. In this embodiment, the conversion circuit is provided on the readout path of the LR register. Fig. 7A illustrates the procedure of pushing a return address, and fig. 7B illustrates the procedure of popping a return address.
When the processing core outputs the return address, the register is used for registering the return address output by the processing core. The conversion circuit is specifically configured to convert the return address output by the register to obtain a converted return address, and output the converted return address to a stack in a memory.
The register is further configured to register the translated return address output by the stack when the return address needs to be used by the processing core. The conversion circuit is further specifically configured to perform the conversion on the converted return address output by the register to obtain the return address, and output the return address to the processing core.
As shown in fig. 7A, the return address output by the processing core is output to the LR register. Since the conversion circuit is provided on the readout path of the LR register, the return address passes through the conversion circuit as the LR register outputs it to the stack in the memory. The conversion circuit converts the return address output by the LR register to obtain a converted return address, so that what is actually pushed onto the stack is the converted return address. Because an attacker cannot know the conversion operation performed in the conversion circuit, the attacker cannot replace the converted return address in the memory with the converted return address corresponding to a malicious instruction, which prevents the attacker from maliciously changing the program control flow.
When the processing core needs to use the return address, the converted return address popped from the memory is written into the LR register, as shown in fig. 7B. Since the conversion circuit is provided on the readout path of the LR register, the converted return address passes through the conversion circuit as the LR register outputs it to the processing core. The conversion circuit converts the converted return address to obtain the original return address, so that what is actually input to the processing core is the original return address. This ensures the normal control flow of the program.
In this embodiment, a hardware conversion circuit is provided on the readout path of the LR register to convert the return address output from the LR register; that is, every value output from the LR register is first converted by the conversion circuit. In this way, the return address can be identified automatically, the control flow of the processor does not need to be changed, and the scheme is easy to implement.
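The readout-path arrangement differs from the write-path one only in where the conversion is applied. As a hedged software analogy, the following sketch uses a hypothetical class `ReadPathLinkRegister` and an assumed modulo addition conversion with p = 2**32 and q = p / 2; none of these names or values come from the embodiment itself.

```python
class ReadPathLinkRegister:
    """Model of an LR register with a conversion circuit on its readout path."""

    def __init__(self, convert):
        self._convert = convert
        self._value = 0

    def write(self, value):
        # The write path is untouched: the value is stored as-is.
        self._value = value

    def raw(self):
        # Debug view of the stored bits (no hardware equivalent is implied).
        return self._value

    def read(self):
        # Every value leaving the register is converted first.
        return self._convert(self._value)


def add_half_mod_2_32(x):
    """Assumed self-inverse conversion: add q = p / 2 modulo p = 2**32."""
    return (x + 2**31) % 2**32


lr = ReadPathLinkRegister(add_half_mod_2_32)
stack = []
return_address = 0x00400A10          # made-up address

lr.write(return_address)             # core -> LR: plain address is stored
stack.append(lr.read())              # LR -> conversion circuit -> stack
pushed = stack[-1]

lr.write(stack.pop())                # stack -> LR: converted address is stored
held_in_register = lr.raw()
value_seen_by_core = lr.read()       # LR -> conversion circuit -> core
```

In contrast to the write-path model, the register itself holds the converted value after the pop, and the original address appears only at the output toward the processing core.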
It should be noted that, in the above embodiment, a processor based on an ARM instruction set is taken as an example for description. The processor to which embodiments of the application are adapted is not limited to this. The embodiment of the application is also applicable to processors of other instruction sets as long as the register special for storing the return address exists in the processor.
It should be noted that, in the above embodiments, for processors based on the ARM instruction set and processors based on the RISC-V instruction set, if the compiler supports a leaf-subroutine (leaf sub) optimization option (that is, in a leaf sub the return address is not saved from the LR/RA register to the stack in the memory but stays in the LR/RA register throughout), this optimization needs to be turned off, so that the LR/RA register is used in a leaf sub in the same way as in a normal subroutine (sub). Otherwise, the return address is not converted back (decrypted) when it is read from the LR/RA register, which may prevent the subroutine from returning normally.
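Why the leaf-sub optimization interferes can be seen in a small model. The sketch below assumes the write-path placement and a hypothetical self-inverse `convert`; the function `call` and its `spill_lr` flag are illustrative names, not part of any compiler or instruction set.

```python
P = 2**32

def convert(x):
    """Assumed self-inverse conversion applied on the LR write path."""
    return (-x) % P

def call(return_address, spill_lr):
    """Return the address that RET would use at the end of the subroutine."""
    lr = convert(return_address)        # write path: LR receives the converted value
    if spill_lr:
        stack = [lr]                    # normal sub: LR is saved to the stack
        lr = convert(stack.pop())       # and restored through the conversion circuit
    # leaf sub with the optimization on: LR was converted only once
    return lr

address = 0x00010F34                    # made-up return address
normal_sub_target = call(address, spill_lr=True)
leaf_sub_target = call(address, spill_lr=False)
```

With the spill, the address is converted twice and the subroutine returns correctly; without it, the address is converted only once and the return target is wrong, which is the situation the note above warns about.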
The following describes a specific conversion process of the conversion circuit in each of the above embodiments.
In the embodiment of the present application, the conversion performed by the conversion circuit on the return address satisfies the following conditions:
$$B = IP(A), \quad A = IP(B)$$
where A is the return address, B is the converted return address, and IP() is the conversion model used for the conversion.
It will be appreciated that conversion models satisfying the above conditions may be of various types, including but not limited to: an exclusive-or conversion model, a modular multiplication conversion model, and a modular addition conversion model.
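The condition above says that applying the conversion twice must give back the input. The following sketch checks this for the three listed model types. The factory names, the XOR key and the sample addresses are assumptions made for illustration; the parameter constraints (q squared congruent to 1 modulo p for multiplication, q equal to half of p for addition) follow the embodiments described later in this document.

```python
def xor_model(key):
    """Exclusive-or conversion: a XOR key, undone by the same XOR."""
    return lambda a: a ^ key

def modmul_model(q, p):
    """Modular multiplication conversion; requires q*q ≡ 1 (mod p)."""
    assert (q * q) % p == 1
    return lambda a: (a * q) % p

def modadd_model(q, p):
    """Modular addition conversion; requires 2*q ≡ 0 (mod p), e.g. q = p / 2."""
    assert (2 * q) % p == 0
    return lambda a: (a + q) % p

# All parameter values below are illustrative assumptions.
models = {
    "xor": xor_model(0x5A5A5A5A),
    "modmul": modmul_model(2**32 - 1, 2**32),
    "modadd": modadd_model(2**31, 2**32),
}

samples = [0x00000000, 0x00010F34, 0x7FFFFFFC, 0xFFFFFFFF]
is_involution = {
    name: all(ip(ip(a)) == a for a in samples)
    for name, ip in models.items()
}
```

Each entry of `is_involution` is true for these samples, so any of the three functions could play the role of IP() in the condition B = IP(A), A = IP(B).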
In one possible embodiment, a plurality of conversion models are stored in the conversion circuit, and different conversion models correspond to one or more sets of selectable conversion parameters. By storing the conversion model and the conversion parameters in the hardware circuit, an attacker cannot acquire the sensitive information, and the defense reliability of the program control flow is improved.
When the conversion circuit converts the return address, at least one bit of the return address is converted by adopting at least one conversion model to obtain the converted return address.
Here, the at least one bit includes the bits corresponding to the changed section and the bits corresponding to the unchanged section of the code address of the program. The code address of the program comprises the instruction addresses of a plurality of instructions; the unchanged section consists of the bit positions whose values are the same in all of these instruction addresses, and the changed section consists of the bit positions whose values differ among them.
For example, taking libc as the code area, comparing the start address and the end address of the program's code shows that, because the number of instructions in the program is limited, several high-order bits of the code address, or even tens of them, never change. These unchanging bits are referred to herein as the unchanged section of the code address. Accordingly, the bits of the code address that do change are referred to as the changed section of the code address.
In the embodiment of the application, when the conversion circuit converts at least one bit of the return address, it converts the changed section and the unchanged section at the same time, which makes brute-force cracking harder for an attacker and helps ensure the security of the program control flow.
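The split into an unchanged and a changed section can be estimated by comparing the first and last code addresses. The sketch below is a simplification under stated assumptions: the function name `split_sections` and the two addresses are invented, the code area is treated as one contiguous range, and every bit at or below the highest differing bit is counted as changed (in practice, low alignment bits may also stay constant).

```python
def split_sections(start_address, end_address, width=32):
    """Return (unchanged_mask, changed_mask) for a contiguous code range."""
    differing = start_address ^ end_address
    changed_bits = differing.bit_length()       # highest differing bit + 1
    changed_mask = (1 << changed_bits) - 1
    unchanged_mask = ((1 << width) - 1) & ~changed_mask
    return unchanged_mask, changed_mask

# Hypothetical start and end addresses of a code area.
code_start = 0x76E20000
code_end = 0x76F5FFFF
unchanged_mask, changed_mask = split_sections(code_start, code_end)
unchanged_bit_count = bin(unchanged_mask).count("1")
```

For these made-up addresses the upper 11 bits are identical across the whole range, so a conversion that touched only the lower 21 bits would leave a large part of every stored address predictable.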
For example, before the program is run, the conversion circuit randomly selects a conversion model and randomly selects a set of conversion parameters, and uses them to convert the return address while the program runs. If the program run fails, for example because of an ROP attack, the conversion model and/or the conversion parameters are replaced when the program is run again, so that an attacker cannot attack the same conversion model and the same conversion parameters multiple times, which further improves the security of the program control flow.
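The selection policy just described can be sketched as follows. This is a software analogy only: the candidate table, the function name `choose_configuration` and the rule of excluding the previous configuration are assumptions, and the hardware random source is replaced here by Python's standard `secrets` module.

```python
import secrets

# Hypothetical table of (model name, parameters) stored in the conversion circuit.
CANDIDATES = [
    ("modmul", {"p": 2**32, "q": 2**32 - 1}),
    ("modmul", {"p": 2**32, "q": 2**31 + 1}),
    ("modmul", {"p": 2**32, "q": 2**31 - 1}),
    ("modadd", {"p": 2**32, "q": 2**31}),
]

def choose_configuration(previous=None):
    """Pick a model and parameter set at random, never repeating `previous`."""
    options = [c for c in CANDIDATES if c != previous]
    return secrets.choice(options)

first_run = choose_configuration()                    # before the program starts
second_run = choose_configuration(previous=first_run) # after a failed run
```

Excluding the previous configuration is one simple way to guarantee that a rerun after a failure never reuses the model and parameters the attacker has already probed.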
The processing of return addresses by a processor is described below in conjunction with several specific embodiments.
In one possible implementation, assuming that the processor is a 32-bit ARM instruction set based processor, the conversion circuit is disposed on the write path of the LR register, and a modular multiplication conversion model is employed in the conversion circuit. The return address is processed by the processor as follows:
1) The return address is stored in the LR register before the processing core executes a call instruction (e.g., calls a func function). Since the conversion circuit is disposed on the write path of the register, the return address is first converted by the conversion circuit to obtain a converted return address, so that what actually enters the LR register is the converted return address.
For convenience of description, the return address (i.e., the original return address) output by the processing core is denoted as a[31:0] in this embodiment. The converted return address produced by the conversion circuit (i.e., the return address that actually enters the LR register) is denoted as b[31:0]. The conversion circuit performs a modular multiplication operation as follows:
$$a \times q \equiv b \pmod{p}$$
where p and q are the conversion parameters of the modular multiplication conversion model. As the above equation shows, the conversion circuit multiplies the return address a by the parameter q modulo p, and the output is the converted return address b. Here q and p satisfy the following relationship:
$$q^2 \equiv 1 \pmod{p}$$
It will be appreciated that many combinations of p and q satisfy the above relationship. For example, if $p = 2^{32}$, then q can be any of the following:
4294967295, 2147483649, 2147483647
In practical applications, p should be larger than the maximum code address but should not be too large, so as to save the chip area of the conversion circuit. In addition, q = p - 1 and q = 1 always satisfy the above requirement, but q = 1 is not selected for security reasons, because it leaves the return address unchanged.
In this embodiment, the modular multiplication conversion model operates on the 32-bit return address a[31:0] as a whole, that is, on the changed section and the unchanged section of the code address at the same time. As a result, an attacker cannot mount a targeted attack on the changed section alone, and a brute-force attempt is no longer guaranteed to land in the code area, which improves the security of the program control flow.
2) The translated return address is read from the LR register and stored in the stack of memory.
3) The function func is executed.
4) Before returning from the function func, the translated return address is read from memory and the popped translated return address is stored in the LR register. Because the LR register has a translation circuit on the write path, the translated return address popped off the stack will first go through the translation circuit. The translation circuit translates the translated return address once more to obtain the original return address, which is then stored in the LR register. The conversion circuit performs the same modular multiplication operation as follows:
$$b \times q \equiv a \pmod{p}$$
where p and q are the conversion parameters of the modular multiplication conversion model and are the same as the parameters used in the first conversion. The converted return address b is the input of the conversion model, and the return address a is its output.
As the above two conversions show, the return address is restored to its original value after the same conversion is applied twice.
5) A return operation is performed.
If the converted return address b has not been maliciously tampered with in the stack of the memory, the return address a obtained after the conversion by the conversion circuit is the correct return address, and the program executes normally. If the converted return address b has been maliciously tampered with in the stack of the memory, the attacker, who cannot know the sensitive parameters p and q, cannot construct a valid malicious converted return address. The return address a obtained after the conversion is therefore a garbage value, and when the return operation is executed, the program tries to execute code at that garbage address, which produces a memory error with high probability. Thus, the attacker cannot achieve the purpose of changing the program control flow.
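Steps 1) to 5) can be traced numerically. The sketch below uses p = 2^32 and the three values of q listed above; the function name `convert`, the choice of q = 4294967295 and both addresses are assumptions made for illustration, and the tampering is modeled simply as the attacker overwriting the stack slot with the plain address of the code they want to reach.

```python
P = 2**32
CANDIDATE_QS = [4294967295, 2147483649, 2147483647]

# Each listed q satisfies q*q ≡ 1 (mod p).
valid_qs = [q for q in CANDIDATE_QS if (q * q) % P == 1]

Q = 4294967295  # assumed choice for this walk-through (q = p - 1)

def convert(value):
    """Modular multiplication conversion performed on the LR write path."""
    return (value * Q) % P

return_address = 0x00010F34            # a: made-up original return address
pushed = convert(return_address)       # b: value stored on the stack
restored = convert(pushed)             # second conversion when popping

gadget_address = 0x00023AB0            # made-up address the attacker wants to reach
after_tampering = convert(gadget_address)  # what the core receives instead
```

When the stack slot is untouched, `restored` equals the original address. When the attacker writes `gadget_address` into the slot without knowing p and q, the core receives `after_tampering`, which is a different value, so the return does not go where the attacker intended.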
In another possible implementation, assuming that the processor is a 32-bit ARM instruction set based processor, a conversion circuit is provided on the readout path of the LR register, and a modulo addition conversion model is employed in the conversion circuit. The return address is processed by the processor as follows:
1) The return address is stored in the LR register before the processing core executes a call instruction (e.g., calls a func function).
2) The return address is read from the LR register and stored in the stack of memory.
Because the conversion circuit is arranged on the read path of the LR register, the return address output by the LR register is converted by the conversion circuit to obtain a conversion return address, so that the actual pushed address is the conversion return address.
For convenience of description, the return address (i.e., the original return address) output by the LR register is denoted as a[31:0] in this embodiment. The converted return address produced by the conversion circuit (i.e., the return address that is actually pushed) is denoted as b[31:0]. The conversion circuit performs a modulo addition operation as shown in the following equation:
$$a + q \equiv b \pmod{p}$$
where p and q are the conversion parameters of the modulo addition conversion model. As the above equation shows, the conversion circuit adds the parameter q to the return address a modulo p, and the output is the converted return address b. Here q and p satisfy the following relationship:
$$q = \frac{p}{2}$$
It will be appreciated that many combinations of p and q satisfy the above relationship. For example, if $p = 2^{32} + 2$, then $q = 2^{31} + 1$.
In practical applications, p should be larger than the maximum code address but should not be too large, so as to save the chip area of the conversion circuit.
In this embodiment, the modulo addition conversion model operates on the 32-bit return address a[31:0] as a whole, that is, on the changed section and the unchanged section of the code address at the same time. As a result, an attacker cannot mount a targeted attack on the changed section alone, and a brute-force attempt is no longer guaranteed to land in the code area, which improves the security of the program control flow.
3) The function func is executed.
4) Before returning from the function func, the translated return address is read from memory and the popped translated return address is stored in the LR register.
5) The converted return address is read out of the LR register and output to the processing core, and a return operation is performed.
Since the conversion circuit is provided on the readout path of the LR register, the converted return address read out of the LR register first passes through the conversion circuit. The conversion circuit converts the converted return address once more to obtain the original return address, which is then output to the processing core. The conversion circuit performs the same modulo addition operation as follows:
$$b + q \equiv a \pmod{p}$$
where p and q are the conversion parameters of the modulo addition conversion model and are the same as the parameters used in the first conversion. The converted return address b is the input of the conversion model, and the return address a is its output.
As the above two conversions show, the return address is restored to its original value after the same conversion is applied twice.
If the converted return address b has not been maliciously tampered with in the stack of the memory, the return address a obtained after the conversion by the conversion circuit is the correct return address, and the program executes normally. If the converted return address b has been maliciously tampered with in the stack of the memory, the attacker, who cannot know the sensitive parameters p and q, cannot construct a valid malicious converted return address. The return address a obtained after the conversion is therefore a garbage value, and when the return operation is executed, the program tries to execute code at that garbage address, which produces a memory error with high probability. Thus, the attacker cannot achieve the purpose of changing the program control flow.
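The modulo addition embodiment can be traced in the same way, using the example parameters p = 2^32 + 2 and q = 2^31 + 1 from the text. The function name `convert`, the addresses and the list of edge values are assumptions added for illustration. Adding q twice adds p in total, which is why the second conversion undoes the first.

```python
P = 2**32 + 2    # modulus from the example above
Q = 2**31 + 1    # q = p / 2, so q + q = p

def convert(value):
    """Modulo addition conversion on the LR readout path; valid for 0 <= value < P."""
    return (value + Q) % P

return_address = 0x00010F34            # a: made-up original return address
pushed = convert(return_address)       # b: value actually pushed onto the stack
restored = convert(pushed)             # value delivered to the processing core

gadget_address = 0x00023AB0            # made-up address written by an attacker
after_tampering = convert(gadget_address)

# The round trip also holds at the edges of the value range.
edge_values = [0, 1, Q - 1, Q, P - 1]
round_trips = all(convert(convert(v)) == v for v in edge_values)
```

As in the modular multiplication case, an untouched stack slot is restored exactly, while a slot overwritten with a plain address is turned into a different value before it reaches the processing core.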
In another possible implementation, the conversion circuit groups at least one bit of the return address to obtain a plurality of bit groups; converting the bits in each bit group by using the conversion model to obtain a conversion result corresponding to each bit group, wherein at least two bit groups in the plurality of bit groups have different conversion models, or the bit groups have the same conversion model; and then, obtaining the conversion return address according to the conversion result corresponding to each bit group.
It is understood that there are various ways to group at least one bit of the return address, and this embodiment is not limited in this respect. Taking a 32-bit processor system as an example, the 32 bits of the return address may be divided into two bit groups, or may be divided into three bit groups, or may be divided into more bit groups.
For example, when the bits are divided into two groups, the first 16 bits may form one group and the last 16 bits the other; or the first 8 bits may form one group and the last 24 bits the other; or the odd-numbered bits may form one group and the even-numbered bits the other. Other groupings exist and are not listed here. The two bit groups may use the same conversion model or different ones: for example, both may use a modular multiplication conversion model, both may use a modular addition conversion model, or one may use a modular multiplication conversion model and the other a modular addition conversion model.
For example, when the bits are divided into three groups, the first 8 bits, the middle 16 bits, and the last 8 bits may each form a group; or the first 10 bits, the middle 8 bits, and the last 14 bits may each form a group; or bits 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31 may form one group, bits 2, 5, 8, 11, 14, 17, 20, 23, 26, 29 a second group, and bits 0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30 a third group. Other groupings exist and are not listed here. The three bit groups may use the same conversion model or different ones.
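The groupings described above can be expressed as partitions of the bit indices 0 to 31. The following Python sketch is only an illustration: the helper names `extract_group`, `scatter_group`, and `regroup` are hypothetical, and no conversion model is applied; it only shows that a 32-bit value can be split into bit groups and reassembled.

```python
from typing import List, Sequence

WIDTH = 32  # address width assumed in the 32-bit example


def extract_group(value: int, positions: Sequence[int]) -> int:
    """Pack the bits of `value` found at `positions` into a small integer (LSB first)."""
    out = 0
    for i, pos in enumerate(positions):
        out |= ((value >> pos) & 1) << i
    return out


def scatter_group(group: int, positions: Sequence[int]) -> int:
    """Inverse of extract_group: put the bits of `group` back at `positions`."""
    out = 0
    for i, pos in enumerate(positions):
        out |= ((group >> i) & 1) << pos
    return out


# Three example partitions of the bit indices 0..31.
halves = [list(range(0, 16)), list(range(16, 32))]
odd_even = [list(range(0, WIDTH, 2)), list(range(1, WIDTH, 2))]
mod_three = [[i for i in range(WIDTH) if i % 3 == r] for r in range(3)]


def regroup(value: int, partition: List[List[int]]) -> int:
    """Split `value` into bit groups and reassemble it unchanged."""
    return sum(scatter_group(extract_group(value, g), g) for g in partition)
```

A conversion circuit would apply a conversion model to each extracted group between the `extract_group` and `scatter_group` steps.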
The following description is given by way of example.
Assume that the processor is based on the 32-bit ARM instruction set and that the conversion circuit is placed on the write path of the LR register. The conversion circuit divides the 32 bits of the return address into two bit groups: one contains the odd-numbered bits of the return address and the other contains the even-numbered bits. One bit group uses a modular multiplication conversion model and the other uses a modular addition conversion model. The processor processes the return address as follows:
1) Before the processing core executes a call instruction (for example, a call to the function func), the return address is stored in the LR register. Since the conversion circuit is disposed on the write path of the register, the return address is first converted by the conversion circuit to obtain a converted return address, so what actually enters the LR register is the converted return address.
For convenience of description, the return address output by the processing core (i.e. the original return address) is denoted $a[31:0]$ in this embodiment. $a[31:0]$ is divided into two bit groups, $a_1[15:0]$ and $a_2[15:0]$, where
$a_1[15:0] \equiv \{a[31], a[29], a[27], \ldots, a[1]\}$
$a_2[15:0] \equiv \{a[30], a[28], a[26], \ldots, a[0]\}$
Correspondingly, the converted return address produced by the conversion circuit (i.e. the return address that actually enters the LR register) is denoted $b[31:0]$. $b[31:0]$ is divided into two bit groups, $b_1[15:0]$ and $b_2[15:0]$, where
$b_1[15:0] \equiv \{b[31], b[29], b[27], \ldots, b[1]\}$
$b_2[15:0] \equiv \{b[30], b[28], b[26], \ldots, b[0]\}$
When the conversion circuit converts the return address, it applies the modular multiplication conversion model to $a_1[15:0]$ and the modular addition conversion model to $a_2[15:0]$, as follows:
$a_1 \times q_1 \equiv b_1 \pmod{p_1}$
$a_2 + q_2 \equiv b_2 \pmod{p_2}$
where $p_1$ and $q_1$ are the conversion parameters of the modular multiplication conversion model, and $p_2$ and $q_2$ are the conversion parameters of the modular addition conversion model. According to these formulas, $a_1$ is multiplied by the circuit parameter $q_1$ modulo $p_1$ to give $b_1$, and $a_2$ is added to the circuit parameter $q_2$ modulo $p_2$ to give $b_2$.
Here $p_1$ and $q_1$ satisfy the following relationship:
$q_1^2 \equiv 1 \pmod{p_1}$
It will be appreciated that many combinations of $p_1$ and $q_1$ satisfy this relationship. For example, if $p_1 = 2^{16} + 8$, then $q_1$ may be chosen as any of the following:
65543, 60083, 54619, 49159, …, 5461
$p_2$ and $q_2$ satisfy the following relationship:
$q_2 \equiv p_2 / 2$
It will be appreciated that many combinations of $p_2$ and $q_2$ satisfy this relationship. For example, if $p_2 = 2^{32} + 6$, then $q_2 = 2^{31} + 3$.
In practice, $p_1$ and $p_2$ should be larger than the maximum code address, but should not be too large, in order to save chip area in the conversion circuit.
In this embodiment, the 32 bits of the return address a are divided into two groups by odd-numbered and even-numbered bit positions, where one group uses a modular multiplication conversion model and the other uses a modular addition conversion model. As a result, both the variable part and the fixed part of the code address are transformed, so an attacker cannot target the variable part alone and cannot be sure that any given brute-force attempt lands in the code region, which improves the security of the program's control flow.
2) The translated return address is read from the LR register and stored in the stack of memory.
3) The function func is executed.
4) Before returning from the function func, the converted return address is popped from the stack in memory and written to the LR register. Because the conversion circuit sits on the write path of the LR register, the popped value first passes through the conversion circuit, which converts it once more to obtain the original return address; the original return address is then stored in the LR register.
When the conversion circuit converts the converted return address, it applies the modular multiplication conversion model to $b_1[15:0]$ and the modular addition conversion model to $b_2[15:0]$, as follows:
$b_1 \times q_1 \equiv a_1 \pmod{p_1}$
$b_2 + q_2 \equiv a_2 \pmod{p_2}$
where $p_1$ and $q_1$ are the conversion parameters of the modular multiplication conversion model and are the same as those used in the first conversion, and $p_2$ and $q_2$ are the conversion parameters of the modular addition conversion model and are the same as those used in the first conversion.
As the two conversion processes above show, applying the same conversion twice restores the return address to its original value.
5) A return operation is performed.
If the converted return address b has not been maliciously tampered with on the stack in memory, the return address a produced by the conversion circuit is the correct return address, and the program executes normally. If b has been maliciously tampered with on the stack, the attacker does not know the secret parameters $p_1$, $q_1$, $p_2$, and $q_2$, so the return address a produced by the conversion circuit is effectively a garbage value. When the return operation is executed, the program jumps to that garbage address, which will most likely cause a memory error. The attacker therefore cannot redirect the program's control flow.
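Steps 1) to 5) can be sketched in Python as follows. The parameter values are the example values given in this embodiment; the function names and the sample address are hypothetical. The sketch keeps the two converted groups as separate integers and does not model how the circuit packs them back into a 32-bit word, which this passage does not spell out.

```python
from typing import Tuple

# Example parameters from the description; real hardware would keep them secret.
P1, Q1 = 2**16 + 8, 60083        # modular multiplication: Q1 * Q1 % P1 == 1
P2, Q2 = 2**32 + 6, 2**31 + 3    # modular addition:       2 * Q2 == P2


def split_odd_even(addr: int) -> Tuple[int, int]:
    """Return (a1, a2): a1 collects bits 31, 29, ..., 1; a2 collects bits 30, 28, ..., 0."""
    a1 = a2 = 0
    for i in range(16):
        a1 |= ((addr >> (2 * i + 1)) & 1) << i
        a2 |= ((addr >> (2 * i)) & 1) << i
    return a1, a2


def convert_groups(g1: int, g2: int) -> Tuple[int, int]:
    """One pass through the conversion: modular multiplication on g1, modular addition on g2."""
    return (g1 * Q1) % P1, (g2 + Q2) % P2


a1, a2 = split_odd_even(0x0800_1234)   # hypothetical original return address
b1, b2 = convert_groups(a1, a2)        # first pass, before the push
r1, r2 = convert_groups(b1, b2)        # second pass, after the pop

print((a1, a2), (b1, b2), (r1, r2))
```

Because `a1` is below `P1` and `a2` is below `P2`, the second pass returns exactly the original groups, so `(r1, r2)` equals `(a1, a2)`.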
Fig. 8 is a flowchart illustrating a return address processing method according to an embodiment of the present application. The method of the embodiment is executed by a processor, wherein the processor comprises: a processing core and a translation circuit. As shown in fig. 8, the method of the present embodiment includes:
S801: When the processing core outputs a return address, the conversion circuit converts the return address to obtain a converted return address, and the converted return address is output to a stack in memory.
S802: When the return address needs to be used, the conversion circuit performs the same conversion on the converted return address in the stack to obtain the return address, and the return address is output to the processing core.
In one possible implementation, the processor further includes a register. Converting the return address by the conversion circuit to obtain a converted return address and outputting the converted return address to a stack in memory includes:
converting the return address by the conversion circuit to obtain the converted return address, and outputting the converted return address to the register, so that the converted return address is output to the stack in memory via the register.
Performing the conversion on the converted return address in the stack by the conversion circuit to obtain the return address and outputting the return address to the processing core includes:
performing the conversion on the converted return address in the stack by the conversion circuit to obtain the return address, and outputting the return address to the register, so that the return address is output to the processing core via the register.
In one possible implementation, the processor further includes a register, and the processing core outputs the return address to the register. Converting the return address by the conversion circuit to obtain a converted return address and outputting the converted return address to a stack in memory includes:
converting, by the conversion circuit, the return address output by the register to obtain the converted return address, and outputting the converted return address to the stack in memory.
Performing the conversion on the converted return address in the stack by the conversion circuit to obtain the return address and outputting the return address to the processing core includes:
performing the conversion, by the conversion circuit, on the converted return address output by the register to obtain the return address, and outputting the return address to the processing core.
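The two register placements differ only in where the conversion sits relative to the register. The following Python sketch traces what the register and the stack hold in each case; it is a software stand-in under stated assumptions (a modulo-addition model with illustrative parameters, and hypothetical function names), not a description of the actual circuit.

```python
P = 2**32 + 6   # illustrative modulus (assumption)
Q = P // 2      # illustrative offset, chosen so that convert(convert(x)) == x


def convert(x: int) -> int:
    """Stand-in for the conversion circuit (modulo-addition model)."""
    return (x + Q) % P


def trace_write_path(ret_addr: int) -> dict:
    """Circuit before the register: core -> circuit -> register -> stack."""
    reg_on_call = convert(ret_addr)       # register receives the converted address
    stack_slot = reg_on_call              # pushed unchanged
    reg_on_return = convert(stack_slot)   # popped value is converted on its way in
    return {"register_on_call": reg_on_call, "stack": stack_slot,
            "to_core": reg_on_return}


def trace_read_path(ret_addr: int) -> dict:
    """Circuit after the register: core -> register -> circuit -> stack."""
    reg_on_call = ret_addr                # register holds the original address
    stack_slot = convert(reg_on_call)     # converted when read out for the push
    reg_on_return = stack_slot            # popped value is registered unchanged
    return {"register_on_call": reg_on_call, "stack": stack_slot,
            "to_core": convert(reg_on_return)}
```

In both traces the stack holds only the converted address and the processing core receives the original one; the placements differ in whether the register itself ever holds the original value.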
In one possible embodiment, the conversion satisfies the following condition:
B=IP(A),A=IP(B)
where A is the return address, B is the converted return address, and IP() is the conversion model used by the conversion.
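For the two classes of conversion model named in this application, the condition above reduces to a constraint on the parameter q. The short derivation below is a sketch that writes the models as $IP(A) \equiv A + q$ and $IP(A) \equiv A \cdot q$ modulo p, and assumes $0 \le A < p$:

```latex
\begin{aligned}
\text{modular addition:}\quad
  & IP(IP(A)) \equiv (A + q) + q \equiv A + 2q \pmod{p}, \\
  & \text{so } IP(IP(A)) = A \text{ for all } A \text{ when } 2q \equiv 0 \pmod{p},
    \text{ for example } q = p/2; \\[4pt]
\text{modular multiplication:}\quad
  & IP(IP(A)) \equiv (A \cdot q) \cdot q \equiv A \cdot q^{2} \pmod{p}, \\
  & \text{so } IP(IP(A)) = A \text{ for all } A \text{ when } q^{2} \equiv 1 \pmod{p}.
\end{aligned}
```

These are the same relationships between p and q that the worked example imposes on its parameters.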
In one possible embodiment, converting the return address to obtain a converted return address includes:
converting at least one bit of the return address using at least one conversion model to obtain the converted return address.
In a possible embodiment, converting at least one bit of the return address using at least one conversion model to obtain the converted return address includes:
grouping the bits of the return address to obtain a plurality of bit groups;
converting the bits in each bit group using a conversion model to obtain a conversion result for each bit group, where at least two of the bit groups use different conversion models or all of the bit groups use the same conversion model; and
obtaining the converted return address from the conversion results of the bit groups.
In a possible implementation manner, the number of the bit groups is two, where one bit group includes bits corresponding to odd bits of the return address, and the other bit group includes bits corresponding to even bits of the return address.
In a possible implementation, the types of conversion model include: a modular multiplication conversion model and a modular addition conversion model.
In one possible implementation, the register is a register for storing a return address.
In one possible implementation, the processor is an ARM instruction set based processor and the register is an LR register.
In one possible implementation, the processor is a RISC-V instruction set based processor and the register is an RA register.
The method for processing a return address provided in this embodiment may be applied to the processor described in any of the above embodiments, and the implementation principle and technical effect are similar, which are not described herein again.
An embodiment of the present application further provides an electronic device including a processor. The processor may adopt the structure of the processor in any of the above embodiments; the implementation principle and technical effect are similar and are not described herein again.
An embodiment of the present application further provides a chip including a processor. The processor may adopt the structure of the processor in any of the above embodiments; the implementation principle and technical effect are similar and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into modules is only a logical division, and other divisions are possible in practice: a plurality of modules may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or in another form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing unit, or each module may exist alone physically, or two or more modules may be integrated into one unit. The unit formed by the modules may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The integrated module implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute some steps of the methods according to the embodiments of the present application.
It should be understood that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the present application may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
The memory may comprise high-speed RAM and may further comprise non-volatile memory (NVM), such as at least one disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, an optical disk, or the like.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, the buses in the figures of the present application are not limited to only one bus or one type of bus.
The storage medium may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or magnetic or optical disks. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). Of course, the processor and the storage medium may also reside as discrete components in an electronic device or host device.

Claims (23)

1. A processor, comprising: a processing core and a conversion circuit;
the processing core is used for outputting a return address;
the conversion circuit is used for converting the return address output by the processing core to obtain a conversion return address and outputting the conversion return address to a stack in the memory;
the conversion circuit is further configured to perform the conversion on the conversion return address in the stack to obtain the return address when the processing core needs to use the return address, and output the return address to the processing core.
2. The processor of claim 1, further comprising a register;
the translation circuit is specifically configured to translate a return address output by the processing core to obtain a translation return address, and output the translation return address to the register, so that the translation return address is output to a stack in the memory via the register;
the translation circuit is further specifically configured to, when the processing core needs to use the return address, perform the translation on the translation return address in the stack to obtain the return address, and output the return address to the register, so that the return address is output to the processing core via the register.
3. The processor of claim 1, further comprising a register;
when the processing core outputs the return address, the register is used for registering the return address output by the processing core;
the conversion circuit is specifically configured to convert the return address output by the register to obtain a conversion return address, and output the conversion return address to a stack in a memory;
when the processing core needs to use the return address, the register is further used for registering the conversion return address output by the stack;
the conversion circuit is further specifically configured to perform the conversion on the converted return address output by the register to obtain the return address, and output the return address to the processing core.
4. A processor according to claim 2 or 3, wherein the conversion satisfies the condition:
B=IP(A),A=IP(B)
wherein A is the return address, B is the conversion return address, and IP() is the conversion model employed by the conversion.
5. The processor of claim 4, wherein the conversion circuit is specifically configured to:
and converting at least one bit of the return address by adopting at least one conversion model to obtain the converted return address.
6. The processor of claim 5, wherein the conversion circuit is specifically configured to:
grouping at least one bit of the return address to obtain a plurality of bit groups;
converting the bits in each bit group by using the conversion model to obtain a conversion result corresponding to each bit group, wherein at least two bit groups in the plurality of bit groups have different conversion models, or the bit groups have the same conversion model;
and obtaining the conversion return address according to the conversion result corresponding to each bit group.
7. The processor of claim 6, wherein the number of the bit groups is two, and wherein one bit group comprises bits corresponding to odd bits of the return address, and wherein another bit group comprises bits corresponding to even bits of the return address.
8. The processor according to any one of claims 4 to 7, wherein the types of the conversion model comprise: a modular multiplication conversion model and a modular addition conversion model.
9. The processor of any of claims 2 to 8, wherein the register is a register for storing a return address.
10. The processor of claim 9, wherein the processor is an ARM instruction set based processor and the register is an LR register.
11. The processor of claim 9, wherein the processor is a RISC-V instruction set based processor and the register is an RA register.
12. A method for processing a return address, applied to a processor, the processor comprising: a processing core and a translation circuit, the method comprising:
when the processing core outputs the return address, the return address is converted through the conversion circuit to obtain a conversion return address, and the conversion return address is output to a stack in a memory;
when the return address needs to be used, the conversion circuit performs the conversion on the conversion return address in the stack to obtain the return address, and the return address is output to the processing core.
13. The method of claim 12, wherein the processor further comprises a register, and wherein translating, by the translation circuitry, the return address to obtain a translated return address and outputting the translated return address to a stack in memory comprises:
converting the return address through the conversion circuit to obtain a conversion return address, and outputting the conversion return address to the register so that the conversion return address is output to a stack in the memory through the register;
the converting, by the conversion circuit, the converted return address in the stack to obtain the return address, and outputting the return address to the processing core includes:
and performing the conversion on the conversion return address in the stack through the conversion circuit to obtain the return address, and outputting the return address to the register so that the return address is output to the processing core through the register.
14. The method of claim 12, wherein the processor further comprises a register to which the return address is output by the processing core, wherein translating the return address by the translation circuitry to obtain a translated return address and outputting the translated return address to a stack in memory comprises:
converting the return address output by the register through the conversion circuit to obtain a conversion return address, and outputting the conversion return address to a stack in a memory;
the converting, by the conversion circuit, the converted return address in the stack to obtain the return address, and outputting the return address to the processing core includes:
and performing the conversion on the conversion return address output by the register through the conversion circuit to obtain the return address, and outputting the return address to the processing core.
15. The method according to claim 13 or 14, wherein the conversion satisfies the following condition:
B=IP(A),A=IP(B)
wherein A is the return address, B is the conversion return address, and IP() is the conversion model employed by the conversion.
16. The method of claim 15, wherein translating the return address to obtain a translated return address comprises:
and converting at least one bit of the return address by adopting at least one conversion model to obtain the converted return address.
17. The method of claim 16, wherein said converting at least one bit of the return address using at least one conversion model to obtain the converted return address comprises:
grouping at least one bit of the return address to obtain a plurality of bit groups;
converting the bits in each bit group by using the conversion model to obtain a conversion result corresponding to each bit group, wherein at least two bit groups in the plurality of bit groups have different conversion models, or the bit groups have the same conversion model;
and obtaining the conversion return address according to the conversion result corresponding to each bit group.
18. The method of claim 17, wherein the number of bit groups is two, and wherein one bit group comprises bits corresponding to odd bits of the return address, and wherein another bit group comprises bits corresponding to even bits of the return address.
19. The method according to any of claims 15 to 18, wherein the types of the conversion model comprise: a modular multiplication conversion model and a modular addition conversion model.
20. A method according to any one of claims 13 to 19, wherein the register is a register for storing a return address.
21. The method of claim 20, wherein the processor is an ARM instruction set based processor and the register is an LR register.
22. The method of claim 20, wherein the processor is a RISC-V instruction set based processor and the register is an RA register.
23. An electronic device comprising a processor as claimed in any one of claims 1 to 11.
CN201910586325.5A 2019-07-01 2019-07-01 Processor and return address processing method Active CN112181491B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910586325.5A CN112181491B (en) 2019-07-01 2019-07-01 Processor and return address processing method
PCT/CN2020/099168 WO2021000847A1 (en) 2019-07-01 2020-06-30 Processor and return address processing method

Publications (2)

Publication Number Publication Date
CN112181491A true CN112181491A (en) 2021-01-05
CN112181491B CN112181491B (en) 2024-09-24

Family

ID=73915579

Country Status (2)

Country Link
CN (1) CN112181491B (en)
WO (1) WO2021000847A1 (en)

Also Published As

Publication number Publication date
CN112181491B (en) 2024-09-24
WO2021000847A1 (en) 2021-01-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant