
CN115202892B - Memory expansion system and memory expansion method of cryptographic coprocessor - Google Patents

Memory expansion system and memory expansion method of cryptographic coprocessor

Info

Publication number
CN115202892B
CN115202892B (application CN202211118592.8A)
Authority
CN
China
Prior art keywords
memory
coprocessor
secret computing
host
page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211118592.8A
Other languages
Chinese (zh)
Other versions
CN115202892A (en
Inventor
邵乐希
蓝晏翔
王嘉平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Digital Economy Academy IDEA
Original Assignee
International Digital Economy Academy IDEA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Digital Economy Academy IDEA filed Critical International Digital Economy Academy IDEA
Priority to CN202211118592.8A priority Critical patent/CN115202892B/en
Publication of CN115202892A publication Critical patent/CN115202892A/en
Application granted granted Critical
Publication of CN115202892B publication Critical patent/CN115202892B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G06F9/5022Mechanisms to release resources

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Storage Device Security (AREA)

Abstract

The invention relates to the technical field of confidential computing coprocessors, and in particular to a memory expansion system and a memory expansion method for a confidential computing coprocessor. The invention first calculates, from the storage capacity required by the application the coprocessor is to run, the size of the memory that must be expanded. This size is then sent to the host, which partitions off part of its own memory to serve as memory for the confidential computing coprocessor and thereby assist its operation. The invention thus extends host memory into coprocessor memory according to the amount of memory the coprocessor actually needs, realizing dynamic expansion of the coprocessor's memory. In addition, because the host is what supplies the expanded memory, the speed at which the coprocessor accesses the expanded memory is increased.

Description

Memory expansion system and memory expansion method of cryptographic coprocessor
Technical Field
The invention relates to the technical field of confidential computing coprocessors, and in particular to a memory expansion system and a memory expansion method for a confidential computing coprocessor.
Background
With the development of coprocessor technology, a CPU can offload specific tasks to a coprocessor (here, a confidential computing coprocessor), which processes them and frees CPU capacity for upper-layer applications. Common coprocessors today include GPUs (graphics processing units) and DPUs (data processing units). A coprocessor is generally small, and its memory resources are fixed when it leaves the factory; if memory proves insufficient in use, the problem is generally solved by purchasing a new coprocessor.
In summary, the memory of existing confidential computing coprocessors is fixed and difficult to expand.
Thus, there is a need for improvements and enhancements in the art.
Disclosure of Invention
To solve this technical problem, the invention provides a memory expansion system and a memory expansion method for a confidential computing coprocessor, addressing the problem that the memory of existing confidential computing coprocessors is fixed and difficult to expand.
To realize this purpose, the invention adopts the following technical scheme:
In a first aspect, the present invention provides a memory expansion system for a confidential computing coprocessor, comprising the following components:
the confidential computing coprocessor, which is the coprocessor whose memory is to be expanded;
a controller, bidirectionally electrically connected to the confidential computing coprocessor, which determines the memory the coprocessor needs to expand according to the storage capacity required by the application the coprocessor runs;
and a host, bidirectionally electrically connected to the controller, which provides the confidential computing coprocessor with the memory to be expanded.
In one implementation, the confidential computing coprocessor includes the following components:
a block device kernel module, used to receive requests to expand the memory;
and a first communication module, electrically connected to the block device kernel module and bidirectionally electrically connected to the controller, used to transmit memory expansion requests to the controller.
In one implementation, the controller includes the following components:
a protocol processing module, bidirectionally electrically connected to the first communication module, used to forward memory expansion requests to the host and to return the host's responses to those requests to the first communication module;
a page management module, electrically connected to the host, used to manage the expanded memory the host provides to the confidential computing coprocessor;
and an encryption and decryption module, bidirectionally electrically connected to the first communication module and to the host respectively, used to encrypt data the coprocessor needs to write to the host and to decrypt data the coprocessor needs to read from the host.
In one implementation, the host includes the following components:
a second communication module, electrically connected to the protocol processing module, the page management module, and the encryption and decryption module respectively, used to let the host communicate with the controller;
and a memory management kernel module, electrically connected to the second communication module and the page management module respectively, used to manage the expanded memory allocated to the confidential computing coprocessor.
In a second aspect, an embodiment of the present invention further provides a memory expansion method for a confidential computing coprocessor, the method comprising:
acquiring the storage capacity required by the application the confidential computing coprocessor runs, denoted the required memory capacity;
obtaining, according to the required memory capacity, the resident memory the host should allocate to the confidential computing coprocessor;
and sending memory information describing the resident memory to the confidential computing coprocessor to complete the memory expansion of the coprocessor.
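The three method steps above can be sketched as a minimal illustrative model; all function and structure names here are hypothetical, not taken from the patent:

```python
def required_capacity(app_bytes: int, spu_free_bytes: int) -> int:
    """Step 1: memory the host must contribute; zero if the SPU has enough."""
    return max(0, app_bytes - spu_free_bytes)

def allocate_resident_memory(host_free_bytes: int, needed_bytes: int) -> int:
    """Step 2: the host pins a resident region at least as large as needed."""
    if needed_bytes > host_free_bytes:
        raise MemoryError("host cannot satisfy the expansion request")
    return needed_bytes

def expand(app_bytes: int, spu_free_bytes: int, host_free_bytes: int) -> dict:
    """Step 3: return the memory info handed back to the coprocessor."""
    needed = required_capacity(app_bytes, spu_free_bytes)
    granted = allocate_resident_memory(host_free_bytes, needed)
    return {"expanded_bytes": granted}
```

For instance, an application needing 100 units while the SPU has 40 free and the host has 200 free yields an expansion of 60 units; a coprocessor that already has enough memory triggers no expansion at all, which is the "dynamic" aspect the patent claims.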
In one implementation, obtaining, according to the required memory capacity, the resident memory the host should allocate to the confidential computing coprocessor includes:
transmitting the required memory capacity to the memory management kernel module of the host, and obtaining the resident memory the host allocates to the confidential computing coprocessor under the control of the memory management kernel module.
In one implementation, sending the memory information of the resident memory to the confidential computing coprocessor to complete the memory expansion includes:
receiving, through the second communication module of the host, the total number of memory pages and the memory address information in the memory information transmitted by the memory management kernel module;
establishing a page management structure for the received total page count and memory address information, where the page management structure records the correspondence between memory pages and memory address information, and the memory pages together constitute the resident memory allocated to the confidential computing coprocessor;
and, after the page management structure is established, sending the total page count and the memory address information through the first communication module of the confidential computing coprocessor to the block device kernel module, which is located on the coprocessor, to complete the memory expansion.
In one implementation, after sending the memory information of the resident memory to the confidential computing coprocessor to complete the memory expansion, the method further includes:
receiving the memory address information and the write content corresponding to a target page the confidential computing coprocessor requests to write, where the target page lies within the memory pages the host allocated to the coprocessor;
obtaining, from the memory address information, the IOMMU address located on the host;
and sending the write content and the IOMMU address to the host.
In one implementation, sending the write content and the IOMMU address to the host includes:
encrypting the write content to obtain encrypted write content;
and sending the IOMMU address and the encrypted write content to the host via remote direct memory access (RDMA).
In one implementation, after sending the write content and the IOMMU address to the host, the method further includes:
receiving a write feedback instruction indicating that the host has finished writing the write content into the target page;
and, after receiving the write feedback instruction and upon learning that the current memory reported by the confidential computing coprocessor exceeds the set memory threshold, sending the coprocessor an instruction to unload the swap device, where the swap device is located on the coprocessor and is used to access the host.
In one implementation, after sending the write content and the IOMMU address to the host, the method further includes:
destroying the write content after a destroy request is received;
and sending a destroy instruction to the host after the write content is destroyed.
In one implementation, before the write content is destroyed upon receipt of the destroy request, the method further includes:
controlling the host to write the write content held on the host back to the confidential computing coprocessor;
and controlling the confidential computing coprocessor to save the written-back content.
In one implementation, after sending the memory information of the resident memory to the confidential computing coprocessor to complete the memory expansion, the method further includes:
receiving the target page the confidential computing coprocessor requests to read;
obtaining, from the memory address information and according to the requested target page, the IOMMU address on the host corresponding to that page;
sending the IOMMU address corresponding to the requested target page to the host;
receiving the read content sent by the host, where the read content resides on the target page corresponding to the requested IOMMU address;
and sending the received read content to the confidential computing coprocessor.
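The read steps above can be modeled as a small lookup-and-fetch routine; the names, and the plain-dict stand-ins for the page table and host memory, are illustrative assumptions rather than the patent's implementation:

```python
def handle_read(page_no, page_table, host_memory, decrypt):
    """Map a requested page to its host IOMMU address, fetch the stored
    (encrypted) content, and return the decrypted bytes to the SPU."""
    iommu_addr = page_table[page_no]      # controller-side address lookup
    ciphertext = host_memory[iommu_addr]  # host returns the page content
    return decrypt(ciphertext)            # controller decrypts for the SPU
```

With a trivial identity `decrypt`, `handle_read(0, {0: 0x1000}, {0x1000: b"data"}, lambda c: c)` returns `b"data"`; in the patent's system the decryption step is where the controller's encryption and decryption module sits.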
In a third aspect, an embodiment of the present invention further provides a memory expansion device for a confidential computing coprocessor, the device comprising the following components:
a data processing module, used to run the application program of the confidential computing coprocessor and to acquire the storage capacity the application requires, denoted the required memory capacity;
and a communication management module, used to manage the communication of the data processing module and to manage the expanded resident memory obtained according to the required memory capacity.
Beneficial effects: the invention first obtains the size of the memory to be expanded from the storage capacity required by the application the confidential computing coprocessor runs. That size is then sent to the host, which partitions off part of its own memory to serve as memory for the coprocessor and assist its operation. The invention thus extends host memory into coprocessor memory according to the amount the coprocessor requires: however much memory the coprocessor needs to expand, that much the host allocates to it, realizing dynamic expansion of the coprocessor's memory. In addition, it is the host, rather than a solid-state disk, that expands the coprocessor's memory, and the coprocessor accesses the host far faster than it would access a solid-state disk, so using the host raises the speed at which the coprocessor accesses the expanded memory.
Drawings
FIG. 1 is a diagram of a memory expansion system according to the present invention;
FIG. 2 is a flow chart of expanding the memory according to the present invention.
Detailed Description
The technical solution of the present invention is described clearly and completely below with reference to the embodiments and the drawings. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
It has been found that, with the development of coprocessor technology, the CPU can offload specific tasks to a coprocessor (here, a confidential computing coprocessor), which processes them and frees CPU capacity for upper-layer applications. Common coprocessors today include GPUs (graphics processing units) and DPUs (data processing units). A coprocessor is generally small, and its memory resources are fixed when it leaves the factory; if memory proves insufficient in use, the problem is generally solved by purchasing a new coprocessor. In other words, the memory of existing confidential computing coprocessors is fixed and difficult to expand.
To solve this technical problem, the invention provides a memory expansion system and a memory expansion method for a confidential computing coprocessor. In specific implementation, the method first obtains the storage capacity required by the application the coprocessor runs, then obtains, according to that required capacity, the resident memory the host should allocate to the coprocessor, and finally sends the memory information of the resident memory to the coprocessor to complete the memory expansion.
For example, when an application to be run on the confidential computing coprocessor needs more memory than the coprocessor currently has, this embodiment sends the host a request to expand the coprocessor's memory, and the host partitions off part of its own memory for the coprocessor's use, achieving the goal of expanding the coprocessor's memory. Moreover, because the amount of host memory partitioned off follows the memory dynamically required by the application to be run, the memory is expanded dynamically.
Exemplary System
A memory expansion system for a confidential computing coprocessor, as shown in FIG. 1, includes the following components:
a confidential computing coprocessor (SPU), which is the coprocessor whose memory is to be expanded;
a controller (FPGA), bidirectionally electrically connected to the SPU, which determines the memory the SPU needs to expand according to the storage capacity required by the application the SPU runs;
and a HOST, bidirectionally electrically connected to the controller, which provides the SPU with the memory to be expanded.
The following describes the secret computing coprocessor, the controller and the host computer respectively:
the secret computing coprocessor comprises a block device kernel module and a first communication module electrically connected with the block device kernel module.
The controller comprises a protocol processing module which is in bidirectional electric connection with a first communication module (PCIE communication module), a page management module which is in electric connection with the host machine, and an encryption and decryption module which is in bidirectional electric connection with the first communication module and the host machine respectively.
The host machine comprises a second communication module (PCIE communication module) which is respectively and electrically connected with the protocol processing module, the page management module and the encryption and decryption module, and a memory management kernel module which is respectively and electrically connected with the second communication module and the page management module.
When the memory of the SPU needs to be expanded, a request to expand the memory is first input to the block device kernel module and forwarded through the first communication module to the protocol processing module. The protocol processing module sends the SPU's expansion request to the second communication module of the host, which passes it to the memory management kernel module. The memory management kernel module applies for a portion of the host's resident memory, sized according to the expansion the SPU requested, to serve as the SPU's expanded memory. At the same time, the memory management kernel module transmits the IOMMU address information of that resident memory through the second communication module to the page management module of the FPGA. The page management module uniformly manages, for the resident memory the host allocated to the SPU, the page number of each memory page, the IOMMU address information corresponding to each page, and each page's check value (page numbers, IOMMU addresses, and check values correspond one to one), and sends only the page numbers through the first communication module to the block device kernel module of the SPU, completing the SPU's memory expansion. In this embodiment, only the page numbers are sent to the SPU while the IOMMU address information and check values are stored in the controller, because a page number suffices to find the corresponding IOMMU address and check value in the FPGA, and hence the corresponding memory page on the host; when the SPU needs to read or write a target memory page on the host, that page can be located from its page number alone.
Meanwhile, storing only page numbers on the SPU reduces the SPU's memory footprint, so the SPU's task-processing performance can be fully exploited.
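A minimal model of the page management structure kept on the controller is sketched below: it maps each page number to its IOMMU address and check value, neither of which ever leaves the FPGA side. The class and method names are hypothetical:

```python
class PageTable:
    """page number -> (IOMMU address, check value), held on the controller."""

    def __init__(self):
        self._pages = {}

    def register(self, page_no: int, iommu_addr: int, check_value: bytes = b""):
        """Record a page of the resident memory the host allocated."""
        self._pages[page_no] = (iommu_addr, check_value)

    def lookup(self, page_no: int):
        """Resolve a page number received from the SPU."""
        return self._pages[page_no]

    def set_check(self, page_no: int, check_value: bytes):
        """Update the check value after a page is (re)written."""
        addr, _ = self._pages[page_no]
        self._pages[page_no] = (addr, check_value)
```

Only the keys of this table (the page numbers) are ever transmitted to the SPU, which is what keeps the SPU's memory footprint small.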
After the SPU memory expansion is complete, the SPU can read and write the resident memory the host allocated to it (denoted the expanded memory):
when the SPU needs to read the content on the extended memory of the host, firstly, a read request and a page code are input to a block device kernel module through an operating system of the SPU, the block device kernel module sends the read request and the page code to a page management module of the FPGA through a first communication module, a protocol processing module finds corresponding IOMMU address information and a check value from the page management module according to the page code, then the protocol processing module finds a target page in a resident memory on the host according to the IOMMU address information, then reads the content of the target page to an encryption and decryption module, the encryption and decryption module decrypts the read content according to the check value, and the encryption and decryption module sends the decrypted content to a storage unit of the SPU through the first communication module.
When the SPU needs to write content to the expanded memory, the SPU's operating system inputs a write request, the page number of the target page, and the content to be written to the SPU's block device kernel module. The block device kernel module sends these to the protocol processing module of the controller, which queries the page management module for the DMA (direct memory access) address corresponding to the page number, obtains from it the corresponding IOMMU (input/output memory management unit) address, and thereby locates the target page. The protocol processing module passes the content to the encryption and decryption module for encryption and writes the encrypted content to the target page on the host.
In this embodiment, all page content stored on the HOST side during SPU reads and writes is encrypted, and a check value kept in the FPGA is used to detect whether the data has been tampered with, ensuring the integrity and security of the content. In addition, the FPGA, rather than the SPU, controls the SPU's read and write operations, which reduces the SPU memory those operations occupy and thereby improves the SPU's task-processing performance.
In this embodiment, the functions of the confidential computing coprocessor (SPU), the controller (FPGA), and the HOST are as follows:
Block device kernel module of the SPU: implements block device read and write operations at the operating system's page size, and uses the operating system's swap memory mechanism to treat HOST-side memory as the SPU's expanded memory.
When the SPU executes a read, it packs the target page number into the FPGA communication protocol according to the operating system's read request, sends the read request to the FPGA, and waits for the FPGA's reply.
When the SPU executes a write, it packs the target page number and the page content into the FPGA communication protocol according to the operating system's write request and sends the write request to the FPGA, which writes it to the corresponding HOST memory address.
PCIE communication module (first communication module) of the SPU: responsible for communication between the SPU and the FPGA.
Protocol processing module of the controller FPGA: during initialization, forwards the SPU side's memory request protocol to the HOST side and returns the HOST side's response to the SPU. During operation, extracts from the content the SPU transmits the HOST page number to be operated on, queries the page management module for the corresponding DMA address, and reads or writes that DMA address according to whether the request is a read or a write.
Page management module of the controller FPGA: manages the HOST memory pages, including each page's number, its corresponding IOMMU address, and its check value.
Encryption and decryption module of the controller FPGA: for a read request, decrypts the page content read from the HOST and verifies it against the check value stored in the page management module; for a write request, encrypts the incoming page content (using AES-GCM-256) and stores the check value in the page management module.
PCIE communication module (second communication module) of the HOST: together with the communication module in the SPU, realizes communication between the host and the FPGA.
Memory management kernel module of the HOST: during initialization, receives the SPU side's memory request, applies for host-side resident memory according to the requested size, and transmits the IOMMU address information to the FPGA.
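The encryption and decryption module's write/read round trip can be sketched as follows. The patent specifies AES-GCM-256, whose authentication tag naturally serves as the check value; since the Python standard library has no AES, this sketch substitutes an HMAC-SHA256 tag purely to illustrate the check-value mechanism, and the key and function names are hypothetical:

```python
import hmac
import hashlib

KEY = b"controller-device-key"  # hypothetical key held inside the FPGA

def write_page(content: bytes):
    """On a write: compute the check value stored in the page management
    module. (A real implementation would store AES-GCM-256 ciphertext on
    the host, not plaintext.)"""
    tag = hmac.new(KEY, content, hashlib.sha256).digest()
    return content, tag

def read_page(stored: bytes, tag: bytes) -> bytes:
    """On a read: verify the check value before returning content to the SPU."""
    expected = hmac.new(KEY, stored, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("page content failed the integrity check")
    return stored
```

Because the tag stays in the FPGA while the content sits on the HOST, any HOST-side tampering is caught when the page is read back, which is the tamper-detection property the embodiment describes.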
Exemplary method
The memory expansion method for a confidential computing coprocessor can be applied to a terminal device, which may be a terminal product with computing capability, such as a computer. In this embodiment, as shown in FIG. 2, the method specifically includes the following steps:
s100, acquiring the memory storage capacity required by the secret computing coprocessor to run the application, and recording the memory storage capacity as the required memory capacity.
In this embodiment, each time an application is run by the secret computing coprocessor SPU, the capacity of the SPU that needs to be occupied by the application is computed, and when the SPU memory is insufficient, the resident memory of the corresponding capacity size on the host is divided into the SPU for the capacity that needs to be occupied by each application. In one embodiment, the amount of resident memory partitioned by the host into the SPUs is greater than the amount of memory required by the SPU to run the application to ensure that the SPU can run the application quickly.
S200, obtaining, according to the required memory capacity, the resident memory the host should allocate to the confidential computing coprocessor.
In this embodiment, the SPU is plugged into the HOST via a PCIE module (plug connector), so the SPU and the HOST can be connected conveniently.
In this embodiment, a block device kernel module is installed inside the SPU. The user inputs the required memory capacity to the block device kernel module through the SPU's operating system; the block device kernel module transmits it through the first communication module to the controller, which encrypts it and sends it on to the host; and the host's memory management kernel module allocates resident memory of corresponding capacity according to the required memory capacity. For example, if the required memory capacity is 10 MB and each memory page holds 3 MB, the memory management kernel module must direct the host to partition off 4 memory pages as the SPU's expanded memory, and those 4 pages constitute the resident memory.
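The 10 MB / 3 MB example above is a ceiling division; a hypothetical helper:

```python
def pages_needed(required_bytes: int, page_bytes: int) -> int:
    """Smallest number of whole pages covering the required capacity."""
    return -(-required_bytes // page_bytes)  # integer ceiling division
```

`pages_needed(10, 3)` yields 4, matching the example (the units cancel, so the arithmetic is the same whether the quantities are in MB or bytes).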
In one embodiment, the required memory capacity is transmitted to the memory management kernel module of the host, and the resident memory the host allocates to the confidential computing coprocessor is obtained under the control of the memory management kernel module. In this embodiment, the protocol processing module of the controller receives the required memory capacity from the SPU's block device kernel module and transmits it through PCIE communication to the memory management kernel module of the HOST. The memory management kernel module applies to the HOST's memory management unit for resident memory matching the required capacity to serve as the SPU's expanded memory; that is, the resident memory is reserved for the SPU alone, which guarantees the memory the SPU's applications demand and prevents the SPU's data and the HOST's internal data from interfering with each other.
S300, the memory information of the resident memory is sent to the secret computing coprocessor to complete the memory expansion of the secret computing coprocessor.
Step S300 includes the following steps:
S301, receiving, through the second communication module of the host, the total number of memory pages and the memory address information in the memory information transmitted by the memory management kernel module.
S302, a page management structure is established for the received total number of pages of the memory and the memory address information, where the page management structure is used to record a corresponding relationship between memory pages and the memory address information, and each memory page constitutes a resident memory allocated to the secret computing coprocessor.
S303, after the page management structure is established, sending the total number of memory pages and the memory address information to a block device kernel module through the first communication module of the secret computing coprocessor to complete the memory expansion of the secret computing coprocessor, where the block device kernel module is located on the secret computing coprocessor.
In this embodiment, after the page management structure is established in the controller, the total number of memory pages and the memory address information are both sent to the SPU. When the SPU needs to perform a read or write operation, it directly supplies the memory address information of the target memory page and the corresponding read or write content, so that the operation can be performed directly on the target memory page of the host.
In another embodiment, the memory address information includes the IOMMU address of each memory page in the host and the page number of the memory page. The page management structure records the correspondence among the IOMMU address, the page number, and the check value, and is managed by the starting and ending page addresses and the total number of pages, that is, starting page address + offset. After the page management structure is successfully established, the total number of memory pages is sent to the SPU. When the SPU needs to execute a read or write operation, it sends the page number of the target page to the controller; the controller finds the corresponding IOMMU address in the page management structure according to the page number and then executes the read or write operation on the target memory page at that IOMMU address.
In step S200, the memory management kernel module controls the host to allocate the resident memory corresponding to the SPU, expanding the SPU's memory. The memory management kernel module then sends the total number of memory pages in the applied resident memory, the IOMMU address of each memory page in the host, and the page number of each memory page to the page management module of the controller through the second communication module. The page management module establishes a page management structure, which records the total number of memory pages in the resident memory and the DMA address of each memory page (each DMA address corresponds to one IOMMU address). When the SPU needs to execute a read or write operation on a memory page of the host, the SPU sends the page number of that memory page to the controller; the controller finds the DMA address according to the page number and then the IOMMU address according to the DMA address. After the page management structure is completed, the controller replies success to the block device kernel module, so that the SPU can subsequently use the host memory as its swap device to execute read and write operations.
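The page management structure described above can be modelled as a small lookup table keyed by page number. The following sketch is illustrative only: the class and field names are assumptions, and a single IOMMU address per page stands in for the DMA/IOMMU address pairing.

```python
class PageManagementStructure:
    """Toy model of the controller's page table: page number -> IOMMU address (+ tag slot)."""

    def __init__(self, start_iommu_addr: int, page_size: int, total_pages: int):
        # "starting page address + offset": address = start + page_no * page_size
        self.entries = {
            page_no: {"iommu_addr": start_iommu_addr + page_no * page_size, "tag": None}
            for page_no in range(total_pages)
        }

    def lookup(self, page_no: int) -> int:
        """What the controller does when the SPU sends a page number."""
        return self.entries[page_no]["iommu_addr"]


pms = PageManagementStructure(start_iommu_addr=0x100000, page_size=0x1000, total_pages=4)
assert pms.lookup(0) == 0x100000
assert pms.lookup(3) == 0x103000  # start address plus three page offsets
```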
Completing the SPU memory expansion through steps S100 to S300 completes the initialization of the SPU. Thereafter, when the SPU executes read and write operations and its own memory runs short, the operating system triggers paging: less frequently used memory pages are stored into the swap partition backed by the host, vacating more memory pages for the SPU to use. When a memory page stored in the swap partition is needed again, it is loaded from the swap partition back to the SPU for use.
In one embodiment, when the SPU performs a write operation (i.e., writes the contents in the SPU to the host), it includes the following steps S401 to S404:
S401, receiving the memory address information and the write content corresponding to a target page that the secret computing coprocessor requests to write, where the target page is located in the memory pages allocated to the secret computing coprocessor by the host.
In this embodiment, the memory address information refers to the page number corresponding to the IOMMU address. After receiving the write request, the block device kernel module of the SPU encapsulates the FPGA communication protocol, writes the page number of the target page and the page content to be written into the FPGA communication packet, and sends the packet to the FPGA module through the PCIE communication module. The block device kernel module then replies to the SPU's operating system that the write request has completed. The FPGA of the controller receives the page number and the write content contained in the packet sent by the SPU.
S402, according to the memory address information, obtaining the IOMMU address located on the host machine in the memory address information.
Because the page management structure of the FPGA stores the correspondence between page numbers and IOMMU addresses, the IOMMU address can be found from the page number.
And S403, encrypting the written content to obtain the encrypted written content.
S404, sending the IOMMU address and the encrypted write content to the host machine through remote direct memory access (RDMA).
The encryption and decryption module of the FPGA encrypts the write content using AES-GCM-256, obtains the encryption tag, stores the tag in the corresponding page information of the page management module, and obtains the IOMMU address of the page on the HOST side. The encrypted page content (the write content after encryption) is written to the target memory page on the HOST side using RDMA. When the memory is released, the information in the page management module is destroyed and the memory management kernel module of the host is notified to release the memory. Here, releasing means that the FPGA clears the page content corresponding to the tag in the page management module and sends a memory release request to the host so that the host frees the occupied memory, preventing information related to the write content from leaking.
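The write path of steps S401 to S404 can be sketched as a toy model. All names are illustrative assumptions: a keyed XOR stream stands in for AES-GCM-256, a hash stands in for the GCM tag, and a dictionary keyed by IOMMU address stands in for the RDMA write to host memory.

```python
import hashlib

def keystream_xor(data: bytes, key: bytes, page_no: int) -> bytes:
    """Toy cipher standing in for AES-GCM-256 (XOR is its own inverse)."""
    ks = hashlib.sha256(key + page_no.to_bytes(8, "little")).digest()
    return bytes(b ^ ks[i % len(ks)] for i, b in enumerate(data))

def spu_write(page_table, tags, host_memory, page_no, plaintext, key):
    """Models S401-S404: resolve the IOMMU address, encrypt, keep the tag, write to host."""
    iommu_addr = page_table[page_no]                     # S402: page number -> IOMMU address
    ciphertext = keystream_xor(plaintext, key, page_no)  # S403: encrypt on the controller
    tags[page_no] = hashlib.sha256(ciphertext).digest()  # tag stays on the controller side
    host_memory[iommu_addr] = ciphertext                 # S404: stands in for the RDMA write

page_table = {1: 0x101000}
tags, host_memory = {}, {}
spu_write(page_table, tags, host_memory, 1, b"secret page", b"key")
assert host_memory[0x101000] != b"secret page"  # the host only ever sees ciphertext
assert keystream_xor(host_memory[0x101000], b"key", 1) == b"secret page"
```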
In one embodiment, when the SPU performs a read operation (i.e., reads content from the host to the SPU), the following steps S501 to S505 are included:
S501, receiving the target page that the secret computing coprocessor requests to read.
When the operating system of the SPU inputs, through the block device kernel module, the page number of the target page on the host to be read, the block device kernel module encapsulates the FPGA communication protocol after receiving the read request and writes the page number of the target page into the FPGA communication packet. The packet is sent to the page management module of the FPGA through the PCIE communication module (the first communication module), and the block device kernel module waits for a reply from the FPGA module.
S502, according to the target page requested to be read, obtaining the IOMMU address on the host machine corresponding to the target page in the memory address information.
After the page management module of the FPGA receives the page number of the target page requested to be read, the page management structure is consulted to obtain the IOMMU address and the check value of the target page on the HOST side.
S503, sending the IOMMU address corresponding to the target page requested to be read to the host.
S504, receiving the read content sent by the host, wherein the read content is located on the target page corresponding to the IOMMU address requested to be read.
S505, sending the received read content to the secret computing coprocessor.
The page management module of the FPGA sends the IOMMU address of the target page to the host; the host receives the IOMMU address to be read and sends the content of the target page at that IOMMU address to the FPGA, and the FPGA writes the content to the SPU's memory page. (When the SPU sends the read request, it also sends a pointer to its local target memory page; the FPGA writes the read content to that local target memory page according to the pointer, ensuring that the read content is stored in the memory page specified by the SPU.)
In another embodiment, the FPGA decrypts the encrypted content read from the target memory page of the host, verifies it against the check value stored in the FPGA, writes the decrypted content and the decryption result into the FPGA communication packet, and returns the packet to the SPU side. The block device kernel module of the SPU reads the packet replied by the FPGA through the PCIE communication module; if decryption succeeded, it copies the decrypted target memory page content to the local target memory page specified by the SPU's operating system and replies to the operating system that the read request has completed. If decryption failed, it returns a read request execution failure to the operating system.
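The read path of steps S501 to S505, including the check-value verification and the failure reply, can be modelled in the same toy style (the keyed XOR again stands in for AES-GCM-256, and a `None` return stands for the read request execution failure; all names are illustrative):

```python
import hashlib

def keystream_xor(data: bytes, key: bytes, page_no: int) -> bytes:
    """Toy cipher standing in for AES-GCM-256 (XOR is its own inverse)."""
    ks = hashlib.sha256(key + page_no.to_bytes(8, "little")).digest()
    return bytes(b ^ ks[i % len(ks)] for i, b in enumerate(data))

def spu_read(page_table, tags, host_memory, page_no, key):
    """Models S501-S505: resolve the IOMMU address, fetch, check the tag, decrypt.

    Returns the plaintext, or None to model 'read request execution failure'."""
    iommu_addr = page_table[page_no]              # S502: page number -> IOMMU address
    ciphertext = host_memory[iommu_addr]          # S503-S504: fetch from the host
    if hashlib.sha256(ciphertext).digest() != tags[page_no]:
        return None                               # check value mismatch -> failure reply
    return keystream_xor(ciphertext, key, page_no)  # S505: decrypted content to the SPU

# Set-up mirrors a prior write: ciphertext and its tag already on host/controller.
page_table = {2: 0x102000}
ct = keystream_xor(b"page contents", b"key", 2)
tags = {2: hashlib.sha256(ct).digest()}
host_memory = {0x102000: ct}
assert spu_read(page_table, tags, host_memory, 2, b"key") == b"page contents"
host_memory[0x102000] = b"tampered"               # simulate host-side corruption
assert spu_read(page_table, tags, host_memory, 2, b"key") is None
```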
In one embodiment, when the SPU side finishes using the memory of the HOST, or the HOST side needs to reclaim the memory allocated to the SPU, the block device kernel module of the SPU determines whether the SPU's current memory is sufficient, and if so, unloads the swap device. This embodiment includes the following process:
receiving a write feedback instruction indicating that the host has finished writing the write content into the target page; after receiving the write feedback instruction, and upon learning that the current memory fed back by the secret computing coprocessor is larger than the set memory, sending an instruction to unload the swap device to the secret computing coprocessor, where the swap device is located on the secret computing coprocessor and is used to access the host.
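The unload decision above reduces to a threshold comparison; a minimal sketch under assumed names:

```python
def should_unload_swap(current_free_mb: int, set_memory_mb: int) -> bool:
    """Unload the swap device only when the SPU's own free memory exceeds the set memory."""
    return current_free_mb > set_memory_mb

assert should_unload_swap(512, 256) is True   # enough local memory -> unload
assert should_unload_swap(128, 256) is False  # still short -> keep the swap device
```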
In another embodiment, the host is controlled to write the write content in the host back to the confidential computing coprocessor; the controller controls the secret computing coprocessor to save the written-back content, and when the controller receives a destroy request, the write content is destroyed; after destroying the write content, the controller sends a destroy instruction to the host.
In this embodiment, the operating system of the SPU writes the memory pages that are still valid on the HOST back to local memory pages of the SPU, for the following reasons:
Writing the valid memory pages of the HOST back to the SPU avoids data loss caused by the HOST destroying the data. The memory of the HOST is a memory expansion area of the SPU: the SPU can move less-accessed memory pages (such as those of an inactive daemon process) to the HOST to free up more local memory resources for computing tasks. The valid memory pages stored in the HOST memory are still in use, so when the SPU side finishes a computing task and needs to release the memory on the HOST side, the valid memory pages of the HOST must first be written back into the SPU; otherwise, a later access to these resources (for example, when the inactive daemon process is triggered) would cause a system exception. Writing the pages back to the SPU's local memory prevents such exceptions.
After the operating system of the SPU writes the valid HOST-side memory pages back to local memory pages of the SPU, the block device kernel module on the SPU side fills the destroy request (the request for releasing resources) into an FPGA communication packet according to the protocol. After receiving the destroy request, the FPGA clears all page contents via RDMA writes, clears the page management structure, encrypts the release request (the destroy instruction), and sends it to the second communication module on the HOST side for decryption; the memory management kernel module then releases the occupied resident memory according to the decrypted destroy instruction.
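The release sequence (write-back, RDMA zero-write, clearing the page management structure, and notifying the host) can be sketched as follows. The dictionaries and the returned string are illustrative stand-ins for the real structures and the encrypted destroy instruction:

```python
def release_extended_memory(page_table, tags, host_memory, spu_local):
    """Teardown sketch: write still-valid host pages back to SPU-local storage,
    scrub the host pages (the RDMA zero-write), clear the page management
    structure, and return the release request passed on to the host."""
    for page_no, iommu_addr in list(page_table.items()):
        content = host_memory.get(iommu_addr)
        if content is not None:
            spu_local[page_no] = content                      # write-back to the SPU
            host_memory[iommu_addr] = b"\x00" * len(content)  # clear page content via RDMA write
    page_table.clear()  # page management structure is emptied
    tags.clear()
    return "release-memory"  # stands for the encrypted destroy instruction

page_table = {0: 0x100000}
tags = {0: b"tag"}
host_memory = {0x100000: b"live page"}
spu_local = {}
msg = release_extended_memory(page_table, tags, host_memory, spu_local)
assert spu_local[0] == b"live page"                # valid page written back first
assert host_memory[0x100000] == b"\x00" * 9        # host copy scrubbed
assert page_table == {} and tags == {} and msg == "release-memory"
```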
In summary, the present invention first obtains the size of the memory to be expanded according to the storage capacity required by the secret computing coprocessor to run its application. The size of the memory to be expanded is then sent to the host, and the host sets aside part of its internal memory as memory for the confidential computing coprocessor, to assist its operation. From the above analysis, the present invention expands the host's memory into memory for the secret computing coprocessor according to the storage amount the coprocessor requires: however much memory the secret computing coprocessor needs to expand, that much memory the host allocates to it, thereby realizing dynamic expansion of the coprocessor's memory. In addition, the host, rather than a solid state disk, is used to expand the memory of the secret computing coprocessor; since the speed at which the coprocessor accesses the host is far higher than the speed of accessing a solid state disk, the speed at which the coprocessor accesses its expanded memory is improved.
In addition, the memory of the confidential computing coprocessor SPU is expanded using memory on the HOST side: when the SPU's memory is insufficient, its less frequently used memory pages can be placed in the extended memory on the HOST side, achieving the effect of expanding the SPU's memory. The SPU can flexibly use the HOST side's memory resources to accelerate computation and release them back to the HOST when they are not in use. Because the access speed of memory and the speed of PCIE communication are far higher than the access speed of a solid state disk, the influence of this method on the SPU's running efficiency is smaller than when a solid state disk is used as swap memory. Meanwhile, the solid state disk on the coprocessor is itself a limited resource, so the invention saves coprocessor manufacturing cost while affecting coprocessor processing efficiency as little as possible.
Exemplary devices
The embodiment also provides a memory expansion device of the cryptographic coprocessor, which comprises the following components:
the data processing module is used for running the application program of the confidential calculation coprocessor and acquiring the memory storage capacity required by the application program, and recording the memory storage capacity as the required memory capacity;
and the communication management module is used for managing the communication of the data processing module and managing the extended resident memory obtained according to the required memory capacity.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (11)

1. A memory expansion system of a cryptographic coprocessor is characterized by comprising the following components:
the secret computing coprocessor is a coprocessor to be expanded;
the controller is in bidirectional electrical connection with the secret computing coprocessor and is used for obtaining the memory which needs to be expanded by the secret computing coprocessor according to the memory storage capacity required by the running application of the secret computing coprocessor;
the host machine is electrically connected with the controller in a bidirectional way and is used for providing a memory needing to be expanded for the confidential calculation coprocessor;
the secret computing coprocessor comprises the following components:
the block device kernel module is used for inputting a request for expanding the memory;
the first communication module is electrically connected with the block device kernel module, is bidirectionally electrically connected with the controller and is used for transmitting a request for expanding the memory to the controller;
the controller comprises the following components:
the protocol processing module is in bidirectional electrical connection with the first communication module and is used for transmitting the request of the extended memory to the host and transmitting the request response of the host to the extended memory to the first communication module;
and the page management module is electrically connected with the host machine and is used for managing the extended memory provided by the host machine for the confidential calculation coprocessor.
2. The memory expansion system of a secret computing coprocessor of claim 1, wherein the controller further comprises an encryption and decryption module, the encryption and decryption module being respectively in bidirectional electrical connection with the first communication module and the host machine, and being configured to encrypt data that the secret computing coprocessor needs to write to the host machine, and decrypt data that the secret computing coprocessor needs to read from the host machine.
3. The memory expansion system of a secret computing coprocessor of claim 2, said host comprising the following components:
the second communication module is respectively and electrically connected with the protocol processing module, the page management module and the encryption and decryption module and is used for enabling the host machine to be communicated with the controller;
and the memory management kernel module is electrically connected with the second communication module and the page management module respectively and is used for managing the expanded memory distributed to the confidential calculation coprocessor.
4. A memory expansion method of a cryptographic coprocessor is characterized by comprising the following steps:
acquiring the memory storage capacity required by the running application of the secret computing coprocessor, and recording the memory storage capacity as the required memory capacity;
obtaining resident memory which the host computer should allocate to the secret computing coprocessor according to the required memory capacity;
sending the memory information of the resident memory to the confidential calculation coprocessor to complete the memory expansion of the confidential calculation coprocessor;
obtaining the resident memory which the host computer should allocate to the secret computing coprocessor according to the required memory capacity comprises the following steps:
transmitting the required memory capacity to a memory management kernel module of the host machine to obtain a resident memory which is controlled by the memory management kernel module and distributed to the confidential calculation coprocessor by the host machine;
the sending the memory information of the resident memory to the secret computing coprocessor to complete the memory expansion of the secret computing coprocessor comprises the following steps:
receiving the total page number and the memory address information of the memory in the memory information transmitted by the memory management kernel module through the second communication module of the host;
establishing a page management structure aiming at the received total memory page number and the memory address information, wherein the page management structure is used for recording the corresponding relation between memory pages and the memory address information, and each memory page forms a resident memory allocated to the secret computing coprocessor;
after the page management structure is established, the total page number of the memory and the memory address information are sent to a block device kernel module through a first communication module of the secret computing coprocessor to complete memory expansion of the secret computing coprocessor, and the block device kernel module is located on the secret computing coprocessor.
5. The method of claim 4, wherein said memory resident information is sent to said secret computing coprocessor to complete memory expansion of said secret computing coprocessor, and thereafter further comprising:
receiving the memory address information and the write-in content corresponding to a target page requested to be written in by the secret computing coprocessor, wherein the target page is located in the memory page allocated to the secret computing coprocessor by the host;
according to the memory address information, obtaining an IOMMU address located on the host machine in the memory address information;
sending the write content and the IOMMU address to the host.
6. The method of memory expansion for a secret computing coprocessor of claim 5, wherein said sending said write content and said IOMMU address to said host comprises:
encrypting the written content to obtain the encrypted written content;
and sending the IOMMU address and the encrypted write content to the host machine through a remote direct data access method.
7. The method of memory expansion for a secret computing coprocessor of claim 5, wherein said sending said write content and said IOMMU address to said host, further comprises:
receiving a writing feedback instruction for the host machine to write the writing content into a target page;
and after receiving the write feedback instruction and when receiving that the current memory fed back by the confidential calculation coprocessor is larger than a set memory, sending an instruction for unloading switching equipment to the confidential calculation coprocessor, wherein the switching equipment is positioned on the confidential calculation coprocessor and is used for accessing the host machine.
8. The method of memory expansion for a secret computing coprocessor of claim 5, wherein said sending said write content and said IOMMU address to said host, further comprises:
after a destroy request is received, destroying the written content;
and after the written content is destroyed, a destruction instruction is sent to the host machine.
9. The method for memory expansion of a secret computing coprocessor according to claim 8, wherein said destroying said written content after said receiving a destroy request further comprises:
controlling the host machine to write back the written content in the host machine to the secret computing coprocessor;
controlling the secret computing coprocessor to save the written content written back.
10. The method of claim 4, wherein said sending said memory resident memory information to said secret computing coprocessor completes the memory expansion of said secret computing coprocessor, and thereafter further comprising:
receiving a target page requested to be read by the secret computing coprocessor;
according to the target page requested to be read, obtaining an IOMMU address on the host machine, corresponding to the target page, in the memory address information;
sending the IOMMU address corresponding to the target page requested to be read to the host machine;
receiving read content sent by the host, wherein the read content is positioned on a target page corresponding to the IOMMU address requested to be read;
sending the received read content to the secret computing coprocessor.
11. A memory expansion device of a cryptographic coprocessor is characterized by comprising the following components:
the data processing module is used for running an application program of the secret computing coprocessor and acquiring the memory storage capacity required by the application program, and recording the memory storage capacity as the required memory capacity;
the communication management module is used for managing the communication of the data processing module and managing the extended resident memory obtained according to the required memory capacity;
transmitting the required memory capacity to a memory management kernel module of a host machine to obtain a resident memory which is controlled by the memory management kernel module and distributed to the confidential calculation coprocessor by the host machine;
receiving the total page number and the memory address information of the memory in the memory information transmitted by the memory management kernel module through the second communication module of the host machine;
establishing a page management structure aiming at the received total memory page number and the memory address information, wherein the page management structure is used for recording the corresponding relation between memory pages and the memory address information, and each memory page forms a resident memory allocated to the secret computing coprocessor;
after the page management structure is established, the total number of pages of the memory and the memory address information are sent to a block device kernel module through a first communication module of the confidential calculation coprocessor to complete memory expansion of the confidential calculation coprocessor, and the block device kernel module is located on the confidential calculation coprocessor.
CN202211118592.8A 2022-09-15 2022-09-15 Memory expansion system and memory expansion method of cryptographic coprocessor Active CN115202892B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211118592.8A CN115202892B (en) 2022-09-15 2022-09-15 Memory expansion system and memory expansion method of cryptographic coprocessor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211118592.8A CN115202892B (en) 2022-09-15 2022-09-15 Memory expansion system and memory expansion method of cryptographic coprocessor

Publications (2)

Publication Number Publication Date
CN115202892A CN115202892A (en) 2022-10-18
CN115202892B true CN115202892B (en) 2022-12-23

Family

ID=83571978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211118592.8A Active CN115202892B (en) 2022-09-15 2022-09-15 Memory expansion system and memory expansion method of cryptographic coprocessor

Country Status (1)

Country Link
CN (1) CN115202892B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117950921B (en) * 2024-03-20 2024-07-23 新华三信息技术有限公司 Memory fault processing method, memory expansion control device, electronic device and medium

Citations (7)

Publication number Priority date Publication date Assignee Title
CN102576299A (en) * 2009-09-10 2012-07-11 先进微装置公司 Systems and methods for processing memory requests
CN113051072A (en) * 2021-03-02 2021-06-29 长沙景嘉微电子股份有限公司 Memory management method, device, system and computer readable storage medium
CN113312182A (en) * 2021-07-27 2021-08-27 阿里云计算有限公司 Cloud computing node, file processing method and device
CN114185818A (en) * 2022-02-15 2022-03-15 摩尔线程智能科技(北京)有限责任公司 GPU memory access adaptive optimization method and device based on extended page table
CN114625536A (en) * 2022-03-15 2022-06-14 北京有竹居网络技术有限公司 Video memory allocation method, device, medium and electronic equipment
CN114880130A (en) * 2022-07-11 2022-08-09 中国科学技术大学 Method, system, device and storage medium for breaking memory limitation in parallel training
CN114968567A (en) * 2022-05-17 2022-08-30 北京百度网讯科技有限公司 Method, apparatus and medium for allocating computing resources of a compute node

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US20150091912A1 (en) * 2013-09-27 2015-04-02 Nvidia Corporation Independent memory heaps for scalable link interface technology
CN103744736B (en) * 2014-01-09 2018-10-02 深圳Tcl新技术有限公司 The method and Linux terminal of memory management
US10339052B2 (en) * 2015-03-11 2019-07-02 Centileo Llc Massive access request for out-of-core textures by a parallel processor with limited memory
CN105095138B (en) * 2015-06-29 2018-05-04 中国科学院计算技术研究所 A kind of method and apparatus for extending isochronous memory bus functionality
US9823871B2 (en) * 2015-10-09 2017-11-21 Oracle International Corporation Performance of coprocessor assisted memset() through heterogeneous computing
CN114385516A (en) * 2020-10-21 2022-04-22 澜起科技股份有限公司 Computing system and method for sharing device memory of different computing devices

Patent Citations (7)

Publication number Priority date Publication date Assignee Title
CN102576299A (en) * 2009-09-10 2012-07-11 先进微装置公司 Systems and methods for processing memory requests
CN113051072A (en) * 2021-03-02 2021-06-29 长沙景嘉微电子股份有限公司 Memory management method, device, system and computer readable storage medium
CN113312182A (en) * 2021-07-27 2021-08-27 阿里云计算有限公司 Cloud computing node, file processing method and device
CN114185818A (en) * 2022-02-15 2022-03-15 摩尔线程智能科技(北京)有限责任公司 GPU memory access adaptive optimization method and device based on extended page table
CN114625536A (en) * 2022-03-15 2022-06-14 北京有竹居网络技术有限公司 Video memory allocation method, device, medium and electronic equipment
CN114968567A (en) * 2022-05-17 2022-08-30 北京百度网讯科技有限公司 Method, apparatus and medium for allocating computing resources of a compute node
CN114880130A (en) * 2022-07-11 2022-08-09 中国科学技术大学 Method, system, device and storage medium for breaking memory limitation in parallel training

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
"ASIP acceleration for virtual-to-physical address translation on RDMA-enabled FPGA-based network interfaces";Roberto Ammendola等;《Future Generation Computer Systems》;20150112;第53卷;第109-118页 *
"FPGA-based hardware acceleration for local complexity analysis of massive genomic data";Agathoklis Papadopoulos等;《Integration》;20121108;第46卷(第03期);第230-239页 *
"基于FPGA的异构计算研究及实现";李青峰;《中国优秀硕士学位论文全文数据库 信息科技辑》;20210515(第05期);I135-426 *
"面向深度学习平台的内存管理器的设计与实现";王显宏;《中国优秀硕士学位论文全文数据库 信息科技辑》;20190115(第01期);I137-154 *
基于图形处理单元的数字全息图加速再现算法研究;丁鹤平等;《中国激光》;20101110;第37卷(第11期);第2901-2905页 *

Also Published As

Publication number Publication date
CN115202892A (en) 2022-10-18

Similar Documents

Publication Publication Date Title
EP3049944B1 (en) Block storage apertures to persistent memory
US20180218156A1 (en) Encryption and Decryption Method and Apparatus in Virtualization System, and System
US11847225B2 (en) Blocking access to firmware by units of system on chip
US11455430B2 (en) Secure element and related device
US20070223688A1 (en) Architecture of an encryption circuit implementing various types of encryption algorithms simultaneously without a loss of performance
KR102331926B1 (en) Operation method of host system including storage device and operation method of storage device controller
CN109558372B (en) Apparatus and method for secure processor
CN112906075A (en) Memory sharing method and device
CN110554911A (en) Memory access and allocation method, memory controller and system
CN111090869A (en) Data encryption method, processor and computer equipment
CN115202892B (en) Memory expansion system and memory expansion method of cryptographic coprocessor
CN114237817A (en) Virtual machine data reading and writing method and related device
CN112698789B (en) Data caching method, device, equipment and storage medium
US11734197B2 (en) Methods and systems for resilient encryption of data in memory
US11269549B2 (en) Storage device and command processing method
CN111949372B (en) A virtual machine migration method, general-purpose processor and electronic device
CN118886043A (en) Data transmission method, device, equipment and storage medium
US20220035957A1 (en) Methods and systems for data backup and recovery on power failure
CN113127896A (en) Data processing method and device based on independent encryption chip
CN116048716A (en) Direct storage access method and device and related equipment
US8468493B2 (en) Information processing apparatus, information processing method, and program
CN115718707A (en) Data transmission method and device, computer equipment and storage medium
CN111859420A (en) Data encryption method and device, data decryption method and device and storage medium
US20240184931A1 (en) Storage device, operating method thereof, and system for providing safe storage space between application and storage device on application-by-application basis
CN116226870B (en) Security enhancement system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20221018

Assignee: Shenzhen Qiangji Computing Technology Co.,Ltd.

Assignor: Guangdong Hong Kong Macao Dawan District Digital Economy Research Institute (Futian)

Contract record no.: X2023980045750

Denomination of invention: A Memory Expansion System and Method for a Confidential Computing Coprocessor

Granted publication date: 20221223

License type: Exclusive License

Record date: 20231103