The present application claims priority from Israel Patent Application 289,002, filed on December 14, 2021, the disclosure of which is incorporated herein by reference.
Disclosure of Invention
There is provided, in accordance with an embodiment of the present disclosure, a secure distributed processing system including a plurality of nodes connected by a network and configured to process a plurality of tasks, wherein each respective one of the nodes includes: a respective processor for processing data of respective ones of the tasks; and a respective network interface controller (NIC) configured to: connect with other ones of the nodes over the network; store task master keys used in calculating communication keys to secure data transmission over the network for respective ones of the tasks; calculate respective task and node pair specific communication keys for securing communication with respective ones of the nodes over the network for respective ones of the tasks in response to respective ones of the task master keys and node specific data of the respective node pairs; and securely communicate, over the network, processed data of the respective ones of the tasks with the respective ones of the nodes in response to the respective task and node pair specific communication keys.
Further in accordance with an embodiment of the present disclosure, each respective one of the tasks is processed as a respective distributed process by more than one of the nodes.
According to an embodiment of the disclosure, respective ones of the tasks are performed for respective tenants.
Further in accordance with an embodiment of the present disclosure, the respective processors include graphics processing units (GPUs) configured to process data of respective ones of the tasks for respective ones of the tenants.
Further in accordance with an embodiment of the present disclosure, one of the nodes in each of the respective node pairs is configured to generate a respective random number, the node specific data of each of the respective node pairs including the respective random number.
According yet further to an embodiment of the disclosure, the node specific data of the respective node pair comprises address information of the respective node pair.
Further, according to an embodiment of the present disclosure, the respective NIC is configured to calculate the task and node pair specific communication keys in response to establishing a new connection with a respective one of the nodes over the network, such that for each new connection with a respective one of the nodes, the respective NIC is configured to calculate a corresponding new task and node pair specific communication key.
Further in accordance with an embodiment of the present disclosure, the respective NIC is configured to: establish a first connection with a given one of the nodes; calculate a first task and node pair specific communication key for the first connection in response to a first random number; securely communicate with the given one of the nodes in response to the first task and node pair specific communication key; tear down the first connection; establish a second connection with the given one of the nodes; calculate a second task and node pair specific communication key for the second connection in response to a second random number different from the first random number; and securely communicate with the given one of the nodes in response to the second task and node pair specific communication key.
Further in accordance with an embodiment of the present disclosure, the respective NIC is configured to: generate the first random number in response to a first connection request from the given one of the nodes; and generate the second random number in response to a second connection request from the given one of the nodes.
Furthermore, according to an embodiment of the present disclosure, the respective NIC is configured to: calculate the first task and node pair specific communication key for the first connection in response to the first random number and address information of the respective node of the respective NIC and the given one of the nodes; and calculate the second task and node pair specific communication key for the second connection in response to the second random number and the address information of the respective node of the respective NIC and the given one of the nodes.
Further in accordance with an embodiment of the present disclosure, the respective NIC is configured to: reserve hardware resources to establish a connection with a given one of the nodes in response to a request from the given one of the nodes; and in response to unsuccessful decryption of data received from the given one of the nodes, cancel the reservation of the hardware resources after a given timeout.
According to another embodiment of the present disclosure, there is also provided a secure distributed processing method including: processing data of respective ones of a plurality of tasks; connecting with other ones of a plurality of nodes connected by a network; storing task master keys used in calculating communication keys to secure data transmission over the network for respective ones of the tasks; calculating respective task and node pair specific communication keys for securing communication with respective ones of the nodes over the network for respective ones of the tasks in response to respective ones of the task master keys and node specific data of the respective node pairs; and securely communicating, over the network, processed data of the respective ones of the tasks with the respective ones of the nodes in response to the respective task and node pair specific communication keys.
In accordance with further embodiments of the present disclosure, each respective one of the tasks is processed as a respective distributed process by more than one of the nodes.
Further in accordance with an embodiment of the present disclosure, respective ones of the tasks are performed by a graphics processing unit (GPU) for a respective tenant.
Furthermore, according to an embodiment of the present disclosure, the method includes generating a respective random number for each of the respective node pairs, the node specific data of each of the respective node pairs including the respective random number.
Further in accordance with an embodiment of the present disclosure, the node-specific data of the respective node pair includes address information of the respective node pair.
In accordance with yet another embodiment of the disclosure, the calculating includes calculating the task and node pair specific communication keys in response to establishing new connections with respective ones of the nodes over the network, such that for each new connection with a respective one of the nodes, a corresponding new task and node pair specific communication key is calculated.
Further in accordance with an embodiment of the present disclosure, the method includes: establishing a first connection with a given one of the nodes; calculating a first task and node pair specific communication key for the first connection in response to a first random number; securely communicating with the given one of the nodes in response to the first task and node pair specific communication key; tearing down the first connection; establishing a second connection with the given one of the nodes; calculating a second task and node pair specific communication key for the second connection in response to a second random number different from the first random number; and securely communicating with the given one of the nodes in response to the second task and node pair specific communication key.
Furthermore, according to an embodiment of the present disclosure, the method includes: generating the first random number in response to a first connection request from the given one of the nodes; and generating the second random number in response to a second connection request from the given one of the nodes.
Further in accordance with an embodiment of the present disclosure, calculating the first task and node pair specific communication key includes calculating the first task and node pair specific communication key for the first connection in response to the first random number and address information of the respective node of the respective NIC and the given one of the nodes; and calculating the second task and node pair specific communication key includes calculating the second task and node pair specific communication key for the second connection in response to the second random number and the address information of the respective node of the respective NIC and the given one of the nodes.
According still further to an embodiment of the disclosure, the method includes: reserving hardware resources to establish a connection in response to a request from a given one of the nodes; and in response to unsuccessful decryption of data received from the given one of the nodes, canceling the reservation of the hardware resources after a given timeout.
Detailed Description
Summary
A feature of high-performance processing applications is that jobs or tasks are distributed to a plurality of servers or processing nodes, which process the jobs or tasks in a distributed manner. While the jobs or tasks are being processed, the processing nodes communicate with each other. Such communication is typically not one-to-one, but many-to-many or all-to-all between the nodes. Handling a task may involve thousands or tens of thousands of connections. To secure the communication, data exchanged between the nodes is typically encrypted using one or more appropriate keys.
If all nodes use the same key, security is poor because the key may be discovered with known attacks. If different keys are used between different pairs of nodes, e.g., using a secure key-sharing algorithm such as Diffie-Hellman, then establishing a cryptographic key between each pair of nodes would require a significant amount of memory, complexity, and processing time. For example, since each secure connection includes state, communicating with N end nodes would require the processing node to use a large amount of memory to save the N states.
Accordingly, embodiments of the present invention address the above-described problems by offloading data security to the processing nodes. A master key is securely distributed to each processing node for each task or job. Thus, each processing node receives and stores a set of task master keys, one for each task. When a node needs to communicate with another node for a given task, a task and node pair specific communication key is calculated by each node based on the master key for the given task and node pair specific data. The resulting key is thus specific to the given task and the given node pair. The node pair specific data may comprise a random number generated by one of the nodes of the communicating node pair and/or may be based on address information of the node pair. The node pair may then securely communicate using the calculated task and node pair specific communication key. Once communication between the node pair is complete, the calculated key is discarded. New communication for the same node pair or a different node pair typically results in a new corresponding task and node pair specific communication key being calculated by the corresponding node pair.
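The key lifecycle described above, storing one master key per task, deriving a connection key on demand, and discarding it when communication completes, may be sketched as follows. This is a non-limiting illustration, not the NIC implementation: the class name, the dictionary bookkeeping, and the use of HMAC-SHA256 as the derivation function are all assumptions for the sketch.

```python
import hmac
import hashlib

class NodeKeyStore:
    """Illustrative per-node key store: one master key per task, with
    task and node pair specific keys derived per connection and
    discarded when the connection completes."""

    def __init__(self, task_master_keys):
        # task_master_keys: dict mapping task ID -> master key bytes
        self._masters = dict(task_master_keys)
        self._active = {}  # (task_id, peer) -> derived communication key

    def derive(self, task_id, node_pair_data):
        # Derive a task and node pair specific communication key from the
        # task master key and the node pair specific data (random number
        # and/or address information), here via HMAC-SHA256.
        return hmac.new(self._masters[task_id], node_pair_data,
                        hashlib.sha256).digest()

    def open_connection(self, task_id, peer, node_pair_data):
        key = self.derive(task_id, node_pair_data)
        self._active[(task_id, peer)] = key
        return key

    def close_connection(self, task_id, peer):
        # Once communication completes, the calculated key is discarded.
        self._active.pop((task_id, peer), None)
```

Note that derivation is deterministic given the same master key and node pair data, which is what lets both nodes of a pair compute the same key independently.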
For example, node A may send a request to node B to communicate securely for a given task. The request may include an index of the master key for the given task or an identity of the given task. Node B may then generate a random or pseudo-random number and respond to the request with the random number. Node A and node B may then calculate a task and node pair specific communication key based on the master key for the given task and the generated random number, and optionally based on address information of node A and/or node B. Node A may then encrypt data of the given task using the calculated task and node pair specific communication key. The data is sent to node B, which decrypts the data using the calculated task and node pair specific communication key for the given task.
InfiniBand (IB) uses dynamic connections (DC) to dynamically connect node pairs, using fewer resources than static connections (e.g., InfiniBand reliable connections) by associating hardware resources with DC connections on demand. When a dynamic connection is established, there is a brief handshake, including a request and an acknowledgement. When the connection is completed, the hardware resources are released for use by another connection. In addition to establishing connections between node pairs, the DC mechanism may be extended to pass a newly generated random number per dynamic connection, thereby enabling node pairs to generate a task and node pair specific communication key per dynamic connection. Since the communication key is refreshed for each connection, the key is protected from replay attacks, and therefore there is no need to store the information that would normally be kept in a cryptographic state to prevent replay attacks. In fact, the cryptographic state may be cleared at each connection, thereby reducing the state data that needs to be stored by the node. IB DC keeps a connection open only while the node is transmitting data. Embodiments of the present invention thus require a node to maintain a cryptographic state and IB DC state only for each active connection.
Embodiments of the present invention may be implemented without using IB DC. For example, other dynamic types of connections may be used, or static connections may be used. In some embodiments, multiple connections (e.g., non-IB DC connections) in the same security domain may share a key.
System description
Reference is now made to FIG. 1, which is a block diagram illustration of a secure distributed processing system 10 constructed and operative in accordance with an embodiment of the present invention. The secure distributed processing system 10 includes a plurality of nodes 12 (only four shown for simplicity) connected by a network 14, the plurality of nodes configured to process a plurality of tasks 16. In some embodiments, each task 16 may be processed by more than one node 12 as a respective distributed process. Each node 12 includes a processor 18 for processing data of a respective one of the tasks 16, and a network interface controller (NIC) 20, described in more detail with reference to Figs. 2-5. Processor 18 may include a CPU 26 and/or a GPU 28.
The secure distributed processing system 10 includes a coordination node 22 that generates a respective task master key 24 for each task 16 and distributes the task master keys 24 to each node 12.
In some embodiments, the task 16 may be performed for the tenant 30. For example, nodes 12 may process data for different tenants 30, e.g., different companies, that lease processing space in the secure distributed processing system 10, so each node 12 may process data for different tenants 30 at the same time or at different times. One of the tasks 16 may represent one, some, or all of the processes performed for the respective tenant 30. In other words, in some embodiments, all processes of a given tenant may be categorized as the same task or job. For one of nodes 12, GPU 28 and/or CPU 26 of that node 12 is configured to process data for a respective one of tasks 16 of a respective one of tenants 30.
In practice, some or all of the functionality of the processor 18 may be combined in a single physical component or, alternatively, implemented using multiple physical components. These physical components may include hardwired or programmable devices, or a combination of both. In some embodiments, at least a portion of the functions of the processor 18 may be performed by a programmable processor under the control of suitable software. For example, the software may be downloaded to the device in electronic form over a network. Alternatively, or in addition, the software may be stored in a tangible, non-transitory computer readable storage medium, such as optical, magnetic, or electronic memory.
Graphics processing units (GPUs) are used to generate three-dimensional (3D) and two-dimensional (2D) graphical objects for various applications, including thematic sheets, computer games, virtual reality (VR) and augmented reality (AR) experiences, mechanical designs, and/or the like. Modern GPUs include texture processing hardware for generating surface appearances, referred to herein as "surface textures", for 3D objects in a 3D graphics scene. Texture processing hardware applies a surface appearance to a 3D object by "wrapping" the appropriate surface texture onto the 3D object. This process of generating and applying surface textures to 3D objects results in a highly realistic appearance of these 3D objects in a 3D graphics scene.
Texture processing hardware is configured to execute various texture-related instructions, including texture operations and texture loads. Texture processing hardware accesses texture information by generating memory references (referred to herein as "queries") to texture memory. Texture processing hardware retrieves texture information from texture memory, such as when rendering object surfaces in a 3D graphics scene for display on a display device, when rendering a 2D graphics scene, or during compute operations.
The surface texture information includes texture elements (referred to herein as "texels") used to texture or shade object surfaces in a 3D graphics scene. The texture processing hardware and associated texture caches are optimized for efficient, high-throughput, read-only access, to support the high demand for texture information during graphics rendering with little or no write operations. In addition, texture processing hardware includes specialized functional units to perform various texture operations, such as level-of-detail (LOD) computation, texture sampling, and texture filtering.
In general, texture operations involve querying multiple texels around a particular point of interest in 3D space, and then performing various filtering and interpolation operations to determine the final color of the point of interest. In contrast, texture loads typically query individual texels and return them directly to the user application for further processing. Because filtering and interpolation operations typically involve querying four or more texels per processing thread, texture processing hardware is conventionally built to accommodate the generation of multiple queries per thread. For example, texture processing hardware may be built to perform up to four texture memory queries within a single memory cycle. In this way, texture processing hardware is able to query and receive most or all of the required texture information in one memory cycle.
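The four-texel query-and-filter pattern described above corresponds to classic bilinear filtering, which may be sketched in software as follows. This is a simplified illustrative model only; actual texture hardware performs these steps per thread in dedicated functional units, and the function name and texture representation are assumptions of the sketch.

```python
def bilinear_sample(texture, u, v):
    """Query the four texels surrounding (u, v) and blend them.

    texture: 2D list of scalar texel values, indexed [row][col].
    u, v: continuous coordinates in texel space.
    """
    x0, y0 = int(u), int(v)
    x1 = min(x0 + 1, len(texture[0]) - 1)  # clamp at the texture edge
    y1 = min(y0 + 1, len(texture) - 1)
    fx, fy = u - x0, v - y0
    # The four texel queries (hardware may issue these in one memory cycle).
    t00, t10 = texture[y0][x0], texture[y0][x1]
    t01, t11 = texture[y1][x0], texture[y1][x1]
    # Interpolate horizontally, then vertically.
    top = t00 * (1 - fx) + t10 * fx
    bottom = t01 * (1 - fx) + t11 * fx
    return top * (1 - fy) + bottom * fy
```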
Referring now to FIG. 2, there is a flowchart 200 including steps in a method of operation of the system 10 of FIG. 1. The method described with reference to Fig. 2 is described for one of the nodes 12, in terms of its processor 18 and network interface controller 20.
The processor 18 is configured to process (block 202) data of respective ones of the tasks 16. The network interface controller 20 is configured to store (block 204) the task master keys 24 used in calculating communication keys to secure data transmission through the network 14 for respective ones of the tasks 16. The network interface controller 20 is configured to connect (block 206) to other nodes 12 over the network 14.
In some embodiments, for each respective pair of nodes attempting to establish a respective connection, the network interface controller 20 of one of the nodes of the respective pair is configured to generate (block 208) a respective random number (randomly or pseudo-randomly) for each received connection request. For example, for a first connection between a first pair of nodes, one node in the first pair generates a first random number, and for a second connection between a second pair of nodes, one node in the second pair generates a second random number, and so on.
The network interface controller 20 is configured to calculate (block 210) respective task and node pair specific communication keys to secure communication with respective ones of the nodes 12 over the network 14 for respective ones of the tasks 16, in response to respective ones of the task master keys 24 and node specific data of the respective node pairs. For example, a task and node pair specific communication key is calculated by inputting the corresponding one of the task master keys 24 and the node specific data of the corresponding node pair into a suitable key calculation function or algorithm. The node specific data of each of the respective node pairs may include a respective random number and/or respective node pair address information. The task and node pair specific communication keys may be calculated using any suitable algorithm, such as HMAC-SHA or CMAC.
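The block 210 key calculation may be sketched as follows, using HMAC-SHA256 (one instance of the HMAC-SHA family named above; CMAC would be equally suitable). The function name and the concatenation encoding of the node specific data are illustrative assumptions, not a normative wire format.

```python
import hmac
import hashlib

def calc_comm_key(task_master_key, random_number, addr_a, addr_b):
    """Compute a task and node pair specific communication key.

    Inputs mirror block 210: the task master key plus node specific
    data (the per-connection random number and the address information
    of both nodes of the pair). The '|' separators are an assumed,
    purely illustrative encoding.
    """
    node_pair_data = random_number + b"|" + addr_a + b"|" + addr_b
    return hmac.new(task_master_key, node_pair_data, hashlib.sha256).digest()
```

Because the function is a deterministic pseudorandom function of its inputs, both nodes of a pair obtain the same key from the same master key and node pair data, while different tasks or different pairs obtain unrelated keys.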
For example, for task X between nodes A and B, the master key of task X and node specific data of nodes A and B (e.g., including random numbers generated by nodes A or B) are used to calculate the task and node pair specific communication keys, while for task Y between nodes C and B, the master key of task Y and node specific data of nodes C and B (e.g., including random numbers generated by nodes C or B) are used to calculate the task and node pair specific communication keys.
By way of another example, node a may send a request to node B to communicate securely for a given task. The request may include an index of the master key for the given task or an identity of the given task. The node B may then generate a random or pseudo-random number and respond to the request with the random number. Node a and node B may then calculate a task and node pair specific communication key based on the master key and the generated random number for the given task and optionally based on address information of node a and/or node B.
In some embodiments, the network interface controller 20 is configured to calculate the task and node pair specific communication keys in response to establishing a new connection with the other nodes 12 over the network 14, and thus for each new connection with a respective one of the nodes 12, the network interface controller 20 is configured to calculate the corresponding new task and node pair specific communication key.
The network interface controller 20 is configured to securely communicate (block 212) processed data of the tasks 16 (or data processed by other nodes 12) with respective ones of the nodes 12 over the network 14 in response to the respective task and node pair specific communication keys. For example, a task and node pair specific communication key A is used to communicate with node A for a given task, while a task and node pair specific communication key B is used to communicate with node B for the given task or a different task. By way of another example, node A may encrypt data of a given task using the calculated task and node pair specific communication key. The data is sent to node B, which decrypts the data using the calculated task and node pair specific communication key for the given task.
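A toy model of the block 212 exchange: both sides hold the same derived key, the sender encrypts, and the receiver applies the inverse operation. The XOR keystream below is purely illustrative and is not a secure cipher; a real NIC would use a vetted authenticated cipher (e.g., of the AES-GCM type) in hardware.

```python
import hashlib

def xor_keystream(key, data):
    """Illustrative only: derive a keystream from SHA-256(key || counter)
    and XOR it with the data. The operation is symmetric, so the same
    call both encrypts and decrypts."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))
```

With the same task and node pair specific communication key on both sides, node A's encryption and node B's decryption are the same keystream XOR, so the plaintext round-trips exactly.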
In some embodiments, a cryptographic state history need not be saved. The cryptographic state may include a replay window that may include data of the current connection, but need not include any historical data. The cryptographic state may hold the generated random number, a pointer to the task master key, and a reset replay window.
Referring now to fig. 3, fig. 3 is a process and information flow diagram 300 including steps in a method of operation of two nodes in the system 10 of fig. 1. Fig. 3 shows a process performed by a node pair comprising node a and node B and data communicated between node a and node B.
Node a is processing data for task X. Node a reserves (block 302) hardware resources in its network interface controller 20 to establish a connection (e.g., IB DC connection) and may optionally establish a state (e.g., IB DC state) to handle the connection. Node a sends (block 304) a connection request to node B. The connection request includes an index of the task master key 24 associated with task X or the Identification (ID) of task X. The node B receives the request and reserves (block 306) hardware resources in its network interface controller 20 to support the requested connection and may optionally establish a state (e.g., IB DC state) to handle the connection. As described in more detail with reference to fig. 5, node B may release reserved hardware resources if node a does not transmit relevant data within a given timeout. The node B generates (block 308) a random number (randomly or pseudo-randomly). Node B responds (block 309) to node a with the generated random number. Based on the task master key 24 of task X, the generated random number, and optionally the address information of node A and/or node B, node A and node B each generate (block 310) the same task and node pair specific communication keys. Node a encrypts the processed data for task X (block 312) and sends (block 314) the encrypted processed data to node B. The node B receives the encrypted processed data and decrypts the processed data (block 316). Once the communication between the nodes is complete, node a and node B release (block 318) the hardware resources, e.g., based on a notification from node a or node B or after a timeout of no communication.
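The Fig. 3 handshake (blocks 304-310) may be simulated end to end as follows. This is a sketch under stated assumptions: the function names are illustrative, the transport is elided, and `os.urandom` stands in for the NIC's random number generation of block 308.

```python
import hmac
import hashlib
import os

def derive_key(master_key, nonce, addr_a, addr_b):
    # Block 310: key calculation from the task master key, the random
    # number, and optionally address information (HMAC-SHA256 assumed).
    return hmac.new(master_key, nonce + addr_a + addr_b,
                    hashlib.sha256).digest()

def handshake(master_keys, task_id, addr_a, addr_b):
    """Simulate blocks 304-310: A sends the task identification (or
    master key index), B answers with a fresh random number, and each
    side derives the task and node pair specific key independently."""
    request = task_id                # block 304: A's connection request
    nonce = os.urandom(16)           # blocks 308-309: B's random number
    key_a = derive_key(master_keys[request], nonce, addr_a, addr_b)
    key_b = derive_key(master_keys[request], nonce, addr_a, addr_b)
    return key_a, key_b
```

Each invocation models one dynamic connection: the two sides always agree on the key, and a fresh nonce per connection yields a fresh key.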
Referring now to fig. 4, fig. 4 is a flowchart 400 including steps of a method of operation of node 12 in system 10 of fig. 1. Fig. 4 depicts one node setting up and tearing down two connections. The two connections may be active at the same time or at different times. The two connections may be for the same node pair or different node pairs.
The network interface controller 20 is configured to establish (block 402) a first connection (e.g., reserve hardware resources and establish state) with a given one of the nodes 12, optionally generate (block 404) a first random number (optionally in response to a first connection request from the given node 12) or receive the first random number from the given node 12, calculate (block 406) a first task and node pair specific communication key for the first connection in response to the first random number and/or address information of the node pair and a master key of the first task, securely communicate (block 408) with the given node 12 in response to the first task and node pair specific communication key, and tear down (block 410) the first connection once the communication is completed.
The network interface controller 20 is configured to establish (block 412) a second connection (e.g., reserve hardware resources and establish state) with a given one of the nodes 12, optionally generate (block 414) a second random number (in response to a second connection request from the given node 12) or receive the second random number from the given node 12, calculate (block 416) a second task and node pair specific communication key for the second connection in response to the second random number (different from the first random number) and/or address information of the node pair and a master key of the second task, securely communicate (block 418) with the given node 12 in response to the second task and node pair specific communication key, and tear down (block 420) the second connection once the communication is completed.
The secure communication may include one of the nodes encrypting data for transmission to the other node in the node pair for decryption (e.g., one-way secure communication) or both nodes encrypting data for transmission to the other node for decryption by the other node in the node pair (e.g., two-way secure communication).
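The per-connection key refresh of Fig. 4 implies that material protected under a torn-down connection is useless on a later connection. This may be sketched as follows, with HMAC-SHA256 standing in for both the key calculation and the per-message authentication; all function names are illustrative assumptions.

```python
import hmac
import hashlib

def connection_key(master_key, random_number):
    # Per-connection derivation: a fresh random number per connection
    # yields a fresh key, so nothing carries over between connections.
    return hmac.new(master_key, random_number, hashlib.sha256).digest()

def tag(key, message):
    # Authenticate a message under the connection key (illustrative).
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key, message, mac):
    # Constant-time comparison, as is standard for MAC verification.
    return hmac.compare_digest(tag(key, message), mac)
```

A message authenticated under the first connection's key fails verification under the second connection's key, which is the replay-protection property the per-connection refresh provides.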
Referring now to fig. 5, fig. 5 is a flow chart 500 including steps of a hardware resource reservation method in the system 10 of fig. 1.
In some embodiments, a timeout may be used to prevent connection establishment and resource reservation in the event that communication is not initiated. In this way, denial-of-service attacks by an attacker without access to the task master keys 24 may be blocked. Thus, if communication does not begin before the timeout expires, the connection is torn down and the hardware resources are released.
Thus, the network interface controller 20 of node A is configured to reserve (block 502) hardware resources in response to a request from node B to establish a connection with node A. At decision block 504, the network interface controller 20 of node A is configured to check whether data has been received from node B and successfully decrypted within a given timeout. If data has been received from node B and successfully decrypted within the given timeout, the network interface controller 20 is configured to finalize the allocation of the connection resources (block 506) (i.e., without further checking for a timeout). In response to not successfully decrypting data received from node B within the given timeout, the network interface controller 20 is configured to cancel the reservation of the reserved hardware resources (block 508).
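The reservation logic of blocks 502-508 may be modeled as follows. This is a software sketch only: the class name and dict bookkeeping are assumptions, and a real NIC would manage hardware connection contexts rather than Python structures.

```python
import time

class ReservationTable:
    """Sketch of Fig. 5: reserve resources on a connection request
    (block 502), finalize them on the first successful decryption
    (block 506), and cancel them after the timeout otherwise (block 508)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.pending = {}      # peer -> reservation deadline
        self.finalized = set() # peers whose allocation is finalized

    def reserve(self, peer, now=None):
        now = time.monotonic() if now is None else now
        self.pending[peer] = now + self.timeout_s          # block 502

    def on_decrypt_success(self, peer):
        if peer in self.pending:                           # block 506
            del self.pending[peer]
            self.finalized.add(peer)

    def expire(self, now=None):
        now = time.monotonic() if now is None else now
        for peer, deadline in list(self.pending.items()):
            if now >= deadline:                            # block 508
                del self.pending[peer]
```

An attacker without the task master key can never produce data that decrypts successfully, so its reservation is always reclaimed at the deadline, which is the denial-of-service mitigation described above.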
In other embodiments, no timeout is implemented in terms of reserving resources, and resource management may be handled in any suitable manner.
Various features of the invention which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination.
The embodiments described above are mentioned by way of example and the invention is not limited to what has been particularly shown and described hereinabove. Rather, the scope of the present invention includes both combinations and sub-combinations of the various features described hereinabove, as well as variations and modifications thereof which would occur to persons skilled in the art upon reading the foregoing description and which are not disclosed in the prior art.