CN118075198A - Method and equipment for planning path of high-speed network topology structure - Google Patents
Method and equipment for planning path of high-speed network topology structure
- Publication number
- CN118075198A CN118075198A CN202410184204.9A CN202410184204A CN118075198A CN 118075198 A CN118075198 A CN 118075198A CN 202410184204 A CN202410184204 A CN 202410184204A CN 118075198 A CN118075198 A CN 118075198A
- Authority
- CN
- China
- Prior art keywords
- path planning
- logic blocks
- path
- network
- logic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 40
- 238000004891 communication Methods 0.000 claims description 4
- 230000002093 peripheral effect Effects 0.000 claims description 4
- 238000002955 isolation Methods 0.000 claims description 3
- 238000000638 solvent extraction Methods 0.000 claims description 2
- 238000012549 training Methods 0.000 abstract description 8
- 238000004364 calculation method Methods 0.000 abstract description 3
- 238000010586 diagram Methods 0.000 description 10
- 238000004422 calculation algorithm Methods 0.000 description 9
- 230000001174 ascending effect Effects 0.000 description 5
- 238000011161 development Methods 0.000 description 5
- 238000012360 testing method Methods 0.000 description 5
- 230000006870 function Effects 0.000 description 4
- 230000008569 process Effects 0.000 description 4
- 230000009471 action Effects 0.000 description 3
- 230000005540 biological transmission Effects 0.000 description 3
- 238000013136 deep learning model Methods 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 230000008859 change Effects 0.000 description 2
- 238000013135 deep learning Methods 0.000 description 2
- 239000004744 fabric Substances 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000001133 acceleration Effects 0.000 description 1
- 230000003213 activating effect Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000007726 management method Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/48—Routing tree calculation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/02—Topology update or discovery
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/12—Shortest path evaluation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/38—Flow based routing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/12—Avoiding congestion; Recovering from congestion
- H04L47/125—Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention relates to a method and equipment for planning paths in a high-speed network topology. The method splits an overall non-standard physical network topology into fat-tree logic blocks and other tree-structured logic blocks, so that the topology characteristics are better exploited inside the fat-tree logic blocks and the network performance of the other tree-structured logic blocks is improved. Network congestion is thereby reduced as a whole, which addresses the problem that congestion may directly waste GPU computing power in large-scale cluster and distributed training scenarios. Overall throughput is improved, the network configuration is optimized, higher bandwidth is achieved, and the overall utilization efficiency of the cluster is significantly improved.
Description
Technical Field
The embodiment of the invention relates to the field of network communication, in particular to a method and equipment for planning a path of a high-speed network topology structure.
Background
High-performance computing power is an important tool for deep learning developers and researchers. Among the factors that affect the speed of deep learning training, the role of network transmission in accelerating training is easily overlooked; in large-scale cluster and distributed training scenarios in particular, network congestion may directly waste GPU computing power.
In practice, a fat-tree topology is commonly used for High Performance Computing (HPC) clusters and for clusters based on InfiniBand™ (IB) technology. It arranges the network into a hierarchical, multi-root tree of switches, with end nodes residing at the leaf switches. The switches are connected in tree form, the number of switches differs between layers, and the per-cable bandwidth between layers changes proportionally, but the total bandwidth of each layer is equal. In such a topology, congestion can be reduced to a large extent if a suitable routing scheme spreads traffic across the available paths so that the bisection bandwidth is fully used.
In this topology, when a packet is to be sent from one switch to another, there are often many possible paths; which one should be used? A transmission path must therefore be planned within the subnet so that, at every switch along the way from the source node to the destination node, the port leading to the next hop is known. However, conventional fat-tree routing algorithms suffer from several problems. On the one hand, they assume the same weight for all nodes in the network, whereas in practice some nodes may generate far more traffic than others. On the other hand, if the traffic of two heavily loaded nodes is routed through the same link, so that a large number of packets must all traverse that link at the same time, the resulting link sharing can still cause congestion.
Load-balancing the downlinks of a fat-tree structure can reduce such link sharing. In practice, however, many network topologies are not standard fat trees: a large cluster used for deep learning model training contains, in addition to computing and storage machines, test machines, development machines, machines that manage the cluster, and so on. These machines require far less network bandwidth and are typically attached to the core network through lower-bandwidth network cards or other topologies. Prior-art fat-tree path planning does not handle this case well: it cannot compute routes according to the application type of each machine, and it may assign two or more network cards to the same link, producing network congestion. How to reduce the occurrence and the scope of congestion in a non-standard fat-tree network topology through reasonable traffic path planning is therefore a problem to be solved.
Disclosure of Invention
To overcome these defects, the invention provides a method and equipment for planning paths in a high-speed network topology. By splitting the overall non-standard physical network topology into fat-tree logic blocks and other tree-structured logic blocks, the method better exploits the topology characteristics inside the fat-tree logic blocks, improves the network performance of the other tree-structured logic blocks, reduces network congestion as a whole, and addresses the problem that congestion may directly waste GPU computing power in large-scale cluster and distributed training scenarios.
The invention achieves this aim through the following technical scheme: a method for planning paths in a high-speed network topology, directed at path planning for an IB subnet whose overall topology is a non-standard fat tree, comprising the following steps:
(1) Dividing the whole subnet into a plurality of logic blocks, carving out one or more fat-tree topology logic blocks as far as possible and dividing the remaining part into one or more tree-structured logic blocks;
(2) Carrying out path planning inside each logic block, obtaining a path plan from each node in the logic block to any other node in the block;
(3) Carrying out path planning between the logic blocks, obtaining a forwarding path plan between any two logic blocks;
(4) Combining the path planning inside the logic blocks and between the logic blocks to obtain the whole path from any source node to any destination node;
(5) Constructing a routing table based on the whole path and issuing it to the switches in the subnet.
Preferably, the division into logic blocks in step (1) is performed only logically; it creates no physical isolation and requires no router at the top layer.
Preferably, in step (2), path planning inside each fat-tree topology logic block adopts a load-balancing scheme in which each end node is given a weight according to its bandwidth requirement; the fat tree is traversed upwards from the end node, the downstream ports of the upstream switch are sorted by weight and the port with the smallest weight is selected; the weight of the target end node is accumulated onto the assigned downstream port, and port assignment then continues tracing back upwards.
Preferably, for a port connected to another logic block through a connection line, the far end of that connection line is treated as a whole as a network card of a device attached at the edge of the logic block.
Preferably, when downstream port weights are equal, ports are selected in ascending order of their local identifiers.
Preferably, shortest-path planning is adopted for the tree-structured logic blocks in the subnet other than the fat-tree topology logic blocks.
Preferably, obtaining the forwarding path plan between logic blocks in step (3) specifically comprises:
(301) Enumerating any two logic blocks as a source logic block and a target logic block;
(302) Judging whether a path plan from the source logic block to the target logic block already exists; if not, entering step (303) and starting inter-block path planning; if a path plan already exists, entering step (305) and enumerating the next two logic blocks;
(303) Finding the shortest path from the source logic block to the target logic block;
(304) Tracing back upwards from the target logic block and computing the forwarding path rules between each pair of adjacent blocks until the source logic block is reached;
(305) Selecting two logic blocks from the remaining logic blocks and repeating from step (301) until the path planning between all logic blocks is completed.
Preferably, subnet path planning is implemented by maintaining a routing forwarding table inside each switch; the forwarding table of each switch contains an entry for every node, each entry recording the address of a target node and the forwarding port corresponding to the next hop, so that constructing the path plan of the whole subnet means constructing the routing forwarding table of every switch in the subnet.
Preferably, each time the network topology is updated, the overall network routing plan is performed once, and all the switch routing tables in the subnet are updated.
An electronic device for implementing the method of any of the above aspects, comprising a memory, a processor, a bus, a network interface and other peripheral interfaces;
The memory, the processor, the network interface and the other peripheral interfaces are connected through a communication bus, and the processor implements the steps of any of the above methods when executing the stored program.
The beneficial effects of the invention are as follows. For an IB network whose subnet is a large-scale cluster for deep learning model training, containing multiple types of network equipment such as computing nodes, storage nodes, and test, development and management machines, the invention splits the overall non-standard physical network topology into fat-tree topology logic blocks and tree-structured logic blocks and plans paths inside the logic blocks and between the logic blocks separately. This resolves the difficulty of achieving load balance across the whole network with a single fat-tree scheme and of exploiting the characteristics of the network topology. By splitting the overall non-standard physical network topology into fat-tree logic blocks and other tree-structured logic blocks, the topology characteristics are better exploited inside the fat-tree logic blocks, the network performance of the other tree-structured logic blocks is improved, network congestion is reduced as a whole, overall throughput is improved, the network configuration is optimized, higher bandwidth is achieved, and the overall utilization efficiency of the cluster is significantly improved.
Drawings
Fig. 1 is an entity diagram according to a first embodiment of the present disclosure.
Fig. 2 is a schematic diagram of a standard fat tree topology.
Fig. 3 is a schematic diagram of a fat tree topology in a network environment according to an embodiment of the invention.
Fig. 4 is a method of path planning between logic blocks in accordance with an embodiment of the present disclosure.
Fig. 5 is a schematic diagram of logical partitioning in a subnet according to an embodiment of the invention.
Fig. 6 is a forwarding path planning method between neighboring blocks according to an embodiment of the present invention.
Fig. 7 is a schematic diagram of forwarding path planning between neighboring blocks according to an embodiment of the present invention.
Detailed Description
The following describes the embodiments of the present invention clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all of them. The components of the embodiments of the present invention, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations.
The following description of the invention uses InfiniBand (IB) networks as examples of high performance networks. It will be clear to those skilled in the art that other types of high performance networks may be used without limitation. It will be clear to a person skilled in the art that other types of fabric topologies may be used without limitation.
Example 1: Fig. 1 is a diagram of a first embodiment of the disclosure, which provides a method of high-speed network topology path planning for an IB subnet whose overall topology is a non-standard fat tree.
The method comprises the following steps:
Step 1: dividing the whole subnet into a plurality of logic blocks, carving out one or more fat-tree topology logic blocks as far as possible and dividing the remaining part into one or more tree-structured logic blocks.
Step 2: carrying out path planning inside each logic block, obtaining a path plan from each node in the logic block to any other node in the block.
Step 3: carrying out path planning between the logic blocks, obtaining a forwarding path plan between any two logic blocks.
Step 4: combining the path planning inside the logic blocks and between the logic blocks to obtain the whole path from any source node to any destination node.
Step 5: constructing a routing table based on the whole path and issuing it to the switches in the subnet.
The present invention will be described in more detail with reference to the following examples.
The method comprises the following steps:
Step 1: dividing the whole subnet into a plurality of logic blocks, carving out one or more fat-tree topology logic blocks as far as possible and dividing the remaining part into one or more tree-structured logic blocks.
In this embodiment, load balancing is performed inside the fat tree so that the characteristics of the fat-tree topology are well exploited. Dividing the subnet into as many fat-tree topology blocks as possible maximizes this benefit. The remaining portion is divided into one or more tree-structured logic blocks. Different path planning strategies are applied to the different types of logic blocks, so that the topology characteristics of the subnet are better exploited as a whole.
InfiniBand (IB) is an open-standard lossless network technology developed by the InfiniBand Trade Association. The technology is based on a serial point-to-point full-duplex interconnect architecture that provides high-throughput and low-latency communication, and is particularly suited to High Performance Computing (HPC) cluster applications and data centers. Within an IB subnet, host nodes are connected using switches and point-to-point links, and a Subnet Manager (SM) resides on a designated subnet device. The subnet manager is responsible for configuring, activating and maintaining the IB subnet, and may also be responsible for performing routing table calculations in the IB fabric.
The fat-tree topology (fat tree), shown in fig. 2, is a network topology commonly used for large AI clusters today. A plurality of switches are connected in a tree; different layers have different numbers of switches, and the per-cable bandwidth between layers changes proportionally, but the total bandwidth of each layer is equal. It is a non-blocking topology that can maintain full bisection bandwidth and thus potentially avoid congestion. In practice, to realize a larger bandwidth, a fat-tree topology generally has several connection lines between two nodes; each line may be a physical network cable, and several cables are combined and used as one large logical link, while a large switch is likewise decomposed into multiple small switches.
In practice, however, many network topologies are not standard fat trees. A large cluster used for deep learning model training may contain, in addition to computing and storage machines, test machines, development machines, machines that manage the cluster, and so on. These machines require far less network bandwidth and are typically attached to the core network through lower-bandwidth network cards or by other means. IB does not support such non-standard topologies well: fat-tree path planning cannot handle them properly because it does not compute routes according to the application type of each machine. If the total bandwidth of a machine's network cards exceeds the bisection bandwidth of the fat tree, the routing table produced by the fat-tree routing algorithm often assigns two network cards to the same link at random; if both happen to be heavily loaded network cards, congestion results.
In this embodiment, the lowest level of the IB network architecture is referred to as a subnet. As shown in fig. 3, a subnet contains switches and a series of host devices connected point-to-point at the edge of the subnet, including but not limited to computing machines, storage machines, test computers, development computers, computers that manage the cluster, and so on. A switch has a plurality of switch ports, and a host device may have one or more network cards. A switch port is connected to another switch port, or to the network card of a host device, by a connection line, which may be a physical network cable. For ease of understanding, during packet transmission, forwarding and reception both switches and network cards can be regarded as nodes of the subnet; a network card on a device is typically attached at the edge of the subnet and is therefore also referred to as an end node.
In this embodiment, the IB subnet may also include at least one Subnet Manager (SM) responsible for initializing and starting up the network, including the configuration of all switches, routers and Host Channel Adapters (HCAs) within the subnet. Host devices and switches in the subnetwork may be addressed using a specified Local Identifier (LID).
Subnet path planning is accomplished by maintaining routing forwarding tables inside the switches. The forwarding table of each switch contains an entry for every node, recording the address of the target node and the forwarding port corresponding to the next hop. Constructing the path plan of the whole subnet therefore means constructing the routing forwarding table of every switch in the subnet. Because an entry records only the address of the target node and its corresponding forwarding port, only the target node is known when route planning is performed; path planning is therefore usually traced upwards from the target node.
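As an illustration only (not part of the original disclosure), such a per-switch forwarding table can be pictured as a mapping from a destination node's address (LID) to the egress port leading to the next hop. The switch names, LID values and port numbers in the following Python sketch are assumptions made up for the example:

```python
# Minimal sketch of a per-switch routing forwarding table: each switch maps
# the LID of a destination node to the local port that leads to the next hop.
# Switch names, LIDs and port numbers are illustrative only.

forwarding_tables = {
    "switch_A0": {17: 1, 18: 1, 33: 3},   # destination LID -> egress port
    "switch_A1": {17: 2, 18: 4, 33: 7},
}

def next_hop_port(switch: str, dest_lid: int) -> int:
    """Return the port this switch uses to forward a packet destined for dest_lid."""
    return forwarding_tables[switch][dest_lid]

print(next_hop_port("switch_A0", 33))   # -> 3
```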
The subnet is divided into a plurality of logic blocks: the parts conforming to a fat-tree topology are separated from the remaining parts of the topology, with one or more fat-tree topology logic blocks carved out as far as possible and the remainder divided into one or more tree-structured logic blocks. Note that the division into logic blocks is only logical; it creates no physical isolation and requires no router at the top level.
Step 2: carrying out path planning inside each logic block, obtaining a path plan from each node in the logic block to any other node in the block.
In this embodiment, path planning must be carried out separately inside each logic block, and a suitable planning strategy is chosen for each type of block; this answers the question of which connection line a path from any node to a destination node inside the block should use. A connection line may be a physical network cable; several connection lines may exist between two adjacent switches, and the two ends of each line are connected to two ports.
In this embodiment, once path plans between all nodes in a logic block have been obtained, the routing forwarding table of every switch in the block contains a forwarding port entry for every node of the block as a target node.
In this embodiment, path planning inside a fat-tree topology logic block uses the traditional fat-tree algorithm improved with load balancing. Specifically, each end node is given a weight according to its bandwidth requirement. The fat tree is traversed upwards from the end node, the downstream ports of the upstream switch are sorted by weight, and the port with the smallest weight is selected. The weight of the target end node is then added to the assigned downstream port. Port allocation continues tracing back upwards until the ports of the switches at the top of the fat tree have been allocated, after which the next node is processed.
In this embodiment, connection ports are given different weights according to the type of device behind the end node and its bandwidth requirement. Devices that participate in model training and have high bandwidth requirements, such as computing and storage machines, may be given a high weight, for example 2. Devices with moderate bandwidth requirements, such as test machines, may be given a medium weight, for example 1. Development computers, cluster-management computers and other devices with little bandwidth requirement are given a low weight, for example 0.
In this embodiment, in order to better balance the load between logic blocks, a port connected to another logic block through a connection line is handled, when the block's internal routing forwarding table is computed, by treating the far end of that connection line as a whole as a network card of a device attached at the edge of the block. In other words, when weights are assigned, the bandwidth needed for connections to other logic blocks must be taken into account; for ease of implementation, the far end of such a connection can simply be regarded as a single network card.
In this embodiment, each end node is connected to a switch port by a wire. Because of the IB network rule restrictions, only the target node is known when route planning is performed, and thus path planning typically traverses the tree up from the target node.
In this embodiment, when downstream port weights are equal, ports may be selected in ascending order of their Local Identifiers (LIDs), as in the conventional fat-tree routing algorithm: the equally weighted downstream ports are ordered by LID within the logic block and chosen in turn.
This method resolves the problems of unbalanced link load and contention for a single node's uplink in the fat-tree structure, and makes full use of the characteristics of the network topology.
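The load-balanced downstream-port selection described above can be sketched as follows. The weight values (2, 1, 0 by device type) mirror the example given above in this embodiment, while the data structure and function name are assumptions chosen only for illustration:

```python
# Sketch of load-balanced downstream-port assignment inside a fat-tree logic
# block: ports are chosen by smallest accumulated weight, ties broken by
# ascending local identifier (LID). Names and data are illustrative only.

DEVICE_WEIGHT = {"compute": 2, "storage": 2, "test": 1, "dev": 0, "mgmt": 0}

def assign_downstream_port(down_ports, end_node_type):
    """down_ports maps port LID -> accumulated weight; mutated in place."""
    # Choose the port with the smallest accumulated weight; ties are broken
    # by ascending LID, as in the classical fat-tree routing algorithm.
    chosen = min(down_ports, key=lambda lid: (down_ports[lid], lid))
    down_ports[chosen] += DEVICE_WEIGHT[end_node_type]   # accumulate the node's weight
    return chosen

ports = {5: 0, 3: 0, 7: 0}                       # three downstream ports, unloaded
print(assign_downstream_port(ports, "compute"))  # -> 3 (tie broken by lowest LID)
print(assign_downstream_port(ports, "storage"))  # -> 5 (port 3 now carries weight 2)
```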
In this embodiment, shortest-path planning is adopted for the tree-structured logic blocks in the subnet other than the fat-tree topology logic blocks. Inside these non-fat-tree blocks the switches and end nodes are connected in a tree structure, and tree topologies are conventionally routed along the shortest path, because exactly one path exists between any two nodes of a tree. The shortest path is determined, and the route from the source node to the target node simply follows it. Shortest-path planning for a tree-structured topology is prior art to those skilled in the art and is not described in detail here.
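Because a tree contains exactly one path between any two nodes, the forwarding entries toward a target inside such a block can be derived with a simple breadth-first traversal from the target. The following Python sketch is an assumption for illustration and does not reproduce any specific prior-art implementation:

```python
# Sketch: shortest-path forwarding entries inside a tree-structured logic
# block. A breadth-first traversal from the target yields, for every other
# node, the neighbour it should forward to (its next hop toward the target).
from collections import deque

def tree_forwarding_entries(adjacency, target):
    """adjacency: node -> list of neighbours; returns node -> next hop toward target."""
    next_hop = {}
    queue = deque([target])
    visited = {target}
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                next_hop[neighbour] = node   # first node found on the way back to target
                queue.append(neighbour)
    return next_hop

tree = {"s0": ["s1", "s2"], "s1": ["s0", "y"], "s2": ["s0"], "y": ["s1"]}
print(tree_forwarding_entries(tree, "y"))  # -> {'s1': 'y', 's0': 's1', 's2': 's0'}
```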
In this embodiment, several connection lines may exist between adjacent switches, each line connecting one port on each of the two switches; a connection line may be a physical network cable. The downstream ports of the upstream switch are allocated to the end nodes in turn according to an allocation rule: they may be allocated in ascending order of weight to achieve load balancing, or in ascending order of Local Identifier (LID) as in the classical routing algorithm.
Thus, both path planning and forwarding rules within the logic block are determined.
Step 3: carrying out path planning between the logic blocks, obtaining a forwarding path plan between any two logic blocks.
In this embodiment, after completing the path planning in the logic blocks, the path planning between the logic blocks needs to be completed, and fig. 4 shows a path planning method between the logic blocks according to another embodiment of the present invention, which specifically includes the following steps:
301, enumerating any two logic blocks, namely a source logic block and a target logic block.
302, determining whether an inter-block path plan from the source logic block to the target logic block already exists; if not, entering step 303 and starting the inter-block path planning; if a path has already been planned, entering step 305 and enumerating the next two logic blocks.
Inter-block path planning produces the forwarding path rules between any node in the source logic block and any node in the target logic block. The plan involves a source node belonging to the source logic block and a target node belonging to the target logic block. As shown in fig. 5, the source node x is a node in logic block A and the target node y is a node in logic block E; assume that a path from logic block A, where the source node x is located, to logic block E, where the target node y is located, needs to be planned. Because of the IB network rules, only the target node is known when route planning is performed, so the path planning typically starts from the target logic block E where the target node y is located and traces back.
In this embodiment, it is first determined whether a path plan already exists from logic block A, where the source node x is located, to logic block E, where the target node y is located. Note that, because the IB rules dictate that the routing forwarding table records only the destination node and the forwarding port for the next hop, swapping the destination and source nodes yields a new, independent path plan: the plan from source node x in logic block A to destination node y in logic block E and the plan from source node y in logic block E to destination node x in logic block A are two different path plans.
303 Find the shortest path from the source logical block to the target logical block.
In this embodiment, the shortest path from source logic block A, where the source node x is located, to logic block E, where the target node y is located, needs to be found.
In this embodiment, for a subnet composed of several logic blocks, each logic block can be regarded as a single node, so that the subnet becomes a mesh topology composed of these nodes, as shown in fig. 5. Although the logic blocks are connected by several connection lines, for the purpose of the shortest-path computation these are simplified into a single edge. The shortest path from logic block A to logic block E, determined by a shortest-path algorithm, forwards through logic block C, i.e. the route A-C-E.
For those skilled in the art there are many ways to implement a shortest-path algorithm over a mesh topology, including but not limited to the Floyd-Warshall algorithm.
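As an illustration, treating each logic block as one node, the block-level shortest path (e.g. A-C-E in fig. 5) can be found with the Floyd-Warshall algorithm. The block graph below only loosely resembles fig. 5 and is assumed for the example:

```python
# Sketch: shortest path between logic blocks, treating each block as one node
# of a mesh and each bundle of inter-block connections as a single unit-cost edge.
INF = float("inf")

def floyd_warshall(nodes, edges):
    dist = {u: {v: (0 if u == v else INF) for v in nodes} for u in nodes}
    nxt = {u: {v: None for v in nodes} for u in nodes}
    for u, v in edges:                      # undirected, unit-cost edges
        dist[u][v] = dist[v][u] = 1
        nxt[u][v], nxt[v][u] = v, u
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt

def block_path(nxt, src, dst):
    path = [src]
    while path[-1] != dst:
        path.append(nxt[path[-1]][dst])
    return path

blocks = ["A", "B", "C", "D", "E"]
links = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("C", "E"), ("D", "E")]
_, nxt = floyd_warshall(blocks, links)
print(block_path(nxt, "A", "E"))  # -> ['A', 'C', 'E']
```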
304 Trace back from the target logical block, calculate the forwarding path rules between each neighboring block until tracing back to the source logical block.
In this embodiment, tracing back from the target node y first yields the forwarding path rules from the preceding logic block C to the target node y, and then the forwarding path rules from logic block A to logic block C. Fig. 6 illustrates, for one embodiment, how the path planning traces back from the target logic block E to logic block C; the specific method comprises:
3041 enumerating any node in the target logical block as the target node.
In this embodiment, the target logic block E contains the target node y, and the next forwarding hop must be found for every switch in logic block C on paths whose destination is y.
3042, allocating a connection line from the previous logic block to the target logic block based on the shortest path determined between the logic blocks.
In this embodiment, more than one connection exists between the logic blocks, so to plan the path between the previous logic block and the target logic block on the shortest path, a connection between the two blocks must first be chosen. Once the connection into the target logic block is determined, the path from any switch of the target logic block to the target node is obtained from the intra-block path plan.
In this embodiment, as shown in fig. 7, logic block E is the target logic block, and y in logic block E is the target node; the target node y accesses the subnet through switch E0. The shortest path between the logic blocks determined above is A-C-E, so a connection from logic block C to the target logic block E must be allocated. Several connections may exist from the previous logic block C to the target logic block E, attached to different ports of different switches or to different ports of the same switch. As shown in fig. 7, logic block C and logic block E may be connected through connection 701 from port C101 of switch C1 in logic block C to port E101 of switch E1 in logic block E, through connection 702 from port C201 of switch C2 to port E102 of switch E1, through connection 703 from port C102 of switch C1 to port E201 of switch E2, through connection 704 from port C202 of switch C2 to port E202 of switch E2, or through connections 705 and 706.
The allocation may follow LID order, or weights may be given to the connection lines to take load balancing into account. In this embodiment, one connection line is chosen from the connections 701-704 between logic block C and logic block E as the path toward the target node y; it may be allocated in order of connection-line LID, or with load balancing across the connection lines.
In this embodiment, the path planning between every pair of switches inside logic block E was already completed during the intra-block planning of step 2; that is, once the connection line is determined, the path to switch E0 and the target node y can be realized regardless of whether packets enter through switch E1, switch E2 or any other switch of logic block E connected to logic block C.
3043, according to the allocated connection line, the switch port of the previous logic block attached to that connection line is taken as the egress port.
In this embodiment, the egress port of logic block C is determined by the chosen connection among 701-704 between logic block C and logic block E. For example, if connection 701 is allocated, port C101 of switch C1 in logic block C becomes the egress port of the path within logic block C.
3044, enumerating each switch in the previous logic block and determining its intra-block path to the egress port.
In this embodiment, according to the path planning in the logic block in the foregoing step 2, the path from any switch to the egress port may be obtained.
3045, in the switch routing forwarding entries, substituting the target node for the egress port as the destination, so that the path plan from any switch in the previous logic block to the target node is obtained.
3046, tracing back in the same way to obtain the path planning from each earlier logic block to the target node.
In this way, in this embodiment, the path plan from any node in the source logic block to the target node can be obtained by tracing back from the target logic block; a short sketch of this back-trace is given after step 305 below.
305, selecting two logic blocks from the remaining logic blocks for inter-block path planning and repeating from 301 until the path planning between all logic blocks is completed.
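As a rough sketch of the back-trace in steps 3041-3046 above: inside the previous logic block, the already computed intra-block entries toward the chosen egress port are re-labelled with the target node as the destination. The switch names, port numbers and dictionaries below loosely follow fig. 7 but are otherwise assumptions made for illustration:

```python
# Sketch of steps 3041-3046: inside the previous block (C), reuse the already
# computed intra-block entries toward the chosen egress port and re-label them
# with the target node as the destination (step 3045). Names loosely follow fig. 7.

# Intra-block plan of block C: switch -> {destination -> egress port}.
# "C1.p101" stands for port 101 on switch C1, which connects to block E.
intra_block_c = {
    "C1": {"C1.p101": 101, "C2": 3},
    "C2": {"C1.p101": 1, "C1": 1},
    "C0": {"C1.p101": 2, "C1": 2, "C2": 4},
}

def entries_for_target(intra_plan, egress_port, target_node):
    """Forward target_node the same way each switch already forwards traffic
    toward the chosen egress port."""
    return {switch: {target_node: entries[egress_port]}
            for switch, entries in intra_plan.items()}

# Connection 701 (port C1.p101 -> E1.p101) was allocated toward block E,
# so block C forwards packets destined for node y through that egress port.
print(entries_for_target(intra_block_c, "C1.p101", "y"))
# -> {'C1': {'y': 101}, 'C2': {'y': 1}, 'C0': {'y': 2}}
```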
Step 4: combining the path planning inside the logic blocks and between the logic blocks to obtain the whole path from any source node to any destination node.
Step 5: constructing a routing table based on the whole path and issuing it to the switches in the subnet.
In this embodiment, once the combined intra-block and inter-block path planning has been obtained, the routing table can be constructed, yielding forwarding entries for all switches in the subnet; any switch that receives a data packet can then look up the forwarding port of the next hop in its own entries, thereby realizing data forwarding from the source node to the target node.
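Combining the intra-block and inter-block plans into the final per-switch routing tables can be pictured as a straightforward merge of forwarding entries, as in the following sketch; all identifiers are assumed for illustration only:

```python
# Sketch of steps 4-5: merge intra-block and inter-block forwarding entries
# into one routing table per switch; the subnet manager would then issue the
# result to the switches. Entries and names are illustrative only.

intra_entries = {"C1": {"C0": 5, "C2": 3}, "C0": {"C1": 2, "C2": 4}}
inter_entries = {"C1": {"y": 101}, "C0": {"y": 2}}

def build_routing_tables(intra, inter):
    tables = {}
    for switch in sorted(set(intra) | set(inter)):
        # Each destination appears in exactly one of the two plans, so the
        # inter-block entries simply extend the intra-block ones.
        tables[switch] = {**intra.get(switch, {}), **inter.get(switch, {})}
    return tables

print(build_routing_tables(intra_entries, inter_entries))
# -> {'C0': {'C1': 2, 'C2': 4, 'y': 2}, 'C1': {'C0': 5, 'C2': 3, 'y': 101}}
```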
In this embodiment, the routing tables constructed from the whole paths are issued by the Subnet Manager (SM) to the switches inside the subnet.
It should be noted that every time the network topology changes, including but not limited to shutdown of a host device, failure of a host or switch, addition or removal of hosts and switches, and other topology changes, the overall network route planning must be performed again and all switch routing tables in the subnet updated.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative, for example, of the flowcharts and block diagrams in the figures that illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present invention may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes. It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
The foregoing is considered as illustrative of the principles of the present invention, and has been described herein before with reference to the accompanying drawings, in which the invention is not limited to the specific embodiments shown.
Claims (10)
1. A method for planning paths in a high-speed network topology, characterized in that it is directed at path planning for an IB subnet whose overall topology is a non-standard fat tree, and comprises the following steps:
(1) Dividing the whole subnet into a plurality of logic blocks, carving out one or more fat-tree topology logic blocks as far as possible and dividing the remaining part into one or more tree-structured logic blocks;
(2) Carrying out path planning inside each logic block, obtaining a path plan from each node in the logic block to any other node in the block;
(3) Carrying out path planning between the logic blocks, obtaining a forwarding path plan between any two logic blocks;
(4) Combining the path planning inside the logic blocks and between the logic blocks to obtain the whole path from any source node to any destination node;
(5) Constructing a routing table based on the whole path and issuing it to the switches in the subnet.
2. The method of claim 1, wherein the partitioning of the logic blocks in step (1) is performed only logically, and does not physically create isolation, nor does it require routers at the top level.
3. The method for planning paths in a high-speed network topology according to claim 1 or 2, wherein in step (2) the path planning inside each fat-tree topology logic block adopts a load-balancing scheme in which each end node is given a weight according to its bandwidth requirement; the fat tree is traversed upwards from the end node, the downstream ports of the upstream switch are sorted by weight and the port with the smallest weight is selected; the weight of the target end node is accumulated onto the assigned downstream port, and port assignment then continues tracing back upwards.
4. The method for planning paths in a high-speed network topology according to any one of claims 1-3, wherein, for a port connected to another logic block through a connection line, the far end of that connection line is treated as a whole as a network card of a device attached at the edge of the logic block.
5. The method for planning paths in a high-speed network topology according to any one of claims 1-4, wherein, when downstream port weights are equal, ports are selected in ascending order of their local identifiers.
6. The method for planning paths in a high-speed network topology according to any one of claims 1-5, wherein shortest-path planning is adopted for the tree-structured logic blocks in the subnet other than the fat-tree topology logic blocks.
7. The method for planning paths in a high-speed network topology according to any one of claims 1-6, wherein obtaining the forwarding path plan between logic blocks in step (3) specifically comprises:
(301) Enumerating any two logic blocks as a source logic block and a target logic block;
(302) Judging whether a path plan from the source logic block to the target logic block already exists; if not, entering step (303) and starting inter-block path planning; if a path plan already exists, entering step (305) and enumerating the next two logic blocks;
(303) Finding the shortest path from the source logic block to the target logic block;
(304) Tracing back upwards from the target logic block and computing the forwarding path rules between each pair of adjacent blocks until the source logic block is reached;
(305) Selecting two logic blocks from the remaining logic blocks and repeating from step (301) until the path planning between all logic blocks is completed.
8. The method according to any one of claims 1-7, wherein subnet path planning is implemented by maintaining routing forwarding tables inside the switches; the forwarding table of each switch contains an entry for every node, each entry recording the address of a target node and the forwarding port corresponding to the next hop, so that constructing the path plan of the whole subnet means constructing the routing forwarding table of every switch inside the subnet.
9. The method of claim 8, wherein each time the network topology is updated, the overall network routing is performed once and all switch routing tables in the subnetwork are updated.
10. An electronic device for implementing the method of any one of claims 1-9, comprising a memory, a processor, a bus, a network interface and other peripheral interfaces;
the memory, the processor, the network interface and the other peripheral interfaces are connected through a communication bus, and the processor implements the steps of any one of claims 1-9 when executing the stored program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410184204.9A CN118075198A (en) | 2024-02-19 | 2024-02-19 | Method and equipment for planning path of high-speed network topology structure |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410184204.9A CN118075198A (en) | 2024-02-19 | 2024-02-19 | Method and equipment for planning path of high-speed network topology structure |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118075198A true CN118075198A (en) | 2024-05-24 |
Family
ID=91108873
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410184204.9A Pending CN118075198A (en) | 2024-02-19 | 2024-02-19 | Method and equipment for planning path of high-speed network topology structure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118075198A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118714020A (en) * | 2024-08-21 | 2024-09-27 | 浙江大学 | A computer network topology structure and computing domain division method, and a computer system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10757022B2 (en) | Increasingly minimal bias routing | |
Alicherry et al. | Network aware resource allocation in distributed clouds | |
CN102282810B (en) | Load balancing | |
Carpio et al. | Balancing the migration of virtual network functions with replications in data centers | |
US8855116B2 (en) | Virtual local area network state processing in a layer 2 ethernet switch | |
Guay et al. | vFtree-A fat-tree routing algorithm using virtual lanes to alleviate congestion | |
US9083626B1 (en) | System and method for reducing hardware table resources in a multi-stage network device | |
US5617413A (en) | Scalable wrap-around shuffle exchange network with deflection routing | |
US11005724B1 (en) | Network topology having minimal number of long connections among groups of network elements | |
CN103346967B (en) | A kind of data center network topology structure and method for routing thereof | |
CN105471747B (en) | A kind of intelligent router route selecting method and device | |
CN109347657B (en) | Method for constructing virtual data domain of scientific and technological service under SDN mode | |
Skeie et al. | LASH-TOR: A generic transition-oriented routing algorithm | |
Cheng et al. | NAMP: Network-aware multipathing in software-defined data center networks | |
JP2017500816A (en) | Traffic engineering for large data center networks | |
CN102185772A (en) | Method for routing data centre network system | |
Ferraz et al. | A two-phase multipathing scheme based on genetic algorithm for data center networking | |
Zahid et al. | Efficient network isolation and load balancing in multi-tenant HPC clusters | |
CN118075198A (en) | Method and equipment for planning path of high-speed network topology structure | |
Chung et al. | Dynamic parallel flow algorithms with centralized scheduling for load balancing in cloud data center networks | |
Zahid et al. | Partition-aware routing to improve network isolation in infiniband based multi-tenant clusters | |
US20240154906A1 (en) | Creation of cyclic dragonfly and megafly cable patterns | |
Castillo | Various Network Topologies and an Analysis Comparative Between Fat-Tree and BCube for a Data Center Network: An Overview | |
US7336658B2 (en) | Methods and system of virtual circuit identification based on bit permutation of link numbers for multi-stage elements | |
CN118250215A (en) | Method and equipment for planning paths between logic blocks of network topology structure |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||