CN114721828A - Distributed reconstruction resource management method based on cloud equipment - Google Patents

Info

Publication number
CN114721828A
Authority
CN
China
Prior art keywords
node
reconstruction
nodes
distributed
resource information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210379557.5A
Other languages
Chinese (zh)
Other versions
CN114721828B (en)
Inventor
张悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Sinogram Medical Technology Co ltd
Original Assignee
Jiangsu Sinogram Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Sinogram Medical Technology Co ltd
Priority to CN202210379557.5A
Publication of CN114721828A
Application granted
Publication of CN114721828B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer And Data Communications (AREA)
  • Multi Processors (AREA)

Abstract

The invention relates to a distributed reconstruction resource management method based on cloud equipment, applied to a distributed reconstruction system that reconstructs medical images for medical imaging equipment. The distributed reconstruction system is configured with a plurality of nodes that implement a data reconstruction program; all nodes are located in one broadcast domain of the distributed reconstruction system and periodically broadcast their available resource information. The method comprises the following steps: the first node and every other node in the broadcast domain update their respective available resource information in real time; if the first node in the broadcast domain receives a local reconstruction task, it splits the data of the local reconstruction task and creates threads that process the split data, according to the resource allocation strategy, the current available resource information of the first node, and the available resource information of the other nodes in the broadcast domain, so that the local reconstruction task is completed in a distributed manner. The method improves data processing speed and addresses the problem of unbalanced computing resources.

Description

A distributed reconstruction resource management method based on cloud equipment

Technical Field

The present invention relates to the technical field of distributed reconstruction, and in particular to a distributed reconstruction resource management method based on cloud equipment.

Background

PET-CT is a nuclear medicine imaging device that combines PET and CT. PET (Positron Emission Tomography) acquires PET sequences that provide functional imaging; CT (X-ray Computed Tomography) acquires CT sequences that provide structural imaging. After PET data acquisition is complete, the algorithm module reconstructs the raw PET data to generate PET images. During reconstruction, the CT images are also used to perform attenuation correction for the PET image reconstruction.

The raw data obtained during a PET scan is transmitted in real time to the PET reconstruction workstation, which stores the raw PET data and the various system files required for reconstruction, and reconstructs the raw PET data into images. Take a fairly common hardware configuration as an example: an Intel Xeon W series 12-core CPU with 32 GB of memory. On this configuration, the data from a typical 2-minute torso scan, reconstructed with OSEM (ordered subset expectation maximization) plus TOF (time of flight) using 2 iterations and 10 subsets, takes about 2 minutes to reconstruct. Moreover, because the hardware resources of the PET reconstruction workstation are limited, it can process only one reconstruction at a time. If a scan contains multiple reconstructions, or the user needs to run multiple post-reconstructions on existing data during a scan, the reconstructions that cannot start must wait in a queue, and this wait often lasts a long time. In recent years, as reconstruction algorithms have advanced, various advanced algorithms have emerged, such as image optimization algorithms based on deep learning. Because their internal computation is more complex, these advanced algorithms take more time on the same hardware.

To shorten reconstruction time and improve reconstruction efficiency, how to use cloud computing for remote reconstruction has become a technical problem in urgent need of a solution.

Summary of the Invention

(1) Technical problem to be solved

In view of the above shortcomings and deficiencies of the prior art, the present invention provides a distributed reconstruction resource management method based on cloud equipment, which shortens reconstruction time and improves reconstruction efficiency in cloud reconstruction and solves the technical problem of unbalanced computing resources.

(2) Technical solution

To achieve the above purpose, the main technical solutions adopted by the present invention include:

In a first aspect, an embodiment of the present invention provides a distributed reconstruction resource management method based on cloud equipment. The method is applied to a distributed reconstruction system used for medical image reconstruction in medical imaging equipment. The distributed reconstruction system is configured with a plurality of nodes that implement a data reconstruction program; all nodes are located in one broadcast domain of the distributed reconstruction system and periodically broadcast their available resource information. The method includes:

S10. The first node periodically acquires its own available resource information and broadcasts it to the other nodes in the broadcast domain, and receives the available resource information broadcast by each node in the broadcast domain;

S20. The first node determines whether it has received a local reconstruction task sent by the imaging device connected to the first node;

S30. If the first node receives a local reconstruction task, then according to the resource allocation strategy, the current available resource information of the first node, and the available resource information of the other nodes in the broadcast domain, it splits the data of the local reconstruction task and creates threads that process the split data, so that the local reconstruction task is completed in a distributed manner;

The resource allocation strategy is used to determine, according to the processing information of the local reconstruction task, one or more nodes capable of processing the local reconstruction task, and to split the local reconstruction task according to the selected nodes and the processing efficiency and available resource information of each selected node.

Optionally, the method further includes:

After every node that processes the local reconstruction task finishes processing the data allocated to it, the node updates its internal available resource information and broadcasts it;

Wherein, when the first node determines the one or more nodes capable of processing the local reconstruction task, the first node has the highest priority.

Optionally, each node maintains a data structure holding the available resource information of every node in the broadcast domain, and a node efficiency information table for every node;

Wherein, the entries in the node efficiency information table stored in the first node are the processing efficiencies of each node, calculated by the first node from the data packets each node processed within a specified time period.
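
As a non-limiting illustration, the node efficiency information table described above might be sketched in Python as follows. The class name `EfficiencyTable`, the sliding-window bookkeeping, and the choice of bytes per second as the efficiency unit are assumptions made for this sketch; the text only states that efficiency is calculated from the data packets each node processed within a specified time period.

```python
from collections import defaultdict, deque


class EfficiencyTable:
    """Hypothetical per-node efficiency table kept by the first node."""

    def __init__(self, window_s: float):
        self.window_s = window_s              # the "specified time period"
        self._events = defaultdict(deque)     # node_id -> (timestamp, nbytes)

    def record(self, node_id: str, timestamp: float, nbytes: int) -> None:
        """Note that node_id finished a data packet of nbytes at timestamp."""
        self._events[node_id].append((timestamp, nbytes))

    def efficiency(self, node_id: str, now: float) -> float:
        """Bytes processed per second over the most recent window."""
        events = self._events[node_id]
        # Drop packets that fall outside the window.
        while events and events[0][0] < now - self.window_s:
            events.popleft()
        return sum(n for _, n in events) / self.window_s
```

A node that has reported no packets in the window simply gets an efficiency of zero, which a scheduler could treat as "unknown" rather than "slow"; that policy is left open here.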

Optionally, any two nodes communicate with each other using SOCKET communication;

The available resource information includes one or more of the following:

the total number of CPU cores, the number of currently available cores, the node's memory information, free memory information, response speed, and supported algorithms.
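
For illustration only, the available resource information listed above could be carried in a broadcast packet along the following lines. The field names, the units, and the use of JSON as the wire format are assumptions of this sketch and are not specified in the original text.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ResourceInfo:
    """Hypothetical payload for one available-resource broadcast."""
    node_id: str             # broadcast number of the sending node
    total_cores: int         # total number of CPU cores
    available_cores: int     # number of currently available cores
    total_memory_mb: int     # memory information of the node
    free_memory_mb: int      # free memory information
    response_ms: float       # response speed (unit assumed)
    algorithms: list = field(default_factory=list)  # supported algorithms

    def to_packet(self) -> bytes:
        """Serialize to bytes suitable for sending over a socket."""
        return json.dumps(asdict(self)).encode("utf-8")

    @classmethod
    def from_packet(cls, packet: bytes) -> "ResourceInfo":
        """Rebuild the record from a received broadcast packet."""
        return cls(**json.loads(packet.decode("utf-8")))
```

Because the text says the information includes "one or more" of these items, a real payload might make most fields optional; they are all required here to keep the sketch short.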

In a second aspect, an embodiment of the present invention further provides a server. The server is one node of a distributed reconstruction system used for medical image reconstruction in medical imaging equipment; the server is located in a broadcast domain and broadcasts available resource information to the other nodes in the distributed reconstruction system. The server includes:

a distributed computing processing module, configured to communicate with the other nodes in the distributed reconstruction system, receive the local reconstruction task transmitted by the imaging device connected to the server, and, based on the resource allocation strategy, the available resource information of the current node, and the available resource information of the other nodes in the broadcast domain, split the data of the local reconstruction task and create threads and/or processes that process the split data;

a reconstruction module, configured to call, through the distributed service interface of the distributed reconstruction system, the threads and/or processes of the at least one node determined by the distributed computing processing module to process the split local reconstruction task and complete the reconstruction;

The resource allocation strategy is used to determine, according to the processing information of the local reconstruction task, one or more nodes capable of processing the local reconstruction task, and to split the local reconstruction task according to the selected nodes and the processing efficiency and available resource information of each selected node.

Optionally, the distributed computing processing module includes:

a network communication unit, configured to communicate with the other nodes in the distributed reconstruction system;

a computing resource management unit, configured to periodically obtain the available resource information of the other nodes in the distributed reconstruction system, and to maintain, in the server's internal storage, the data structure of the available resource information of every node in the broadcast domain and the node efficiency information table of every node;

a computing resource dynamic utilization unit, configured to determine one or more nodes capable of processing the local reconstruction task based on the resource allocation strategy, the available resource information of the current node, and the available resource information of the other nodes in the broadcast domain;

a data splitting and merging unit, configured to split the local reconstruction task according to all the selected nodes and the processing efficiency and available resource information of each of them, and to distribute the pieces to the selected nodes for reconstruction processing.
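
One possible reading of the data splitting step is a proportional split, sketched below. The weighting of each node by `cores * efficiency`, the function name `split_data`, and the handling of the rounding remainder are all assumptions for illustration; the original text only states that the split depends on the selected nodes, their processing efficiency, and their available resource information.

```python
def split_data(total_bytes: int, nodes: dict) -> dict:
    """Split total_bytes across nodes in proportion to cores * efficiency.

    nodes maps node_id -> (available_cores, efficiency).
    Returns node_id -> number of bytes assigned to that node.
    """
    weights = {nid: cores * eff for nid, (cores, eff) in nodes.items()}
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("no selected node has usable capacity")

    # Integer share per node, rounded down.
    shares = {
        nid: int(total_bytes * w / total_weight) for nid, w in weights.items()
    }
    # Hand the rounding remainder to the highest-weight node so that
    # every byte of the task is assigned exactly once.
    leftover = total_bytes - sum(shares.values())
    strongest = max(weights, key=weights.get)
    shares[strongest] += leftover
    return shares
```

For example, `split_data(1000, {"local": (4, 2.0), "b": (2, 1.0)})` assigns 800 bytes to the local node and 200 to node `b`, since their weights are 8 and 2.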

Optionally, the network communication unit is configured to communicate using SOCKET communication, or alternatively using HTTPSOCKET communication;

the computing resource management unit is configured to send the available resource information and reconstruction task information of this node to the broadcast domain in the form of broadcasts, and to receive the broadcast packets sent by other nodes in the broadcast domain;

the reconstruction task information includes information about local reconstruction tasks or about reconstruction tasks distributed by other nodes;

the computing resource dynamic utilization unit is configured to, after a local reconstruction task is received and the reconstruction algorithm is determined, obtain the available resource information that this node can use and the available computing resource information of the other nodes in the broadcast domain; determine, according to the local reconstruction concurrency limit and the intra-domain reconstruction concurrency limit, the available resource information of the local server and of the other nodes in the broadcast domain; select all the nodes that will process the local reconstruction task; and create multiple threads and/or processes for processing the local reconstruction task;

the data splitting and merging unit is configured to receive the threads and/or processes sent by the computing resource dynamic utilization unit, the available resource information of all the selected nodes, and the processing efficiency of each selected node; to split the local reconstruction task; and to determine the size of the data allocated to each selected node for distribution;

the threads and/or processes that process the split data run partly on the local node and partly on other nodes in the broadcast domain.

Optionally, the server further includes:

the computing resource dynamic utilization unit creates the maximum number of threads for processing the local reconstruction task based on the available resource information of all the selected nodes.

In a third aspect, an embodiment of the present invention further provides a distributed reconstruction system based on cloud equipment, including a plurality of servers according to any implementation of the second aspect. Each server acts as one node, and together they form a decentralized broadcast domain; each server corresponds to one or more imaging devices located in hospitals.

(3) Beneficial effects

In the method of the embodiment of the present invention, each node in the broadcast domain broadcasts its available resource information in real time or periodically, so that when a node has a local reconstruction task, the task can be processed jointly by one or more nodes. This effectively improves the processing efficiency of local reconstruction tasks and reduces waiting time. At the same time, the node that received the local reconstruction task is preferred for processing it, so resources are distributed sensibly and the idle resources in the distributed reconstruction system are put to use.

In addition, in the embodiment of the present invention, the service on each node in the broadcast domain holds the dynamic computing resource information of all nodes in its broadcast domain. The domain formed by all the nodes is therefore decentralized, and nodes can be flexibly added to or removed from the domain at run time without affecting the overall distributed computing function.

Description of the Drawings

FIG. 1 is a schematic flowchart of the distributed reconstruction resource management method based on cloud equipment provided by an embodiment of the present invention;

FIG. 2 is a schematic flowchart of the distributed reconstruction resource management method based on cloud equipment provided by another embodiment of the present invention;

FIG. 3 is a schematic structural diagram of the distributed reconstruction system based on cloud equipment provided by an embodiment of the present invention;

FIG. 4 is a schematic diagram of the time slices used by a node in the distributed reconstruction system provided by an embodiment of the present invention;

FIG. 5 is a schematic diagram of the data packets into which a task to be reconstructed is split, provided by an embodiment of the present invention.

Detailed Description

To better explain the present invention and make it easier to understand, the present invention is described in detail below through specific embodiments with reference to the accompanying drawings.

With years of development in cloud computing and the gradual spread of the new generation of communication technologies represented by 5G, using powerful cloud computing resources and high-speed networks to perform remote reconstruction has become a feasible technical solution.

That is, many medical imaging devices such as PET-CT and PET will no longer be equipped with local workstations and will instead use dedicated workstations in the cloud. Image data processing is done entirely by cloud computing. This introduces the problem of unbalanced utilization of computing resources.

For example, suppose Company A has sold and installed 100 devices, so there are 100 corresponding cloud workstations. Because these 100 devices are spread across different hospitals in different regions, patient volumes and busy periods differ. In any given period, some of the 100 nodes (i.e., cloud workstations) will inevitably be busy while others are idle. Some nodes are even idle most of the time because they serve few patients. In this situation, the computing resources of idle nodes can be used to take over part of the computing tasks of busy nodes, which increases data processing speed, returns results to the imaging device sooner, and improves data processing performance.

The terms used in the embodiments of the present invention are explained as follows:

Node: a service in the cloud that reconstructs the data of an imaging device, i.e., a cloud computing workstation used for image reconstruction computation. In the embodiments of the present invention, all such workstations are called nodes (i.e., cloud devices).

Broadcast number: a unique identifier specific to each node. The service that runs on each node to process image reconstruction tasks identifies itself with this number so that services on other nodes can recognize it.

Broadcast domain: a broadcast domain is a set of one group or one class of broadcast numbers. The service on each node can send information to one or more broadcast domains by broadcasting.

Local reconstruction concurrency limit: a set of values that specifies the maximum number of CPU cores the service can use locally on the node and the corresponding maximum concurrency.

Intra-domain reconstruction concurrency limit: a set of values that specifies the maximum number of CPU cores, and the corresponding maximum concurrency, that the service can use within its broadcast domain in addition to the local node.

Service: a program execution mode. The functions described in this embodiment run on the nodes as services. The services mentioned in the following embodiments all refer to functionality that runs on a node in the form of a service.

Time slice: the interval at which the task assignment of a node is adjusted during reconstruction task planning, as shown in Figure 5. Within one time slice, the way a particular node is used (number of CPU cores occupied, size of data sent) stays unchanged. Because the nodes do not start and finish reconstruction tasks at the same time, the time slices in the embodiment of the present invention are planned separately for each node. If the time slice is too short, thread tasks do not have enough time to finish, and the algorithm spends a large amount of time on dynamic task planning. If the time slice is too long, node resources may be occupied by non-local reconstruction tasks for a long time, so that local tasks cannot get priority use of the local node's computing resources. In the embodiment of the present invention, the time slice information of each node is determined based on the node processing efficiency of that node.

Embodiment 1

As shown in FIG. 1 and FIG. 2, an embodiment of the present invention provides a distributed reconstruction resource management method based on cloud equipment. In this embodiment, the method is applied to a distributed reconstruction system used for medical image reconstruction in medical imaging equipment (such as PET-CT, PET, or CT). The distributed reconstruction system is configured with a plurality of nodes (i.e., cloud devices) that implement a data reconstruction program; all nodes are located in one broadcast domain of the distributed reconstruction system and periodically broadcast their available resource information. The method includes:

S10. The first node periodically acquires its own available resource information and broadcasts it to the other nodes in the broadcast domain, and receives the available resource information broadcast by each node in the broadcast domain.

Specifically, the service on each node may broadcast available resource information to the broadcast domain at a set interval (for example, every 1 s, 2 s, or 3 s). The available resource information may include the computing resource information of this node, such as the number of CPU cores, the CPU load, and memory usage.

It should be noted that the first node in the above step also receives the available resource information broadcast by the other nodes in the broadcast domain. By broadcasting available resource information, the service on each node in the broadcast domain obtains the dynamic computing resource information of all nodes in the broadcast domain in real time. As a result, the nodes in the broadcast domain are decentralized, and nodes can be flexibly added to or removed from the broadcast domain at run time without affecting the overall distributed computing function.
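
The decentralized bookkeeping described above can be illustrated with a small in-memory peer table. This is a sketch under stated assumptions: the class name `PeerTable` and, in particular, the rule that a node is treated as having left the domain after `timeout_s` seconds without a broadcast are introduced here for illustration; the original text does not say how departed nodes are detected.

```python
class PeerTable:
    """Hypothetical per-node view of the other nodes in the broadcast domain."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s   # assumed expiry for silent nodes
        self._peers = {}             # node_id -> (last_seen, resource info)

    def on_broadcast(self, node_id: str, info: dict, now: float) -> None:
        """Store or refresh a node's entry; new nodes join implicitly."""
        self._peers[node_id] = (now, info)

    def live_peers(self, now: float) -> dict:
        """Return node_id -> info for nodes heard from within the timeout."""
        self._peers = {
            nid: (seen, info)
            for nid, (seen, info) in self._peers.items()
            if now - seen <= self.timeout_s
        }
        return {nid: info for nid, (seen, info) in self._peers.items()}
```

Since every node keeps its own table and no node acts as a registry, adding or removing a node needs no coordination beyond the periodic broadcasts themselves.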

S20. The first node periodically checks whether a local reconstruction task has been received.

Typically, each node in the broadcast domain corresponds to one or more imaging devices installed on site at hospitals and receives the data uploaded by those imaging devices for reconstruction. The service in each node reconstructs the received data. In this embodiment, the service in the node reconstructs a received task to be reconstructed using the resources in the broadcast domain.

S30. If the first node receives a local reconstruction task, then according to the resource allocation strategy, the current available resource information of the first node, and the available resource information of the other nodes in the broadcast domain, it splits the data of the local reconstruction task and creates threads and/or processes that process the split data, so that the local reconstruction task is completed in a distributed manner;

The resource allocation strategy is used to determine, according to the processing information of the local reconstruction task, one or more nodes capable of processing the local reconstruction task, and to split the local reconstruction task according to the selected nodes and the processing efficiency and available resource information of each selected node.

After a set of data awaiting reconstruction, together with its reconstruction method and reconstruction parameters, is sent from the imaging device at the hospital to the corresponding node, the service obtains, from the local node and from other nodes in the broadcast domain, the available resource information for carrying out the reconstruction task, subject to the local reconstruction concurrency limit and the intra-domain reconstruction concurrency limit. If the available resources in the broadcast domain cannot reach the upper bound set by the intra-domain reconstruction concurrency limit, the service obtains the maximum computing resources the broadcast domain can provide (i.e., the available resource information). All the available resource information is then provided to the service in the first node that carries out the reconstruction task (which may be called the raw data reconstruction service).

For example, if the first node is currently providing computing resources to other nodes in the broadcast domain and therefore cannot reach the upper bound of the local reconstruction concurrency limit, the service obtains the maximum computing resources available in the first node. That is, the computing resources in the first node are provided preferentially to the local reconstruction task received by the first node. If the computing resources in the broadcast domain cannot reach the upper bound set by the intra-domain reconstruction concurrency limit, the service obtains the maximum available resource information the broadcast domain can provide and passes it to the raw data reconstruction service in the first node.
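
The resource acquisition rule described above (local resources first, then the rest of the broadcast domain, each capped by its own concurrency limit) might be sketched as follows. This sketch reduces each concurrency limit to a single core count and visits peers in order of most free cores first; both simplifications, along with the function name `acquire_cores` and the `"local"` key, are assumptions made for illustration.

```python
def acquire_cores(local_free: int, local_limit: int,
                  peers_free: dict, domain_limit: int) -> dict:
    """Plan how many cores to use on each node for one local task.

    local_free   : cores currently free on the first node
    local_limit  : local reconstruction concurrency limit (cores)
    peers_free   : node_id -> cores currently free on other nodes
    domain_limit : intra-domain reconstruction concurrency limit (cores)
    Returns node_id -> cores, with the first node under the key "local".
    """
    # The first node has the highest priority: take what it can offer,
    # up to the local limit, even if that is less than the limit.
    plan = {"local": min(local_free, local_limit)}

    remaining = domain_limit
    # Assumed ordering: peers with the most free cores are used first.
    for node_id, free in sorted(peers_free.items(), key=lambda kv: -kv[1]):
        if remaining <= 0:
            break
        take = min(free, remaining)
        if take > 0:
            plan[node_id] = take
            remaining -= take
    return plan
```

Under this reading, the number of worker threads the raw data reconstruction service creates would be `sum(plan.values())`, which never exceeds the local limit plus the intra-domain limit and falls below it when the domain cannot supply enough free cores.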

At this point, the raw data reconstruction service in the first node creates threads and/or processes according to the obtained available resource information and processes the raw data to be reconstructed in parallel.

In this embodiment, the maximum number of threads that the raw data reconstruction service can create matches the local reconstruction concurrency limit plus the intra-domain reconstruction concurrency limit. The maximum number of threads is determined by the available resource information in the broadcast domain that the raw data reconstruction service in the first node can obtain.
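The resource acquisition rule described above can be sketched as follows. This is a minimal illustration, assuming that both limits are expressed as thread counts and that free resources are counted in CPU cores; the names `plan_thread_budget`, `local_limit`, and `domain_limit` are hypothetical and do not come from the patent.

```python
def plan_thread_budget(local_free, domain_free, local_limit, domain_limit):
    """Return (local_threads, domain_threads) for one reconstruction task.

    local_free   -- cores currently free on the first node (assumed unit)
    domain_free  -- dict: node id -> free cores offered by other nodes
    local_limit  -- local reconstruction concurrency limit
    domain_limit -- intra-domain reconstruction concurrency limit
    """
    # Local resources serve the local task first, capped by the local limit.
    # If some cores are lent out, only the free ones are used.
    local_threads = min(local_free, local_limit)

    # If the domain cannot reach its limit, take the maximum it can offer.
    domain_threads = min(sum(domain_free.values()), domain_limit)
    return local_threads, domain_threads


local, remote = plan_thread_budget(
    local_free=6, domain_free={"B": 4, "C": 2}, local_limit=8, domain_limit=16
)
max_threads = local + remote  # upper bound on threads for this task
```

Under these assumptions the thread count can never exceed the sum of the two limits, which matches the bound stated above.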

Based on the above description, it can be understood that the method of this embodiment may further include the following step S40, which is not shown in the figure:

S40: After a selected node finishes processing the local reconstruction task, it updates its internal available resource information and broadcasts it;

wherein, when the first node determines the one or more nodes capable of processing the local reconstruction task, the first node itself has the highest priority.

In this embodiment, each node stores the available resource information and the node processing efficiency of every node in the broadcast domain; the node processing efficiency is the amount of data processed by one thread in one time slice. That is, the raw data reconstruction service creates threads according to the obtained available resource information and processes the data to be reconstructed in parallel. The maximum number of threads that the raw data reconstruction service can create is determined as described above.

In this embodiment, any two nodes in the broadcast domain communicate using SOCKET communication; alternatively, any two nodes communicate using HTTPSOCKET communication.

The available resource information (which may also be called computing resources or computing resource information) includes one or more of the following: the total number of CPU cores, the number of currently available cores, the memory information of the node, free memory information, response speed, and supported algorithms.

It can be understood that, while the service in each node reconstructs data, the service also needs to decide the segmentation granularity of the data to be reconstructed (that is, the raw data of the local reconstruction task received by the first node). Granularity here refers to the length of each data segment after the raw data is segmented.

In this embodiment, the segmentation algorithm of the first node may determine the segmentation granularity of the local reconstruction task based on segmentation and merging efficiency and on single-task processing time. If the granularity is too small, the service in each node spends a large share of its resources on segmenting the raw data and merging the computation results. If the granularity is too large, a single data segment takes too long to process, so the computing resources of nodes in the broadcast domain are occupied by other nodes for long periods, and a node may be unable to obtain its local computing resources in time when a local reconstruction task arrives.
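The trade-off above can be made concrete with a simple model. The following sketch is an assumption for illustration only, not the patent's algorithm: it treats the split and merge cost as a fixed time per segment, treats processing time as segment size divided by node speed, and returns the range of segment sizes that satisfies both constraints. All parameter names are hypothetical.

```python
def granularity_range(speed_bps, overhead_s, max_task_s, max_overhead_ratio):
    """Return (min_bytes, max_bytes) for one data segment.

    speed_bps          -- node processing speed in bytes per second
    overhead_s         -- assumed split + merge cost per segment, in seconds
    max_task_s         -- longest time one segment may occupy a thread
    max_overhead_ratio -- largest acceptable overhead / processing-time ratio
    """
    # Too small: overhead_s / (size / speed) must not exceed the ratio.
    lower = overhead_s * speed_bps / max_overhead_ratio
    # Too large: size / speed must not exceed max_task_s.
    upper = speed_bps * max_task_s
    if lower > upper:
        raise ValueError("no segment size satisfies both constraints")
    return int(lower), int(upper)


bounds = granularity_range(
    speed_bps=1_000_000, overhead_s=0.0625, max_task_s=2.0, max_overhead_ratio=0.125
)
```

Any size inside the returned range keeps merge overhead bounded while still releasing borrowed cores within `max_task_s`.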

The reconstruction resource management service of this embodiment can act as middleware between the post-reconstruction program and the operating system. The post-reconstruction program does not know whether a thread executing in parallel runs on the CPU of the local node or on the CPU of another node in another cloud device. Through its distributed computing algorithm, the service integrates cloud nodes and makes full use of the computing resources of idle nodes, which increases post-reconstruction speed.

Embodiment 2

This embodiment is a system that, when cloud computing workstations are used to execute a hospital's local PET or PET-CT reconstruction tasks, makes full use of the computing resources of the cloud workstations to achieve load balancing and increase reconstruction speed.

The server of this embodiment, that is, the cloud device, is a node in a distributed reconstruction system for medical image reconstruction in medical imaging equipment. The service (that is, the program) in the node reconstructs PET-CT data. The server is located in a broadcast domain and broadcasts its available resource information to the other nodes in the distributed reconstruction system. All of the nodes together constitute the distributed reconstruction system of this embodiment.

As shown in FIG. 3, the server of this embodiment may include a distributed computing processing module A10 and a reconstruction module A20.

Specifically, the distributed computing processing module A10 is used to communicate with the other nodes in the distributed reconstruction system, to periodically check for local reconstruction tasks, and, based on the resource allocation strategy, the available resource information of the current node, and the available resource information of the other nodes in the broadcast domain, to segment the data of the local reconstruction task and create the threads, processes, and the like that process the segmented data;

The reconstruction module A20 is used to call, through the distributed service interface of the distributed reconstruction system, the threads, processes, and the like of the at least one node determined by the distributed computing processing module, so as to process the segmented local reconstruction task and complete the reconstruction.

It should be noted that the reconstruction module A20 may borrow the functions of the reconstruction module in an existing cloud workstation. It carries out the local reconstruction task by calling the distributed service interfaces of the computing resources, such as the processes and threads, created by the distributed computing processing module A10.

The resource allocation strategy of this embodiment is used to determine, according to the processing information of the local reconstruction task, one or more nodes capable of processing that task, and to segment the local reconstruction task according to the selected nodes, their processing efficiency, and their available resource information.

In a possible implementation, the distributed computing processing module A10 may include a network communication unit A11, a computing resource management unit A12, a computing resource dynamic utilization unit A13, and a data segmentation and merging unit A14.

The network communication unit A11 is used to communicate with the other nodes in the distributed reconstruction system;

The computing resource management unit A12 is used to periodically obtain the available resource information of the other nodes in the distributed reconstruction system. In this embodiment, the computing resource management unit A12 maintains several types of tables, for example a hash table of each node's available resource information updated in real time, a record table of each node's processing efficiency updated in real time, and a data packet summary table giving the number of data packets each node is currently processing. In other embodiments, the tables maintained by the computing resource management unit A12 may be a data packet information table and a node efficiency information table corresponding to the task to be reconstructed; this embodiment does not limit the tables, which can be adjusted according to actual needs.

For example, the data packet information table contains the items of information carried in each data packet header, together with the number of data packets recorded in the table. There are multiple data packet information tables: each data processing node has its own, which is used to analyze that node's processing efficiency. The data packet information table of this embodiment may be a linked list. Inserting and deleting nodes at the head and tail of a linked list is very efficient, which suits table data that changes in real time.

The node efficiency information table includes, but is not limited to, the following information: node name, node broadcast number, the node's data processing time over the past period, the size of the data segments the node processed over the past period, the node's highest data processing speed, and the node's lowest data processing speed. The "past period" here means a period counted backward from the current time; it can be set according to the actual situation and is generally 60 seconds.
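One way to hold such a table entry is sketched below. It is an illustrative assumption rather than the patent's data layout: the class name `NodeEfficiency`, its field names, and the choice of `collections.deque` as the head/tail-efficient list are all hypothetical, and samples are assumed to be recorded in time order.

```python
from collections import deque
from dataclasses import dataclass, field

WINDOW_S = 60.0  # length of the "past period"; 60 s is the typical value


@dataclass
class NodeEfficiency:
    node_name: str
    broadcast_id: int
    # Each sample is (finished_at, processing_seconds, n_bytes).
    samples: deque = field(default_factory=deque)

    def record(self, finished_at, seconds, n_bytes):
        # New samples are appended at the tail.
        self.samples.append((finished_at, seconds, n_bytes))

    def prune(self, now):
        # Old samples are dropped from the head once they leave the window.
        while self.samples and self.samples[0][0] < now - WINDOW_S:
            self.samples.popleft()

    def speeds(self, now):
        self.prune(now)
        return [b / s for _, s, b in self.samples if s > 0]

    def max_speed(self, now):
        return max(self.speeds(now), default=0.0)

    def min_speed(self, now):
        return min(self.speeds(now), default=0.0)


entry = NodeEfficiency("A", 1)
entry.record(0.0, 2.0, 2000)    # falls outside the window at now=105
entry.record(50.0, 1.0, 4000)
entry.record(100.0, 4.0, 4000)
fastest = entry.max_speed(now=105.0)
slowest = entry.min_speed(now=105.0)
```

Passing `now` explicitly keeps the sketch deterministic; a real service would presumably read a clock.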

Based on the information in the tables it maintains, the computing resource management unit A12 can calculate the processing efficiency information of each node for the corresponding time period and update the node efficiency information table with it in real time.

In a possible implementation, the node processing efficiency may also be calculated by the computing resource management unit according to formula (1) and formula (2) below and updated in real time.

In this embodiment, the node processing efficiency of each node includes the node's average processing time for one data packet and its average processing speed;

Formula (1) gives the average processing time:

[Formula (1): equation image BDA0003591701480000131, not reproduced in this text]

Formula (2) gives the average processing speed:

[Formula (2): equation image BDA0003591701480000132, not reproduced in this text]

Here A denotes the node, n is the total number of data packets processed, T_k is the processing time of the k-th data packet, and D_k is the number of bytes in the k-th data packet.
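The equation images for formulas (1) and (2) do not survive in this text. One reading that is consistent with the symbol definitions above is given here as an assumption, not as the patent's exact expressions; the symbols \bar{T}_A and \bar{V}_A are introduced only for this illustration.

```latex
% (1) average processing time of node A over n packets (assumed form)
\bar{T}_A = \frac{1}{n} \sum_{k=1}^{n} T_k

% (2) average processing speed of node A (assumed form: total bytes over total time)
\bar{V}_A = \frac{\sum_{k=1}^{n} D_k}{\sum_{k=1}^{n} T_k}
```

An alternative reading of formula (2) would average the per-packet speeds D_k / T_k instead; the text alone does not settle which form the original uses.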

The computing resource dynamic utilization unit A13 is used to determine, based on the resource allocation strategy, the available resource information of the current node, and the available resource information of the other nodes in the broadcast domain, one or more nodes capable of processing the local reconstruction task;

The data segmentation and merging unit A14 is used to segment the local reconstruction task according to the selected nodes, their processing efficiency, and their available resource information, and to distribute the segments to the selected nodes for processing.

Embodiment 3

To better explain the functions of the units in the distributed computing processing module A10 of Embodiment 2, each unit is described in detail below with reference to FIG. 3 and FIG. 4.

The network communication unit A11 is responsible for communication between nodes. Because all nodes are in the cloud, the nodes can form a virtual subnet. While the virtual subnet works normally, SOCKET communication can be used. When the virtual subnet fails, communication can switch automatically to HTTPSOCKET; that is, the network communication unit A11 supports both HTTPSOCKET and SOCKET communication. This arrangement supports the high availability and scalability of the network communication unit.
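The automatic switch between the two communication modes can be sketched as below. This is a hedged illustration: the class `FallbackChannel` is hypothetical, the two transports are injected as plain callables rather than real SOCKET or HTTPSOCKET implementations, and modeling a virtual subnet failure as `OSError` is an assumption.

```python
class FallbackChannel:
    """Send via a primary transport; switch to a fallback when it fails."""

    def __init__(self, primary_send, fallback_send):
        self._primary = primary_send    # stand-in for the SOCKET transport
        self._fallback = fallback_send  # stand-in for the HTTPSOCKET transport
        self.using_fallback = False

    def send(self, payload: bytes):
        if not self.using_fallback:
            try:
                return self._primary(payload)
            except OSError:
                # Subnet failure (modeled as OSError): switch transports.
                self.using_fallback = True
        return self._fallback(payload)


def broken_socket(payload):
    # Simulates a failed virtual subnet.
    raise ConnectionError("virtual subnet unreachable")


channel = FallbackChannel(broken_socket, lambda p: ("http", p))
reply = channel.send(b"resource-info")
```

The sketch stays on the fallback once it has switched; whether and when to return to the primary transport is not specified in the text.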

The computing resource management unit A12 is used to manage the available computing resource information (that is, the available resource information) of all computing nodes in the distributed reconstruction system.

In practical applications, the available computing resource information includes one or more of the following: node identifier, node broadcast number, node IP, node service port number, the node's CPU information, the node's response speed, supported algorithm versions, and so on.

The CPU information of the node includes one or more of the following: the total number of CPU cores and the number of currently available cores; the memory information of the node; and so on.

The memory information of the node includes one or more of the following: total memory size, current free memory size, and so on.

The computing resource management unit A12 maintains, for each node, a table, data structure, or hash table containing the above available computing resource information, for use by the distributed computing processing module A10.

Typically, the computing resource management unit A12 maintains this table dynamically, for example refreshing it at a fixed interval (usually under 1 second) so that it reflects the real-time state of each node. In a specific implementation, the table can be a hash table, which speeds up element access and modification.

The data segmentation and merging unit A14 is responsible for segmenting the data of the local reconstruction task and then distributing the segments to the nodes that will process them. After segmenting the data, the unit A14 wraps each segment in a data packet whose structure has two parts: the packet header (PackageHead) and the data (PackageData). The packet header includes, but is not limited to, the following information: data size (the length of the data in bytes), data index (the position of this data within the complete post-reconstruction raw data), sending node, receiving node, data send time (at the sending node), data recovery time (at the sending node), data receive time (at the receiving node), data send time (at the receiving node), and data processing time (empty when sent; filled in by the node that receives and processes the data). The data follows the packet header, as shown in FIG. 4.
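The packet layout described above can be sketched with a fixed-size binary header. Only the names PackageHead and PackageData come from the text; the field names, field widths, byte order, and the `UNSET` sentinel used for fields that are empty at send time are assumptions made for this illustration.

```python
import struct
from dataclasses import dataclass, astuple

HEAD_FMT = "<IQHH5d"  # assumed wire layout: little-endian, no padding
UNSET = -1.0          # assumed marker for a time field not yet filled in


@dataclass
class PackageHead:
    data_size: int                       # length of PackageData in bytes
    data_index: int                      # offset within the full raw data
    sender: int                          # sending node
    receiver: int                        # receiving node
    sent_by_sender: float = UNSET
    recovered_by_sender: float = UNSET
    received_by_receiver: float = UNSET
    sent_by_receiver: float = UNSET
    processing_seconds: float = UNSET    # filled in by the processing node


def encode(head: PackageHead, data: bytes) -> bytes:
    # The header comes first, followed directly by the data.
    return struct.pack(HEAD_FMT, *astuple(head)) + data


def decode(packet: bytes):
    n = struct.calcsize(HEAD_FMT)
    head = PackageHead(*struct.unpack(HEAD_FMT, packet[:n]))
    return head, packet[n:n + head.data_size]


head = PackageHead(data_size=4, data_index=1024, sender=1, receiver=2,
                   sent_by_sender=12.5)
packet = encode(head, b"abcd")
decoded_head, decoded_data = decode(packet)
```

A fixed header length lets the receiver find the data without parsing, and `data_index` lets the sender put results back in order when merging.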

In this embodiment, the distributed computing processing module A10 fine-tunes the amount of data and the data packet size allocated to each node according to that node's processing efficiency, so as to improve overall processing efficiency.
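A simple way to allocate data in line with node efficiency is a proportional split. The sketch below is an assumption for illustration, not the patent's tuning rule: it weights each node by its thread count multiplied by its per-thread speed, and the function name `split_by_capacity` is hypothetical.

```python
def split_by_capacity(total_bytes, nodes):
    """Split total_bytes across nodes in proportion to their capacity.

    nodes -- dict: node id -> (threads, bytes_per_second_per_thread)
    """
    weights = {n: t * v for n, (t, v) in nodes.items()}
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("no usable capacity")

    # Integer share for each node, rounded down.
    shares = {n: total_bytes * w // total_weight for n, w in weights.items()}

    # Give the rounding remainder to the highest-capacity node.
    best = max(weights, key=weights.get)
    shares[best] += total_bytes - sum(shares.values())
    return shares


shares = split_by_capacity(1000, {"A": (4, 100), "B": (2, 100), "C": (1, 100)})
```

With equal per-thread speeds the split follows the thread counts; a slower node with the same thread count would receive proportionally less data.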

Embodiment 4

For the distributed reconstruction system described in the above embodiments, its operation is described in detail below with reference to steps 01 to 07.

Step 01: Each node in the distributed reconstruction system broadcasts, at a specified frequency, its own computing resource information and/or data to be reconstructed to the broadcast domain.

At the same time, each node also receives the broadcast packets carrying the same kind of information sent by the other nodes in the distributed reconstruction system.

Each node reads the information in the broadcast packets and updates the corresponding locally maintained information table (such as the hash table described above).
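Step 01 can be sketched as a pair of helpers: one builds a broadcast payload and one applies a received payload to the local hash table. The JSON encoding, the field names, and the function names are assumptions chosen for readability; the text does not specify the broadcast packet format, and the network send itself is left out.

```python
import json


def make_broadcast(node_id, total_cores, free_cores, free_mem_mb):
    """Build the payload a node would broadcast (assumed JSON format)."""
    return json.dumps({
        "node": node_id,
        "total_cores": total_cores,
        "free_cores": free_cores,
        "free_mem_mb": free_mem_mb,
    }).encode()


def apply_broadcast(table, packet, now):
    """Update the locally maintained hash table from one broadcast packet."""
    info = json.loads(packet.decode())
    info["seen_at"] = now          # lets stale entries be detected later
    table[info["node"]] = info     # the latest broadcast replaces the old entry
    return table


resources = {}
apply_broadcast(resources, make_broadcast("B", 16, 6, 2048), now=1.0)
apply_broadcast(resources, make_broadcast("B", 16, 2, 1024), now=1.5)
```

Keying the table by node means each periodic broadcast overwrites the previous state for that node, so lookups always see the most recent report.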

Step 02: When the distributed computing processing module A10 in a node is called (that is, when a node in the distributed reconstruction system receives a local reconstruction task), the computing resources (that is, the available resource information) that the distributed computing processing module A10 can use are obtained. This means determining the computing resources currently usable on the local node and the usable computing resources of the other nodes in the broadcast domain. For example, the distributed computing processing module A10 of the first node, which received the local reconstruction task, determines from the hash table it maintains the computing resources usable on the local node and the usable computing resources of the other nodes in the broadcast domain.

The distributed computing processing module A10 of the first node determines the local computing resources and the broadcast-domain computing resources actually to be used, according to the local reconstruction concurrency limit and the intra-domain reconstruction concurrency limit. The broadcast domain here is the distributed reconstruction system.

Step 03: The distributed computing processing module A10 of the first node creates a reconstruction process containing multiple threads.

In this embodiment, when the process is created, the local reconstruction task and the information on the usable computing resources are handed to the data segmentation and merging unit A14, which determines the size of the data allocated to each node;

The segmented data is then handed to the threads for processing. Some of these threads run on the local node, and some run on other nodes in the broadcast domain. Because this embodiment is based on broadcasting, the distributed computing processing module A10 of the first node can obtain the reconstruction progress of each node in real time.

If, at the moment it starts a new local reconstruction task, the local node is providing computing resources to other nodes and that computation has not finished, the local node temporarily uses only its idle computing resources.

If other nodes are occupying the local node's computing resources during reconstruction, then once such a remote thread ends, the computing resources it occupied are reclaimed for local reconstruction. Other nodes learn from the broadcast packets that the computing resources this node can offer are gradually decreasing.

If, during reconstruction, the local node learns from another node's broadcast packets that the other node has started a local reconstruction task, it gradually reduces the number of threads allocated to that node.

Once another node starts its own local reconstruction task, it gradually reclaims computing resources for local use, so the computing resources it shares with other nodes gradually decrease. The local node therefore uses this information to automatically reduce the number of threads allocated to that node.

The above logic is planned and executed within one time slice. When the time slice ends, the distributed computing processing module A10 plans the allocation of reconstruction threads for the next time slice according to the computing resources of the local node and of the nodes in the broadcast domain at that time. The length of the time slice can be set according to the situation and is usually between 500 and 5000 milliseconds.
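The per-time-slice replanning can be sketched as a function that is called once at the end of each slice. This is a minimal illustration under stated assumptions: `offered` is taken to be the number of cores each node currently offers to this node according to its latest broadcast, and the function name `replan` and the constant `TIME_SLICE_MS` are hypothetical.

```python
TIME_SLICE_MS = 1000  # assumed value inside the 500 to 5000 ms range


def replan(assigned, offered):
    """Return the thread assignment for the next time slice.

    assigned -- dict: node id -> threads used on that node in this slice
    offered  -- dict: node id -> cores the node currently offers, taken
                from its latest broadcast packet
    """
    plan = {}
    for node, threads in assigned.items():
        # Never keep more threads on a node than it currently offers, so a
        # node that reclaims cores for its own task sees its load shrink.
        plan[node] = min(threads, offered.get(node, 0))
    return plan


next_plan = replan({"B": 4, "C": 2}, {"B": 1, "C": 2})
```

Because the plan is recomputed every slice, a node that reclaims its cores step by step produces the gradual reduction described above.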

The above embodiments have different emphases and can be combined without contradiction; in practical applications they can be selected and configured as needed.

It should be noted that, in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The words first, second, third, and so on are used for convenience of expression only and do not indicate any order; they can be understood as part of the element names.

In addition, it should be noted that, in this specification, descriptions using the terms "one embodiment", "some embodiments", "embodiment", "example", "specific example", or "some examples" mean that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Schematic uses of these terms in this specification do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Those skilled in the art may also combine the different embodiments or examples described in this specification, and the features of those embodiments or examples, provided they do not contradict each other.

Although preferred embodiments of the present invention have been described, those skilled in the art may make further changes and modifications to these embodiments once they understand the basic inventive concept. The claims should therefore be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.

Clearly, those skilled in the art can make various modifications and variations to the present invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to cover them as well.

Claims (9)

1. A distributed reconstruction resource management method based on cloud equipment is characterized in that the distributed reconstruction resource management method is applied to a distributed reconstruction system for medical image reconstruction in medical image equipment, a plurality of nodes for realizing a data reconstruction program are configured in the distributed reconstruction system, all the nodes are located in a broadcast domain of the distributed reconstruction system, and available resource information is periodically broadcast in a broadcast mode, and the method comprises the following steps:
S10, the first node periodically acquires the available resource information in the first node and broadcasts the available resource information to other nodes in the broadcast domain, and receives the available resource information broadcast by each node in the broadcast domain;
S20, the first node judges whether a local reconstruction task sent by the imaging equipment connected with the first node is received;
S30, if the first node receives the local reconstruction task, then according to the resource allocation strategy, the available resource information of the current first node and the available resource information of other nodes in the broadcast domain, the first node performs data segmentation on the local reconstruction task and creates threads that process the segmented data, so as to complete the local reconstruction task in a distributed manner;
the resource allocation strategy is used for determining more than one node capable of processing the local reconstruction task according to the processing information of the local reconstruction task, and segmenting the local reconstruction task according to the selected node, the processing efficiency of the node and the available resource information of the node.
2. The method of claim 1, further comprising:
after all the nodes that process the local reconstruction task finish processing the data distributed to them, the available resource information in each such node is updated and broadcast;
when a first node determines more than one node capable of processing the local reconstruction task, the priority of the first node is the highest priority.
3. The method of claim 1, wherein each node maintains a data structure for available resource information of each node in the broadcast domain and a node efficiency information table of each node;
the information in the node efficiency information table of each node stored in the first node is the processing efficiency of each node calculated by the first node based on the data packet processed by each node in the specified time.
4. The method of claim 1, wherein communication between any two nodes is performed in a SOCKET communication manner;
the available resource information includes one or more of:
the total number of cores of the CPU and the number of currently available cores; memory information of the node, idle memory information, response speed and supported algorithm.
5. A server belonging to a node of a distributed reconstruction system for reconstructing medical images in medical imaging equipment, the server being located in a broadcast domain and broadcasting available resource information to other nodes in the distributed reconstruction system in a broadcast manner, the server comprising:
the distributed computing processing module is used for communicating with other nodes in the distributed reconstruction system, receiving a local reconstruction task transmitted by the image equipment connected with the server, and performing data segmentation on the local reconstruction task and creating a thread and/or a process of segmented data based on a resource allocation strategy, available resource information of the current node and available resource information of other nodes in a broadcast domain;
the reconstruction module is used for calling the thread and/or process of at least one node determined by the distributed computing processing module according to a distributed service interface in the distributed reconstruction system to process the segmented local reconstruction task so as to complete reconstruction;
the resource allocation strategy is used for determining more than one node capable of processing the local reconstruction task according to the processing information of the local reconstruction task, and segmenting the local reconstruction task according to the selected node, the processing efficiency of the node and the available resource information of the node.
6. The server according to claim 5, wherein the distributed computing processing module comprises:
the network communication unit is used for communicating with other nodes in the distributed reconstruction system;
the computing resource management unit is used for periodically acquiring available resource information of other nodes in the distributed reconstruction system, and for maintaining the data structure, stored in the server, of the available resource information of each node in the broadcast domain and the node efficiency information table of each node;
the computing resource dynamic utilization unit is used for determining more than one node capable of processing the local reconstruction task based on a resource allocation strategy, available resource information of the current node and available resource information of other nodes in a broadcast domain;
and the data segmentation and combination unit is used for segmenting the local reconstruction task according to all the selected nodes, the processing efficiency of each node in all the nodes and the available resource information, and distributing the segmented local reconstruction task to the selected nodes for reconstruction processing.
7. The server according to claim 6, wherein the network communication unit is configured to communicate in a SOCKET communication mode or in an HTTPSOCKET communication mode;
the computing resource management unit is used for sending the available resource information and the reconstruction task information of the node to the broadcast domain in a broadcast mode, and for receiving broadcast packets sent by other nodes in the broadcast domain;
the reconstruction task information comprises information of a local reconstruction task or information of a reconstruction task distributed by other nodes;
the computing resource dynamic utilization unit is used for, after receiving the local reconstruction task and determining a reconstruction algorithm, acquiring the available resource information of the local node and the available computing resource information of other nodes in the broadcast domain; selecting all nodes for processing the local reconstruction task according to the available resource information of the local server, the available resource information of other nodes in the broadcast domain, the local reconstruction concurrency limit and the reconstruction concurrency limit in the broadcast domain; and creating a plurality of threads and/or processes for processing the local reconstruction task;
the data segmentation and combination unit is used for receiving the threads and/or processes sent by the computing resource dynamic utilization unit, the available resource information of all the selected nodes and the processing efficiency of each selected node, segmenting the local reconstruction task, and determining the size of the data to be distributed to each selected node;
the threads and/or processes that process the partitioned data run partly at the local node and partly at other nodes in the broadcast domain.
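As an illustration only, the broadcast exchange of available resource information and reconstruction task information could be sketched as follows. The JSON packet layout, the field names, the port number and the use of UDP broadcast are assumptions; the claim only states that SOCKET communication and a broadcast mode are used.

```python
import json
import socket

BROADCAST_PORT = 50000  # hypothetical port, not taken from the patent


def encode_packet(node_id: str, available_cores: int, task_info: dict) -> bytes:
    """Serialize a node's resource and task information (assumed JSON layout)."""
    payload = {"node": node_id, "cores": available_cores, "task": task_info}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def decode_packet(data: bytes) -> dict:
    """Parse a packet received from another node in the broadcast domain."""
    return json.loads(data.decode("utf-8"))


def broadcast(data: bytes, port: int = BROADCAST_PORT) -> None:
    """Send one packet to the local broadcast address over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(data, ("255.255.255.255", port))


# Round trip without touching the network: encode, then decode.
packet = encode_packet("node-1", 6, {"origin": "local", "units": 40})
decoded = decode_packet(packet)
print(decoded)
```

The `task` field here carries either information about a local reconstruction task or about a task distributed by another node; the `origin` key used to tell them apart is hypothetical.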
8. The server of claim 7, wherein:
the computing resource dynamic utilization unit creates a maximum number of threads for processing the local reconstruction task based on the available resource information of all the selected nodes.
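As an illustration only, one way to derive a maximum thread count from the available resources of the selected nodes is sketched below. The interpretation of the two concurrency limits (a cap on local threads and a separate cap on threads placed on other nodes), the preference for peers with the most free slots, and the `plan_threads` name are assumptions not stated in the claims.

```python
def plan_threads(available: dict[str, int], local_id: str,
                 local_limit: int, domain_limit: int) -> dict[str, int]:
    """Map each selected node to the number of threads to create on it.

    available: node id -> free worker slots (assumed resource measure).
    local_limit: assumed cap on threads running on the local node.
    domain_limit: assumed cap on threads placed on other nodes in total.
    """
    plan = {}
    local = min(available.get(local_id, 0), local_limit)
    if local > 0:
        plan[local_id] = local
    budget = domain_limit
    # Fill peers with the most free slots first until the budget is used up.
    for node_id, free in sorted(available.items(), key=lambda kv: kv[1], reverse=True):
        if node_id == local_id or budget <= 0:
            continue
        take = min(free, budget)
        if take > 0:
            plan[node_id] = take
            budget -= take
    return plan


available = {"local": 8, "peer-a": 6, "peer-b": 3}
threads = plan_threads(available, "local", local_limit=4, domain_limit=7)
print(threads, sum(threads.values()))
```

Nodes that receive no threads are left out of the plan, so its keys double as the list of selected nodes.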
9. A distributed reconstruction system based on cloud equipment, comprising a plurality of servers according to any one of claims 5 to 8, wherein each server is used as a node to form a decentralized broadcast domain, and each server corresponds to more than one imaging device located in a hospital.
CN202210379557.5A 2022-04-12 2022-04-12 A distributed reconstruction resource management method based on cloud devices Active CN114721828B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210379557.5A CN114721828B (en) 2022-04-12 2022-04-12 A distributed reconstruction resource management method based on cloud devices

Publications (2)

Publication Number Publication Date
CN114721828A true CN114721828A (en) 2022-07-08
CN114721828B CN114721828B (en) 2025-07-11

Family

ID=82244466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210379557.5A Active CN114721828B (en) 2022-04-12 2022-04-12 A distributed reconstruction resource management method based on cloud devices

Country Status (1)

Country Link
CN (1) CN114721828B (en)

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101741912A (en) * 2009-12-30 2010-06-16 中兴通讯股份有限公司 Method, network apparatus and distributed network system for processing computation task
CN103336719A (en) * 2013-06-04 2013-10-02 江苏科技大学 Distribution rendering system and method in P2P mode
CN105359490A (en) * 2013-03-18 2016-02-24 皇家Kpn公司 User authentication in a cloud environment
US20160188594A1 (en) * 2014-12-31 2016-06-30 Cloudera, Inc. Resource management in a distributed computing environment
CN105912399A (en) * 2016-04-05 2016-08-31 杭州嘉楠耘智信息科技有限公司 Task processing method, device and system
CN107025136A (en) * 2016-01-29 2017-08-08 中兴通讯股份有限公司 A kind of decentralization resource regulating method and system
CN109962947A (en) * 2017-12-22 2019-07-02 阿里巴巴集团控股有限公司 Method for allocating tasks and device in a kind of peer-to-peer network
CN109992387A (en) * 2019-04-01 2019-07-09 北京邮电大学 A task processing method, device and electronic device for terminal coordination
CN110659110A (en) * 2018-06-28 2020-01-07 厦门本能管家科技有限公司 Block chain based distributed computing method and system
CN110728363A (en) * 2018-06-29 2020-01-24 华为技术有限公司 Task processing method and device
CN111309491A (en) * 2020-05-14 2020-06-19 北京并行科技股份有限公司 Operation cooperative processing method and system
CN111381948A (en) * 2020-02-04 2020-07-07 北京贝思科技术有限公司 Distributed computing task processing method and equipment and electronic equipment
CN112882827A (en) * 2019-11-29 2021-06-01 伊姆西Ip控股有限责任公司 Method, electronic device and computer program product for load balancing
CN113742075A (en) * 2021-09-07 2021-12-03 北京百度网讯科技有限公司 Task processing method, device and system based on cloud distributed system
CN114035945A (en) * 2021-10-29 2022-02-11 深圳市晨北科技有限公司 Computing power resource allocation method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
梁毅;侯颖;陈诚;金翊;: "面向大数据流式计算的任务管理技术综述", 计算机工程与科学, no. 02, 15 February 2017 (2017-02-15) *

Also Published As

Publication number Publication date
CN114721828B (en) 2025-07-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant