CN101018192A

CN101018192A - Grid workflow virtual service scheduling method based on the open grid service architecture

Info

Publication number: CN101018192A
Application number: CNA2006101652474A
Authority: CN
Inventors: 都志辉; 王曼; 程志力; 朱穗晖
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2006-12-15
Filing date: 2006-12-15
Publication date: 2007-08-15
Anticipated expiration: 2026-12-15
Also published as: CN100508501C

Abstract

The grid workflow virtual service scheduling method based on the open grid service architecture belongs to the workflow scheduling technology method under the grid. The workflow virtual service PVS composed of the scheduling scheme generation module uses a large number of idle resources in the grid environment to continuously find better scheduling strategies for the defined PVS, update and store them in the scheduling scheme candidate pool of PVS; and then use the scheduling scheme When the specific task instance arrives, the scheme execution module selects the current better scheduling scheme from the selected pool of PVS scheduling schemes for job scheduling. Only when no suitable scheduling scheme can be found, it communicates with the scheduling scheme generation module. Wait for the arrival of the new scheduling scheme within the specified time. The invention has the advantages of reducing repeated scheduling, accelerating resource reservation, reducing job completion time, increasing job throughput, and improving resource utilization.

Description

Grid Workflow Virtual Service Scheduling Method Based on Open Grid Service Architecture

技术领域technical field

本发明涉及一种网格环境下的工作流调度方法，具体是通过预调度、QoS保障以及不断选优机制来进行网格任务调度。The invention relates to a workflow scheduling method in a grid environment, in particular to grid task scheduling through pre-scheduling, QoS guarantee and continuous optimization mechanism.

背景技术Background technique

随着网格技术的进步和需求的增加，网格工作流(Grid Workflow)成为网格中一类重要的应用。一个网格工作流应用可以看成是由一组工作段(网格中的服务)组成，并且这些服务需要按照事先预订好的顺序执行，从而完成一个复杂的目标。一般来说，一个工作流是包含一组工作段的抽象形式，而采用何种算法将服务和网格资源匹配起来得到一条可执行的工作流实例并投入运行是当前研究的一个重要问题，称为网格工作流的调度。在进行调度算法设计时，为了提高性能，往往需要和具体网格工作流应用相结合。在不同的应用环境下(比如同构/异构，数据密集型应用/计算密集型应用等)，不同工作流调度算法的性能会很不相同。由于网格资源的动态性和复杂性，网格工作流调度一般采用动态调度的方法。在调度时，通常希望选择最优或者近似最优的调度方案。将工作流的工作段和网格资源进行匹配找到最优的调度方案，是NP-hard问题，所以在调度中一般采用启发式调度算法：如Min_Min heuristic，Max_min heuristic，sufferage heuristic等算法。但是启发式算法一般情况下只能找到较好解，有时也会和最有解相差很远。With the progress of grid technology and the increase of demand, grid workflow (Grid Workflow) has become an important class of grid applications. A grid workflow application can be regarded as composed of a group of work segments (services in the grid), and these services need to be executed in a pre-booked order to complete a complex goal. Generally speaking, a workflow is an abstract form that includes a group of work segments, and what algorithm to use to match services and grid resources to obtain an executable workflow instance and put it into operation is an important issue in current research, called Scheduling for grid workflows. When designing a scheduling algorithm, in order to improve performance, it often needs to be combined with specific grid workflow applications. In different application environments (such as homogeneous/heterogeneous, data-intensive applications/computing-intensive applications, etc.), the performance of different workflow scheduling algorithms will be very different. Due to the dynamics and complexity of grid resources, grid workflow scheduling generally adopts a dynamic scheduling method. When scheduling, it is usually desirable to select an optimal or nearly optimal scheduling scheme. It is an NP-hard problem to match the work segments of the workflow with the grid resources to find the optimal scheduling scheme. Therefore, heuristic scheduling algorithms are generally used in scheduling: such as Min_Min heuristic, Max_min heuristic, and suffering heuristic algorithms. However, the heuristic algorithm can only find a better solution under normal circumstances, and sometimes it is far from the best solution.

除此之外，当前所用的网格工作流调度方法都有一个共同假设，即对于同一个工作流的每个执行实例的调度都是独立的。但在很多实际应用中，由于工作流结构的相同，在一定时间范围内的多个执行实例有时可以共享同一种调度方案，这样就不需要每次都对一个执行实例进行一次调度算法调用。如果对于一个工作流而言，对其调度的时间占整个任务完成时间的比例很大，那么按照当前调度模式，对于多个执行实例，大量时间浪费在了重复调度上，仅靠对调度算法的改进，性能提高也是有限。In addition, the currently used grid workflow scheduling methods all have a common assumption, that is, the scheduling of each execution instance of the same workflow is independent. However, in many practical applications, due to the same workflow structure, multiple execution instances within a certain time range can sometimes share the same scheduling scheme, so that there is no need to call a scheduling algorithm for an execution instance every time. If for a workflow, the scheduling time accounts for a large proportion of the entire task completion time, then according to the current scheduling mode, for multiple execution instances, a lot of time is wasted on repeated scheduling, only relying on the scheduling algorithm Improvement, performance improvement is also limited.

将网格QoS引入调度方法中去也是当前需要考虑的问题之一。当前的网格流水线调度中不少也引入了QoS保障机制，但是经常只从单个方面进行考虑，并且没有相对准确和灵活的根据不同应用选择QoS保障的机制。Introducing grid QoS into the scheduling method is also one of the current issues that need to be considered. Quite a few of the current grid pipeline scheduling also introduce QoS guarantee mechanisms, but they are often only considered from a single aspect, and there is no relatively accurate and flexible mechanism for selecting QoS guarantees according to different applications.

发明内容Contents of the invention

本发明的目的在于提出一种基于开放网格服务体系结构(OGSA)的网格工作流的调度方法，在于解决对于同一工作流的不同实例的重复调度问题，同时也兼顾了完成时间、容错性、安全性三个方面的QoS保障。The purpose of the present invention is to propose a grid workflow scheduling method based on the Open Grid Service Architecture (OGSA), to solve the problem of repeated scheduling for different instances of the same workflow, while also taking into account the completion time and fault tolerance , QoS guarantee in three aspects of security.

本发明的特征在于，依次会有以下步骤：The present invention is characterized in that following steps are arranged in sequence:

(1)初始化：在所述调度方案产生器和执行器中分别建立一个调度方案产生模块和调度方案执行模块。(1) Initialization: a scheduling scheme generation module and a scheduling scheme execution module are respectively established in the scheduling scheme generator and the executor.

(2)调度方案产生模块，判断是否有从调度执行模块发来的请求信号：若没有，该调度方案产生模块循环地为系统设定的所有PVS产生相应的调度方案。(2) The dispatching plan generation module judges whether there is a request signal sent from the dispatching execution module: if not, the dispatching plan generation module cyclically generates corresponding dispatching plans for all PVSs set by the system.

(3)调度方案执行器按照以下步骤根据收到的用户PVS执行命令选择具体调度方案并预约和执行该调度方案。(3) The scheduling scheme executor selects a specific scheduling scheme according to the received user PVS execution command according to the following steps, and reserves and executes the scheduling scheme.

总体来说，该调度方法将一个PVS可能的调度方案作为该PVS的共有信息共享，引入QoS保障，预调度，资源预约等机制，从而减少重复调度，加快资源预约，降低作业完成时间，增加作业吞吐量和提高资源利用率。In general, the scheduling method uses a possible scheduling scheme of PVS as the common information sharing of the PVS, introduces QoS guarantee, pre-scheduling, resource reservation and other mechanisms, thereby reducing repeated scheduling, speeding up resource reservation, reducing job completion time, and increasing job throughput and improve resource utilization.

在附图4和附图5中显示了本发明中的调度方案的性能优势。其中对比调度方案采用了当前较为通用并且效率较好的启发式调度算法Min_Min算法。从图4和图5中可以看出，如果调度时间占实际执行时间的比例越大，按照本发明中的方案进行调度的性能越好。The performance advantages of the scheduling scheme in the present invention are shown in Fig. 4 and Fig. 5 . Among them, the comparative scheduling scheme adopts the Min_Min algorithm, which is currently more general and more efficient heuristic scheduling algorithm. It can be seen from FIG. 4 and FIG. 5 that if the ratio of the scheduling time to the actual execution time is larger, the performance of scheduling according to the scheme of the present invention is better.

附图说明Description of drawings

图1为本发明的总体PVS调度方法流程图：(a)调度方案产生模块执行流程图；(b)调度方案执行模块执行流程图。Fig. 1 is the flow chart of the overall PVS scheduling method of the present invention: (a) execution flow chart of dispatching plan generation module; (b) execution flow chart of dispatching plan execution module.

图2为本发明的调度模块结构图：输入参数为：(PVS名称，数据源)。Fig. 2 is a structural diagram of the scheduling module of the present invention: the input parameters are: (PVS name, data source).

图3为本发明中调度执行模块方法图。Fig. 3 is a method diagram of the scheduling execution module in the present invention.

图4为本发明中调度方法和对比调度方法相比较时间节省率图。Fig. 4 is a time saving rate diagram comparing the scheduling method in the present invention and the comparative scheduling method.

图5为本发明中调度方法和对比调度方法的资源利用率比较图。FIG. 5 is a comparison chart of resource utilization between the scheduling method of the present invention and the comparative scheduling method.

图6为本发明中的具体实例天文数据处理中的快视分析PVSFig. 6 is the quick view analysis PVS in the concrete example astronomical data processing in the present invention

具体实施方式Detailed ways

本发明主要是为了解决网格工作流调度中对于同一个工作流(本发明中称为PVS)的不同实例的重复调度问题，同时加入完成时间，容错性，安全性三个方面的QoS保障。将调度方案产生和调度方案执行分离开来，使得调度方案的产生能充分利用网格系统中的闲散资源，在作业到来之前就能够产生调度方案，使得PVS实例到来时尽快投入执行，节省了传统调度方法中的查找匹配时间。同时，每次调度方案的产生是在以往调度方案产生的基础上，如此就能够根据网格环境的变化逐步靠近最优解。The present invention is mainly to solve the repeated scheduling problem of different instances of the same workflow (referred to as PVS in the present invention) in the grid workflow scheduling, and at the same time add QoS guarantees in three aspects of completion time, fault tolerance and security. Separating the generation of the scheduling plan from the execution of the scheduling plan, so that the generation of the scheduling plan can make full use of the idle resources in the grid system, and the scheduling plan can be generated before the job arrives, so that the PVS instance can be put into execution as soon as possible when it arrives, saving the traditional Lookup match time in dispatch method. At the same time, the generation of each scheduling scheme is based on the generation of previous scheduling schemes, so that it can gradually approach the optimal solution according to the change of the grid environment.

在调度方案执行部分，采用了预约机制，即为一个PVS中的所有工作段按照选定的调度方案进行资源的预约(这里“资源”是指网格系统中可以执行该工作段的部署在系统中的具体服务和其所需的硬件资源和软件资源)，这样提高了系统中的资源利用率，减少了作业的等待时间。同时，预约机制按照PVS不同的等级给PVS预约不同能力的资源，减少了阻塞的概率，并进一步提高了资源利用率。In the execution part of the scheduling plan, a reservation mechanism is adopted, that is, to reserve resources for all work segments in a PVS according to the selected scheduling plan (here "resource" refers to the grid system that can execute the deployment of the work segment in the system The specific services in the system and the required hardware and software resources), which improves the resource utilization in the system and reduces the waiting time of the job. At the same time, the reservation mechanism reserves resources of different capabilities for PVS according to different levels of PVS, which reduces the probability of blocking and further improves resource utilization.

该PVS调度方案适用于所有基于开放网格服务体系结构的网格系统中。主要由调度方案产生模块(1)和调度方案执行模块(2)两个模块组成。The PVS scheduling scheme is applicable to all grid systems based on the open grid service architecture. It is mainly composed of two modules: a scheduling scheme generation module (1) and a scheduling scheme execution module (2).

(1)调度方案产生模块(1) Scheduling scheme generation module

●功能：负责为系统中每一个PVS产生相应的调度方案。●Function: Responsible for generating corresponding scheduling schemes for each PVS in the system.

●接口：该模块有主要有两个接口，如附图2所示，一是与调度方案执行模块的接口(10)，二是调用网格中的具体资源的接口(11)。●Interface: This module mainly has two interfaces, as shown in Figure 2, one is the interface (10) with the dispatching scheme execution module, and the other is the interface (11) for invoking specific resources in the grid.

(2)调度方案执行模块(2) Scheduling scheme execution module

●功能：负责具体的实例执行。●Function: responsible for the implementation of specific examples.

●接口：如附图2所示，调度方案执行模块有用户接口(9)，可以接收用户执行PVS的命令；与调度方案产生模块的接口(10)。●Interface: as shown in Figure 2, the dispatching plan execution module has a user interface (9), which can receive the user's command to execute PVS; and the interface (10) with the dispatching plan generation module.

PVS(Pipeline Virtual Service)是该调度方法中一个核心概念。它为一个六元组，具体描述如下：PVS (Pipeline Virtual Service) is a core concept in this scheduling method. It is a six-tuple, and the specific description is as follows:

PVS＝<工作段集合(3)，依赖关系描述(4)，输入参数描述(5)，输出参数描述(6)，PVS权限(7)，调度方案备选池(8)>。PVS=<work segment set (3), dependency description (4), input parameter description (5), output parameter description (6), PVS authority (7), scheduling scheme candidate pool (8)>.

工作段集合(3)，为一组有限工作段的集合。它描述了一个PVS由哪些基本工作段组成。这里的工作段都是抽象的概念，并没有和网格中的具体资源相关联，只有在PVS实例执行时，才被调度到相关资源上具体执行。The working segment set (3) is a set of limited working segments. It describes which basic work segments a PVS consists of. The work segments here are abstract concepts, not associated with specific resources in the grid, and are scheduled to be executed on related resources only when the PVS instance is executed.

依赖关系描述(4)，为一个有向无环图，用于描述(3)中的各个活动的依赖关系。图中每一个节点代表PVS中的一个工作段。有向边代表工作段之间的依赖关系(数据依赖或控制依赖)。若工作段A的执行依赖于工作段B的执行结束，节点Ai有一条指向节点的边。没有入边的节点为源节点，没有出边的节点为目的节点。Dependency description (4), which is a directed acyclic graph, is used to describe the dependency of each activity in (3). Each node in the figure represents a working segment in PVS. Directed edges represent dependencies (data dependencies or control dependencies) between work segments. If the execution of work segment A depends on the completion of the execution of work segment B, node Ai has an edge pointing to the node. A node without an incoming edge is a source node, and a node without an outgoing edge is a destination node.

PVS权限(7)代表PVS的权限。(7)的值越大，该PVS的优先级越高。一般来说，只有权限不低于(7)的用户才能调用该服务(该属性可选)。The PVS authority (7) represents the authority of the PVS. The larger the value of (7), the higher the priority of the PVS. Generally speaking, only users with authority not lower than (7) can call this service (this property is optional).

调度方案备选池(8)为PVS的调度方案集合，包含Num(该数值在算法实现时用户给出)个有序记录。一个调度方案＝<P，makespan，security，performance>。P为一个具体的调度方案。makespan为在此种调度方案下的作业预测完成时间，即从用户提交作业给PVS到得到结果的时间。Security和performance为在这种调度方案下的安全等级和容错等级。一般来说，Security和performance的值越高，该调度方案的这两个方面的性能越好。Makespan、security和performance这三个值由用户具体采用的调度算法给出。The scheduling scheme candidate pool (8) is a PVS scheduling scheme set, including Num (this value is given by the user when the algorithm is implemented) ordered records. A scheduling scheme = <P, makespan, security, performance>. P is a specific scheduling scheme. Makespan predicts the completion time of the job under this scheduling scheme, that is, the time from when the user submits the job to PVS to when the result is obtained. Security and performance are the security level and fault tolerance level under this scheduling scheme. In general, the higher the values of Security and performance, the better the performance of these two aspects of the scheduling scheme. The three values of Makespan, security and performance are given by the specific scheduling algorithm adopted by the user.

输入参数描述(5)，输出参数描述(6)可以为WSDL描述的文件，用于描述该PVS执行时所需的参数信息，也可以为专门负责输入输出的网格服务(Web服务)。The input parameter description (5) and the output parameter description (6) can be a file described by WSDL, used to describe the parameter information required when the PVS is executed, or a grid service (Web service) responsible for input and output.

一个PVS一旦被配置好，就可以作为一个网格上的服务被共享。(3)(4)(5)(6)(7)是配置时决定的，而(8)中的记录随着调用此PVS作业的增多而变化。Once a PVS is configured, it can be shared as a service on the grid. (3)(4)(5)(6)(7) are determined during configuration, and the records in (8) change with the increase of calls to this PVS job.

综上，一个PVS可以看作是对一条工作流的封装，其中，PVS中的(3)和(4)合起来就是网格工作流的基本定义，(5)是对工作流中无前驱节点的输入参数的抽象描述，(6)是对工作流中无后继节点的输出参数的抽象描述，而(8)相当于一个工作流的候选调度方案集合。To sum up, a PVS can be regarded as an encapsulation of a workflow, where (3) and (4) in PVS are the basic definition of a grid workflow, and (5) is the definition of a workflow without a predecessor node (6) is an abstract description of the output parameters of no successor nodes in the workflow, and (8) is equivalent to a set of candidate scheduling schemes for a workflow.

PVS的基本组成如上说述，但是在具体应用时，表述方法可以有所不同。The basic composition of PVS is described above, but in specific applications, the expression method can be different.

PVS调度方法的具体流程图由图1所示。调度方案产生模块和调度方案执行模块式两个相对独立的线程，或者说也可以在两个不同的处理器上执行。The specific flowchart of the PVS scheduling method is shown in FIG. 1 . The scheduling plan generation module and the scheduling plan execution module are two relatively independent threads, or they can also be executed on two different processors.

对图1(a)中流程具体解释如下。在调度方案产生模块中，一旦网格中有空闲的计算资源，则调度方案产生模块就会执行。该部分循环更新已经定义好的PVS的调度方案备选池(8)。具体如下：设该系统有N个PVS，依次为PVS1，PVS2，...，PVS N。对于每一个PVS i(1＜＝i＜＝N)，调度方案产生模块(1)采用已经定义好的资源调度匹配算法，得到一个调度方案。该资源调度匹配算法可以根据具体应用情况用户选择已有的启发式算法或搜索算法。通用情况是用户提供一个算法资源池，根据不同情况系统选取不同的算法。得到一个调度方案后，按照用户定义计算出该调度方案的makespan、security和performance值，和该PVS已有的调度方案比较，如果PVS的(8)中存放的调度方案数小于最大存贮个数，则直接把新调度方案加入，否则，替换出一个比新调度方案差的调度方案。之后对下一个PVS进行处理。如果调度方案执行模块有需求，则给调度方案产生模块发出信号，从而调度方案产生模块在执行完当前PVS处理后产生中断，优先处理调度方案执行模块发出的请求，执行完毕后，从刚才中断处继续执行。The specific explanation of the process in Figure 1(a) is as follows. In the scheduling plan generating module, once there is idle computing resource in the grid, the scheduling plan generating module will be executed. This part cyclically updates the already defined PVS scheduling scheme candidate pool (8). The details are as follows: Suppose the system has N PVS, PVS1, PVS2, ..., PVS N in turn. For each PVS i (1<=i<=N), the scheduling scheme generation module (1) adopts the defined resource scheduling matching algorithm to obtain a scheduling scheme. In this resource scheduling matching algorithm, the user can select an existing heuristic algorithm or a search algorithm according to specific application conditions. The general situation is that the user provides an algorithm resource pool, and the system selects different algorithms according to different situations. After obtaining a scheduling plan, calculate the makespan, security and performance values of the scheduling plan according to the user definition, and compare it with the existing scheduling plan of the PVS, if the number of scheduling plans stored in (8) of the PVS is less than the maximum storage number , then directly add the new scheduling scheme, otherwise, replace a scheduling scheme that is worse than the new scheduling scheme. Then the next PVS is processed. If the scheduling plan execution module has a demand, it will send a signal to the scheduling plan generation module, so that the scheduling plan generation module will generate an interrupt after executing the current PVS processing, and give priority to the request sent by the scheduling plan execution module. Continue to execute.

对图1(b)中的流程具体解释如下。在调度方案执行模块中，一旦有作业到来，也就是有PVS调用请求，则调度方案执行模块从该PVS的(8)中按照makespan、security和performance值依次判断调度方案当前是否能用，也就是该调度方案所涉及到的资源当前是否可用。只要有一个资源不可用(例如，提供该资源的站点坏掉)，则此调度方案就失效，从该PVS(8)中删除。直到找到一个当前可用的调度方案为止。此时产生一个PVS实例，将PVS中各个工作段与调度方案中的资源相匹配，同时进行整体预约。预约完毕后，开始执行该PVS实例。如果当前该PVS的(8)中的所有调度方案均不可用，则此时调度方案执行模块向调度方案产生模块发出中断信号，请求调度方案产生模块尽快为该PVS产生一个可用的调度方案，然后调度方案执行模块等待一个用户指定的时间，重新判断该PVS有没有可用的调度方案。The specific explanation of the flow in Fig. 1(b) is as follows. In the scheduling plan execution module, once a job arrives, that is, there is a PVS call request, the scheduling plan execution module judges whether the scheduling plan is currently available according to the values of makespan, security, and performance in (8) of the PVS, that is, Whether the resources involved in the scheduling scheme are currently available. As long as a resource is unavailable (for example, the site providing the resource is broken), the scheduling scheme becomes invalid and is deleted from the PVS (8). until a currently available schedule is found. At this time, a PVS instance is generated, and each work segment in the PVS is matched with the resources in the scheduling plan, and the overall reservation is made at the same time. After the reservation is completed, start to execute the PVS instance. If all scheduling schemes in (8) of the current PVS are not available, then the scheduling scheme execution module sends an interrupt signal to the scheduling scheme generation module at this time, and the request scheduling scheme generation module produces an available scheduling scheme for this PVS as soon as possible, and then The scheduling scheme execution module waits for a time specified by the user, and re-judges whether the PVS has an available scheduling scheme.

图2给出了采用该PVS调度方法的调度模块示例图。Figure 2 shows an example diagram of a scheduling module using the PVS scheduling method.

在调度方案执行模块，为了适应网格的分布性，采用了两层调度模式，即全局调度和PVS调度两层。如图3所示。其中，PVS调度可以有多个，分布在不同的服务器上。当用户命令提交到全局调度时，全局调度将其分配到合适的PVS调度上，然后该PVS调度进行具体的调度方案执行。In the scheduling plan execution module, in order to adapt to the distribution of the grid, a two-tier scheduling model is adopted, namely global scheduling and PVS scheduling. As shown in Figure 3. Among them, there may be multiple PVS scheduling, which are distributed on different servers. When a user command is submitted to the global scheduler, the global scheduler assigns it to an appropriate PVS scheduler, and then the PVS scheduler executes a specific scheduler.

以天文数据处理中的快视分析为例。按照处理流程配置的快视分析PVS如图6所示。天文望远镜数据采集具有连续性，在一定时间范围内需要反复调用该PVS。系统启动后，调度方案产生模块不断利用网格系统中的闲散资源为快视分析PVS产生调度方案。而一旦有采集好的数据包到来，则在调度方案执行模块产生新的快视分析PVS执行实例，从相应的调度方案备选池中选取合适的调度方案进行资源预约和执行。Take snapshot analysis in astronomical data processing as an example. The snapshot analysis PVS configured according to the processing flow is shown in Figure 6. The data collection of astronomical telescopes is continuous, and the PVS needs to be called repeatedly within a certain time range. After the system is started, the dispatching plan generating module continuously utilizes the idle resources in the grid system to generate a dispatching plan for Quick View Analysis PVS. Once the collected data packets arrive, a new quick-view analysis PVS execution instance will be generated in the scheduling plan execution module, and a suitable scheduling plan will be selected from the corresponding scheduling plan candidate pool for resource reservation and execution.

Claims

1. A grid workflow virtual service scheduling method based on the open grid service architecture, characterized in that the method is implemented in the scheduling plan generator and the scheduling plan executor in all grid systems based on the open grid service architecture The generator and the executor are both composed of independent network processors to form a scheduling device, and the method is executed sequentially according to the following steps:

Step (1) Initialize

Establish a scheduling scheme generation module and a scheduling scheme execution module in the scheduling scheme generator and the executor; respectively define a same workflow virtual service six-tuple in the two modules, expressed as:

PVS=<set of working segments, description of dependencies, description of input parameters, description of output parameters, PVS authority, pool of alternative scheduling schemes>, where:

The set of working segments is a limited set, described as a basic working segment of PVS, which is dispatched to related resources and given specific services for execution only when PVS is actually executed; here "resource" refers to the grid system that can execute The specific services deployed in the system and the hardware and software resources required by the work segment;

Dependency description, which is a directed acyclic graph, used to describe the data dependency or control dependency between the various components in the set of work segments; a node on the way represents a work segment in PVS, and directed edges represent each Dependencies between components, the source node has no incoming edges, and the destination node has no outgoing edges;

PVS authority, representing the authority of PVS, the larger the value, the higher the priority; this item is optional;

Scheduling scheme candidate pool is a collection of PVS scheduling schemes; it contains at most Num ordered scheduling schemes for users to choose; the value of Num can be set when configuring PVS. A scheduling scheme=<P, makespan, security, performance>; among them, P is the specific scheduling scheme of the PVS; makespan is the predicted completion time of the PVS instance under the scheduling scheme; security and performance are respectively under the scheduling scheme The higher the value, the better the performance; the selected three values of makespan, security and performance are given by the scheduling algorithm specifically adopted by the user, and the workflow is determined by the specific dependencies of the work segment composition;

Input parameter description, output parameter description, using the World Wide Web Service Description Language WSDL file released by W3C, used to describe the parameter information required for PVS execution;

The scheduling scheme generation module generates a corresponding scheduling scheme for each PVS in the system, and is provided with: an interface with the scheduling scheme execution module, calling the scheduling algorithm resource pool and the scheduling algorithm resource pool provided by the user including the heuristic scheduling algorithm and The interface of the scheduling algorithm corresponding to the PVS submitted by the user, and the interface for sending the generated PVS to the scheduling scheme candidate pool in the PVS six-tuple in the scheduling scheme execution module or selecting the scheduling scheme from the scheduling scheme candidate pool;

The dispatching plan execution module is responsible for the execution of the example proposed by the user; it is provided with: an interface for receiving user PVS execution commands, an interface for receiving the dispatching plan provided by the dispatching plan generation module, and sending an interrupt signal to the dispatching plan generation module, and An interface for requesting each block to generate a request signal for an available scheduling scheme for the PVS proposed by the user;

Step (2) dispatching scheme produces module, judges whether there is the request signal that sends from dispatching execution module; If not, this dispatching scheme produces module and produces corresponding dispatching scheme for all PVS that system is set cyclically, and its steps are as follows successively:

Step (2.1) judges whether the system has idle resources, if not, then judges whether it is over; if there are idle resources, then executes the next step;

Step (2.2) selects the next PVS in sequence

Step (2.3) A scheduling scheme obtained according to a preset resource scheduling matching algorithm including a heuristic algorithm or a search algorithm selected by the user according to specific conditions;

Step (2.4) Calculate the three values of makespan, security and performance according to the selected scheduling algorithm in step (2.3), and compare their three values with the specified QoS standard; If the number of scheduling schemes is less than the set maximum storage number, then directly add the scheduling scheme obtained in step (2.3) into the scheduling scheme candidate pool; if the scheduling scheme number in the scheduling scheme candidate pool has reached the maximum storage number number, then replace a scheduling scheme that is worse than the scheduling scheme described in step (2.3), and update the scheduling scheme alternative pool of PVS;

Step (2.5), judge whether to have cycled all PVS once, if have cycled, just judge whether to end, otherwise, continue to circulate, until there is interrupt request signal to send from scheduling execution module;

Step (3), the dispatching plan executor selects a specific dispatching plan and executes the dispatching plan according to the received user PVS execution order according to the following steps;

Step (3.1) the scheduling scheme executor selects the corresponding scheduling scheme for the PVS from the scheduling scheme candidate pool according to the received user PVS execution order;

Step (3.2) judges whether there is an available scheduling scheme;

If: in the selected scheduling plan, even if only one resource required for the execution of the work segment is currently unavailable, the scheduling plan is considered invalid and deleted from the scheduling plan candidate pool until an available scheduling plan is found;

Step (3.3) Reserving and matching each work segment in the PVS to be executed by the user with the required resources according to the selected scheduling scheme, forming a PVS instance, and executing the PVS instance;

Step (3.4) if all the scheduling schemes in the scheduling scheme candidate pool are unavailable, then at this moment, the scheduling scheme execution module sends an interrupt request signal including the user PVS identification to the scheduling scheme generation module, and at a time specified by the user Within, accept the new scheduling scheme, and re-judge whether it is applicable to the PVS sent by the user.

2. The grid workflow virtual service scheduling method based on the open grid service architecture according to claim 1, characterized in that, the scheduling scheme execution module is provided with a reservation mechanism, and different grades are provided according to different levels of PVS. A user's PVS reserves resources of different capabilities.

3. The grid workflow virtual service scheduling method based on the open grid service architecture according to claim 1 is characterized in that, the affiliated scheduling program execution module adopts a two-layer scheduling mode of global scheduling and PVS scheduling, and the PVS The scheduling module is located on multiple different servers. When the user's PVS execution command is submitted to the global scheduling module, the global scheduling module assigns it to the appropriate PVS scheduling module. After that, the PVS scheduling module executes the specific scheduling plan.