CN102609303B

CN102609303B - Slow-task dispatching method and slow-task dispatching device of Map Reduce system

Info

Publication number: CN102609303B
Application number: CN201210016143.2A
Authority: CN
Inventors: 段翰聪; 聂晓文; 刘彬; 严华兵; 唐棠
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2012-01-18
Filing date: 2012-01-18
Publication date: 2014-12-17
Anticipated expiration: 2032-01-18
Also published as: CN102609303A

Abstract

The present invention provides a slow-task scheduling method and device for a MapReduce system. The method includes: obtaining the computing capability value of each computing node in the MapReduce system, arranging the computing nodes into a slow node queue in descending order of computing capability value, and selecting the first M computing nodes in the queue as target computing nodes; obtaining the estimated computing capability value of each target computing node after it loads the slow task to be scheduled, arranging the target computing nodes and the other computing nodes into a new slow node queue in descending order of computing capability value, and presetting the Nth computing node counted forward from the tail of the slow node queue as the evaluation reference node, where N is a natural number; and, when the estimated computing capability value of a target computing node is greater than the computing capability value of the evaluation reference node, scheduling the slow task to be scheduled to that target computing node. The invention effectively suppresses jitter during slow-task scheduling.

Description

Slow task scheduling method and device for MapReduce system

Technical Field

The invention relates to computing technology, and in particular to a slow-task scheduling method and device for a MapReduce system.

Background

As a basic computing framework, MapReduce is widely used in Internet applications such as cloud computing environments. As cloud computing environments have developed, virtualization has been adopted progressively, so a single physical host may run multiple virtual machines. Because the computing capabilities of different physical hosts differ considerably, and the virtual machines on one physical host also differ considerably in performance, node heterogeneity in a MapReduce system is pronounced. Node heterogeneity means that different computing nodes process at different speeds (one virtual machine corresponds to one computing node). When computing tasks are scheduled to different computing nodes, the response time for obtaining results differs. When the master node of the MapReduce system finds that a computing node is executing a computing task too slowly and determines that the task is a slow task, it schedules the slow task to another computing node for simultaneous execution in order to improve processing speed. This is called slow-task scheduling.

Specifically, Hadoop, one implementation of the MapReduce system, schedules slow tasks as follows. Hadoop assumes that all computing nodes in the system process computing tasks at the same speed, defines a progress score between 0 and 1 for each computing task, and sets a fixed threshold. Whenever the progress score satisfies the threshold condition, the computing task is determined to be a slow task and slow-task scheduling is performed. Hadoop schedules slow tasks according to a nearest-node principle, that is, a slow task is scheduled to the neighboring node with the shortest physical transmission distance. This scheme has two drawbacks. First, too many computing tasks are identified as slow tasks, and the resulting excess of slow-task scheduling consumes considerable system resources. Second, under the nearest-node principle the neighboring node may run even more slowly after scheduling, so the slow task is scheduled yet again; the slow task is thus scheduled multiple times, which is scheduling jitter.

LATE, another implementation of the MapReduce system, optimizes the Hadoop scheme above. It specifies a slow-task scheduling ratio, for example 10%, so that only 10% of slow tasks can be scheduled, which prevents excessive slow-task scheduling from consuming system resources. LATE also defines a slow node queue according to the remaining time that computing tasks need to complete, designates the last 25% of nodes in that queue as slow nodes, and does not schedule slow tasks to those slow nodes, so as to avoid making nodes run even more slowly. In practice, however, LATE still does not solve the scheduling jitter problem. Even when a computing node outside the last 25% is selected for slow-task scheduling, the node's computing capability may drop so much after loading the slow task that it becomes slow; the scheduled task is then still a slow task and is scheduled again, producing the system jitter of repeated slow-task scheduling.

Summary of the Invention

A first aspect of the present invention provides a slow-task scheduling method for a MapReduce system that effectively suppresses jitter during slow-task scheduling in the MapReduce system.

Another aspect of the present invention provides a slow-task scheduling device for a MapReduce system that effectively suppresses jitter during slow-task scheduling in the MapReduce system.

The slow-task scheduling method for a MapReduce system provided by the present invention comprises:

obtaining the computing capability value of each computing node in the MapReduce system, where the computing capability value of a computing node is [formula not reproduced], $v_1, \dots, v_m$ denote the processing speeds of the computing tasks on the computing node, [symbol not reproduced] denotes the average processing speeds of the jobs to which those computing tasks respectively belong, and $m$ denotes the total number of computing tasks on the computing node;

arranging the computing nodes into a slow node queue in descending order of computing capability value, and selecting the first M computing nodes in the slow node queue, each selected computing node serving as a target computing node, where M is a natural number;

obtaining, for each of the M target computing nodes, the estimated computing capability value after the node loads the slow task to be scheduled, where the estimated computing capability is [formula not reproduced], $v_i$ denotes the processing speed of the slow task to be scheduled on the target computing node, and [symbol not reproduced] denotes the average processing speed of the job to which that slow task belongs;

arranging, according to the estimated computing capability values of the target computing nodes and the computing capability values of the computing nodes other than the target computing nodes, the target computing nodes and the other computing nodes into a new slow node queue in descending order of computing capability value, and presetting the Nth computing node counted forward from the tail of the slow node queue as the evaluation reference node, where N is a natural number;

scheduling the slow task to be scheduled to the target computing node when the estimated computing capability value of the target computing node is greater than the computing capability value of the evaluation reference node.

The slow-task scheduling device for a MapReduce system provided by the present invention comprises a parameter acquisition unit, a capability estimation unit, a queue arrangement unit, and a scheduling processing unit;

the parameter acquisition unit is configured to obtain the computing capability value of each computing node in the MapReduce system, where the computing capability value of a computing node is [formula not reproduced], $v_1, \dots, v_m$ denote the processing speeds of the computing tasks on the computing node, [symbol not reproduced] denotes the average processing speeds of the jobs to which those computing tasks respectively belong, and $m$ denotes the total number of computing tasks on the computing node;

the capability estimation unit is configured to select the first M computing nodes in the slow node queue produced by the queue arrangement unit, each selected computing node serving as a target computing node, where M is a natural number; and to obtain, for each of the M target computing nodes, the estimated computing capability value after the node loads the slow task to be scheduled, where the estimated computing capability is [formula not reproduced], $v_i$ denotes the processing speed of the slow task to be scheduled on the target computing node, and [symbol not reproduced] denotes the average processing speed of the job to which that slow task belongs;

the queue arrangement unit is configured to arrange the computing nodes into a slow node queue in descending order of computing capability value; and, according to the estimated computing capability values of the target computing nodes and the computing capability values of the computing nodes other than the target computing nodes, to arrange the target computing nodes and the other computing nodes into a new slow node queue in descending order of computing capability value;

the scheduling processing unit is configured to preset the Nth computing node counted forward from the tail of the slow node queue as the evaluation reference node, where N is a natural number; and to schedule the slow task to be scheduled to the target computing node when the estimated computing capability value of the target computing node is greater than the computing capability value of the evaluation reference node.

The technical effect of the slow-task scheduling method of the present invention is as follows. Before a slow task is scheduled to a target computing node, the method estimates the computing capability value that the target computing node would have after loading the slow task, and schedules the slow task to that node only when the estimated value is greater than the computing capability value of the evaluation reference node. This ensures that loading the slow task does not reduce the target computing node's computing capability too much, so the slow task is eliminated, repeated scheduling of the slow task is prevented, and jitter during slow-task scheduling is effectively suppressed.

The technical effect of the slow-task scheduling device of the present invention is as follows. Before a slow task is scheduled to a target computing node, the device estimates the computing capability value that the target computing node would have after loading the slow task, and schedules the slow task to that node only when the estimated value is greater than the computing capability value of the evaluation reference node. This ensures that loading the slow task does not reduce the target computing node's computing capability too much, so the slow task is eliminated, repeated scheduling of the slow task is prevented, and jitter during slow-task scheduling is effectively suppressed.

Brief Description of the Drawings

FIG. 1 is a schematic flowchart of an embodiment of the slow-task scheduling method for a MapReduce system according to the present invention;

FIG. 2 is a schematic structural diagram of an embodiment of the slow-task scheduling device for a MapReduce system according to the present invention.

Detailed Description

To make the description of the slow-task scheduling method of the embodiments clearer, the structure and working principle of a MapReduce system are first described briefly:

A MapReduce system usually includes one master node (master) and multiple computing nodes (slaves); the master node manages the computing nodes. The master node receives data computation requests from clients. The requested computation is called a job, and jobs can be of various types, for example data query jobs and data averaging jobs. The master node splits a job into multiple computing tasks and distributes them to the computing nodes, which perform the actual processing. The MapReduce system processes a job in two phases, the Map phase and the Reduce phase, so there are two types of computing task: Map tasks and Reduce tasks. The Map phase mainly distributes the split computing tasks to the computing nodes for processing, and the Reduce phase mainly aggregates the results from the computing nodes. When all computing nodes have finished their computing tasks, the master node aggregates the results and reports them to the client. During processing, heartbeats are exchanged between the computing nodes and the master node, and a computing node can report the progress of its computing tasks to the master node in its heartbeat messages. When a computing node has finished its computing tasks and is idle, it actively asks the master node to assign it computing tasks.

The embodiments of the present invention rest on two assumptions. First, the computing nodes in the MapReduce system are heterogeneous. Second, the jobs processed by the MapReduce system are heterogeneous: different types of job differ markedly and produce different amounts of data.

Building on the introduction above, the slow-task scheduling method and device of the embodiments are described below:

FIG. 1 is a schematic flowchart of an embodiment of the slow-task scheduling method for a MapReduce system according to the present invention. Note that steps 101 to 104 below merely enumerate the actions performed in the method and do not strictly limit the order in which they are executed. As shown in FIG. 1, the method may include:

101. Obtain the computing capability value of each computing node in the MapReduce system.

In this embodiment, the master node of the MapReduce system can obtain the computing capability value of each computing node. Optionally, the master node can obtain these values as follows:

Each computing node can calculate the processing speed of the computing tasks it handles. For example, if computing node A processes computing tasks a and b, and computing node B processes computing tasks c and d, then computing node A can calculate the processing speeds of tasks a and b, and computing node B can calculate the processing speeds of tasks c and d.

The processing speed is calculated differently for different types of computing task. For example, suppose computing tasks a and b processed on computing node A are Map tasks, and computing tasks c and d processed on computing node B are Reduce tasks. For a Map task, the processing speed is $v = p/t$, where $p$ is the amount of data of the computing task that the computing node has processed so far and $t$ is the time taken to process that amount of data. For a Reduce task, the processing speed is also $v = p/t$; because a Reduce operation generally has three stages (copy, sort, and reduce), the calculation depends on the stage: if the computing task is in the copy stage, then [formula not reproduced]; if it is in the sort stage, then [formula not reproduced]; if it is in the reduce stage, $t$ is the processing time of the computing task.
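The Map-task case $v = p/t$ can be sketched as follows. This is a minimal illustration only: the function name `processing_speed` and the choice of units are assumptions, and the Reduce-stage variants are not covered because their formulas are not reproduced in this text.

```python
def processing_speed(processed_amount: float, elapsed_time: float) -> float:
    """Return v = p / t for a Map task.

    processed_amount: amount of task data processed so far (p), in any unit.
    elapsed_time: time spent processing that amount (t), in any unit.
    """
    if elapsed_time <= 0:
        raise ValueError("elapsed_time must be positive")
    return processed_amount / elapsed_time


# Hypothetical figures: 512 units of data processed in 64 ticks.
speed_a = processing_speed(512.0, 64.0)
```

A computing node could report such a value for each of its tasks in its heartbeat message.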

Each computing node can report the processing speeds calculated above to the master node in its heartbeat messages. From these processing speeds, the master node can calculate the average processing speed of the job to which each computing task belongs.

For example, suppose computing tasks a and c above were split from job G1. That is, after receiving job G1 from a client, the master node splits G1 into computing tasks a and c and distributes them to computing nodes A and B for processing. The two computing nodes report the processing speeds of those tasks to the master node, which can then derive the average processing speed of job G1. The average processing speed is $\bar{v} = \frac{1}{n}\sum v$, where $v$ denotes the processing speed of each computing task into which the job is split and $n$ denotes the total number of computing nodes hosting those tasks. If computing node A reports a processing speed of $v_1$ for computing task a and computing node B reports $v_2$ for computing task c, the average processing speed of job G1 is $\bar{v}_{G1} = (v_1 + v_2)/2$. Likewise, if computing tasks b and d were split from job G2, with processing speeds $v_3$ and $v_4$ respectively, the average processing speed of job G2 is $\bar{v}_{G2} = (v_3 + v_4)/2$.
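The per-job averaging described here can be sketched as below. The report format `(job_id, task_id, speed)`, the function name, and the numeric speeds are all hypothetical. The sketch assumes each task of a job runs on a distinct node, so averaging over reported tasks is the same as averaging over hosting nodes.

```python
from collections import defaultdict


def job_average_speeds(reports):
    """Average the reported task speeds per job.

    reports: iterable of (job_id, task_id, speed) tuples, as a master node
    might collect them from heartbeat messages (format is an assumption).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for job_id, _task_id, speed in reports:
        totals[job_id] += speed
        counts[job_id] += 1
    return {job_id: totals[job_id] / counts[job_id] for job_id in totals}


# Tasks a and c belong to job G1; tasks b and d belong to job G2.
reports = [("G1", "a", 4.0), ("G1", "c", 2.0), ("G2", "b", 3.0), ("G2", "d", 5.0)]
averages = job_average_speeds(reports)
```

With these made-up speeds, G1 averages 3.0 and G2 averages 4.0.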

Building on the average processing speed of each job calculated above, the computing capability of each computing node can be calculated by the following formula: [formula not reproduced], where $v_1, \dots, v_m$ denote the processing speeds of the computing tasks on the target computing node, [symbol not reproduced] denotes the average processing speeds of the jobs to which those tasks respectively belong, and $m$ denotes the number of computing tasks on the computing node. For example, the computing capability of computing node A is [formula not reproduced].
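The capability formula itself is not reproduced in this text, so the following sketch rests on an explicit assumption: each task's speed is divided by the average speed of its job, and the node's capability is the mean of those ratios over its $m$ tasks. The function name and sample numbers are hypothetical, and the patent's actual formula may differ.

```python
def node_capability(task_speeds, job_averages):
    """Mean of normalized task speeds on one node (ASSUMED formula).

    task_speeds: list of (job_id, speed) for the tasks running on the node.
    job_averages: dict mapping job_id to that job's average processing speed.
    """
    if not task_speeds:
        raise ValueError("node has no running tasks to evaluate")
    ratios = [speed / job_averages[job_id] for job_id, speed in task_speeds]
    return sum(ratios) / len(ratios)


# Node A runs task a (job G1, speed 4.0) and task b (job G2, speed 3.0).
job_averages = {"G1": 3.0, "G2": 4.0}
capability_a = node_capability([("G1", 4.0), ("G2", 3.0)], job_averages)
```

Normalizing by the job average is what makes tasks from different job types comparable on one node; a value above 1 means the node runs its tasks faster than the job-wide average under this assumed definition.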

Optionally, the master node can send the calculated average processing speed of each job to the computing nodes, so that each computing node calculates its own computing capability value and reports it to the master node. Alternatively, the master node can calculate the computing capability value of each computing node itself from the processing speeds reported by the computing nodes and the job averages it has calculated.

With the technical solution above, this embodiment provides a method and a metric for evaluating the computing capability of a computing node. The calculation takes into account both the heterogeneity of different job types and the heterogeneity of computing nodes: it considers the processing speeds of different computing nodes and, because jobs of different types are not directly comparable, normalizes them using [expression not reproduced]. As a result, the computing capability of a computing node is reflected more reasonably.

102. Arrange the computing nodes into a slow node queue in descending order of computing capability value, select the first M computing nodes in the slow node queue, each selected computing node serving as a target computing node, and obtain the estimated computing capability value of each target computing node after it loads the slow task to be scheduled.

Optionally, the master node can start slow-task scheduling when it receives a request for computing tasks from a computing node in the MapReduce system.

Once the master node has the processing speeds of the computing tasks and the computing capabilities of the computing nodes, it can maintain two queues: a slow task queue and a slow node queue. In the slow task queue, the master node orders the computing tasks currently being processed in the MapReduce system by processing speed. In the slow node queue, the master node orders the computing nodes by computing capability, for example in descending order of computing capability value. In this embodiment, the master node can take the last 10% of computing tasks, counted from the tail of the slow task queue, as slow tasks. In a specific implementation this ratio can vary; for example, the variance of task completion speed can be measured, and if the variance is small the ratio can be reduced accordingly.
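Selecting slow tasks from the tail of the slow task queue might look like the sketch below. The function name, the dictionary input, and rounding the tail size up are assumptions; the text only states that the last 10% of the queue is taken and that the ratio may be tuned.

```python
import math


def pick_slow_tasks(task_speeds, fraction=0.10):
    """Return the task ids in the last `fraction` of the slow task queue.

    task_speeds: dict mapping task id to processing speed.
    The queue is ordered fastest first, so its tail holds the slowest tasks.
    Rounding the tail size up is an assumption of this sketch.
    """
    queue = sorted(task_speeds, key=task_speeds.get, reverse=True)
    count = math.ceil(len(queue) * fraction)
    return queue[len(queue) - count:]


# Twenty hypothetical tasks; t1 is the slowest and t20 the fastest.
speeds = {f"t{i}": float(i) for i in range(1, 21)}
slow = pick_slow_tasks(speeds)
```

Here the last 10% of a 20-task queue is the two slowest tasks, `t2` and `t1`.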

In this embodiment, the first M computing nodes in the slow node queue are selected, and each selected computing node serves as a target computing node, where M is a natural number. For example, in a specific implementation the first two or first three nodes in the slow node queue may be selected; the number can be set freely, but nodes near the head of the queue should be preferred. The slow task is to be scheduled to the M computing nodes simultaneously, and all M target computing nodes are nodes planned to load the slow task. Before the slow task is scheduled to a target computing node, the computing capability of that node after loading the slow task is estimated in advance.

Specifically, the computing capability of the target computing node is [formula not reproduced], where $v_i$ denotes the processing speed of the slow task to be scheduled on the target computing node, [symbol not reproduced] denotes the average processing speed of the job to which that slow task belongs, $v_1, \dots, v_m$ denote the processing speeds of the computing tasks on the target computing node other than the slow task to be scheduled, [symbol not reproduced] denotes the average processing speeds of the jobs to which those other computing tasks respectively belong, and $m$ denotes the number of computing tasks on the target computing node other than the slow task to be scheduled.

For example, if computing task c is planned to be loaded onto computing node A, the estimated computing capability of computing node A is [formula not reproduced].
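The estimate for a node after it loads a slow task can be sketched under the same kind of assumption, since the formula is not reproduced in this text: the candidate task's normalized speed is added to the node's existing normalized speeds and the mean is taken over $m + 1$ tasks. How the candidate's speed $v_i$ on the target node is predicted is not specified here, so the caller supplies it. All names and numbers are hypothetical.

```python
def estimated_capability(existing_tasks, candidate_task, job_averages):
    """Estimate node capability after loading one more task (ASSUMED formula).

    existing_tasks: list of (job_id, speed) already running on the target node.
    candidate_task: (job_id, speed) for the slow task to be scheduled; the
        speed value is a caller-supplied prediction.
    job_averages: dict mapping job_id to that job's average processing speed.
    """
    ratios = [speed / job_averages[job_id] for job_id, speed in existing_tasks]
    candidate_job, candidate_speed = candidate_task
    ratios.append(candidate_speed / job_averages[candidate_job])
    return sum(ratios) / len(ratios)


job_averages = {"G1": 3.0, "G2": 4.0}
node_a_tasks = [("G1", 4.0), ("G2", 3.0)]           # tasks a and b on node A
current = sum(s / job_averages[j] for j, s in node_a_tasks) / len(node_a_tasks)
estimate = estimated_capability(node_a_tasks, ("G1", 2.0), job_averages)  # load task c
```

With these made-up numbers the estimate is lower than the current value, which is the drop in capability that the later comparison against the evaluation reference node is meant to bound.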

103. Arrange the target computing nodes and the computing nodes other than the target computing nodes into a new slow node queue in descending order of computing capability value.

Because the computing capabilities of the target computing nodes in the slow node queue were re-estimated in step 102, this step re-orders the slow node queue according to the estimated computing capabilities of the target computing nodes and the computing capability values of the other computing nodes.

104. When the estimated computing capability value of the target computing node is greater than the computing capability value of the evaluation reference node, schedule the slow task to be scheduled to the target computing node.

In this embodiment, the Nth computing node counted forward from the tail of the slow node queue re-ordered in step 103 is preset as the evaluation reference node, where N is a natural number. For example, N can be chosen so that the ratio of N to the total number of computing nodes in the MapReduce system is 10%. Suppose the MapReduce system has 100 computing nodes. In the slow node queue formed by these 100 nodes, the nodes toward the tail have lower computing capability and slower processing speed and can be called slow nodes; the 10th computing node counted forward from the tail is the evaluation reference node.

If the estimated computing capability value of the target computing node is greater than that of the evaluation reference node, then loading the slow task onto the target computing node would lower its computing capability, but not into the last 10% of the slow node queue. This in turn indicates that the target computing node is capable enough that the loaded slow task will not become a new slow task on it, so the slow task is scheduled to the target computing node. Otherwise, loading the slow task would severely reduce the target computing node's computing capability and would inevitably produce a slow task, so the slow task is not scheduled to that node.
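Steps 103 and 104 can be sketched together: replace the target node's capability with its estimate, re-sort the slow node queue in descending order, take the Nth node from the tail as the evaluation reference node, and compare. The function name, the dictionary layout, and the sample values are hypothetical.

```python
def should_schedule(capabilities, target, estimated, n_from_tail):
    """Decide whether the slow task may be scheduled to `target`.

    capabilities: dict mapping node id to current computing capability value.
    target: id of the target computing node.
    estimated: target's estimated capability after loading the slow task.
    n_from_tail: N, counted forward from the tail of the re-sorted queue.
    """
    if not 1 <= n_from_tail <= len(capabilities):
        raise ValueError("n_from_tail must be between 1 and the node count")
    updated = dict(capabilities)
    updated[target] = estimated
    # New slow node queue: descending capability, slowest nodes at the tail.
    queue = sorted(updated, key=updated.get, reverse=True)
    reference = queue[-n_from_tail]
    return updated[target] > updated[reference]


caps = {"A": 1.6, "B": 1.2, "C": 1.0, "D": 0.8, "E": 0.5}
accept = should_schedule(caps, "A", estimated=1.1, n_from_tail=2)
reject = should_schedule(caps, "A", estimated=0.7, n_from_tail=2)
```

In the first call node A stays ahead of the reference node D, so the task is scheduled. In the second call A itself falls to the reference position, the strict comparison fails, and the task is not scheduled.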

By estimating the computing capability of the target computing node before it loads the slow task, the method can predict how much the node's capability will drop after loading, determine whether a slow task would still result, and stop in time any scheduling that would produce a slow task. This prevents slow tasks from being scheduled repeatedly, which effectively suppresses jitter and improves the performance of the MapReduce system. Because unreasonable slow-task scheduling is stopped in time, the resources consumed by slow-task scheduling are also reduced. In addition, by using the slow node queue as the basis for scheduling and re-ordering it according to the estimated computing capability of nodes after they load slow tasks, the method can also control the proportion of slow tasks that are rescheduled.

The effect of the slow task scheduling method of the MapReduce system of this embodiment is illustrated below with a set of experimental data:

A simulation experiment was carried out for this embodiment to compare the slow task scheduling of Hadoop, LATE, and the scheme of this embodiment. The simulation environment is set as follows: the MapReduce system has one master control node and 50 computing nodes; following the Hadoop model, each computing node can handle 10 units of computing capability at the same time; the computing capabilities of the computing nodes are selected according to the distribution in Table 1 below. The simulation examines two technical indicators: the response time in which the MapReduce system completes its work, and the number of times jitter occurs in the MapReduce system (that is, the number of computing tasks that have been scheduled more than twice).

Table 1. Computing capability distribution of computing nodes

Physical machine type | Number of units
1 VM/host             | 202
2 VMs/host            | 264
3 VMs/host            | 201
4 VMs/host            | 140
5 VMs/host            | 45
6 VMs/host            | 12
7 VMs/host            | 7

Tables 2 and 3 below give the response times and the average number of jitter occurrences for the three models in this simulation experiment, where Berkeley denotes the LATE scheme and Patent denotes the scheme of this embodiment.

Table 2. Response time comparison (unit: tick)

Number of tasks | Hadoop  | Berkeley | Patent
1000            | 21749.4 | 19990.2  | 16814.6
2000            | 38381.2 | 38223.6  | 33608.8
3000            | 57929.4 | 55130.6  | 51802.3
4000            | 81571.7 | 73063.4  | 69270.5
5000            | 99111.3 | 94638.1  | 86405.8

Table 3. Average number of jitter occurrences

Number of tasks | Hadoop | Berkeley | Patent
1000            | 71.8   | 45.7     | 14.1
2000            | 77.9   | 47.5     | 24
3000            | 79.8   | 50.4     | 42.6
4000            | 90.3   | 57.7     | 45.1
5000            | 94.5   | 62.2     | 49.3

As Tables 2 and 3 show, the scheme of this embodiment outperforms both Hadoop and LATE in completion time and in jitter.
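To make the comparison concrete, the relative reductions can be computed directly from the figures in Tables 2 and 3. The short Python sketch below does only that arithmetic; the dictionary layout and the helper names `reduction_percent` and `reductions` are choices made for this example, and the numbers are copied from the two tables.

```python
# (Hadoop, Berkeley/LATE, Patent) per number of tasks, copied from Table 2.
RESPONSE_TIME = {
    1000: (21749.4, 19990.2, 16814.6),
    2000: (38381.2, 38223.6, 33608.8),
    3000: (57929.4, 55130.6, 51802.3),
    4000: (81571.7, 73063.4, 69270.5),
    5000: (99111.3, 94638.1, 86405.8),
}

# (Hadoop, Berkeley/LATE, Patent) per number of tasks, copied from Table 3.
JITTER = {
    1000: (71.8, 45.7, 14.1),
    2000: (77.9, 47.5, 24.0),
    3000: (79.8, 50.4, 42.6),
    4000: (90.3, 57.7, 45.1),
    5000: (94.5, 62.2, 49.3),
}


def reduction_percent(baseline, value):
    """Percentage by which value is lower than baseline."""
    return (baseline - value) / baseline * 100.0


def reductions(table):
    """For each task count, return the reduction of the Patent column
    relative to Hadoop and relative to Berkeley, in percent."""
    return {
        n: (reduction_percent(hadoop, patent), reduction_percent(berkeley, patent))
        for n, (hadoop, berkeley, patent) in table.items()
    }


if __name__ == "__main__":
    for name, table in (("response time", RESPONSE_TIME), ("jitter", JITTER)):
        for n, (vs_hadoop, vs_late) in reductions(table).items():
            print(f"{name}, {n} tasks: {vs_hadoop:.1f}% vs Hadoop, {vs_late:.1f}% vs LATE")
```

Every entry comes out positive, which is the numerical form of the statement that the scheme of this embodiment is better on both indicators at every task count in the tables.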

In the slow task scheduling method of the MapReduce system of this embodiment, before a slow task is scheduled to a target computing node, the computing capability value that the target computing node would have after loading the slow task to be scheduled is estimated, and the slow task is scheduled to the target computing node only when this estimated value is greater than the computing capability value of the evaluation reference node. This ensures that loading the slow task does not reduce the target computing node's computing capability too much, which avoids creating new slow tasks, prevents slow tasks from being scheduled multiple times, and effectively suppresses jitter during slow task scheduling.

Fig. 2 is a schematic structural diagram of an embodiment of the slow task scheduling device of the MapReduce system of the present invention. The device can execute the slow task scheduling method of any embodiment of the present invention. This embodiment only briefly describes the structure of the device; for its specific working principle, refer to the method embodiments.

As shown in Fig. 2, the slow task scheduling device of this embodiment may include a parameter acquisition unit 21, a capability estimation unit 22, a queue arrangement unit 23, and a scheduling processing unit 24, wherein:

The parameter acquisition unit 21 is configured to obtain the computing capability value of each computing node in the MapReduce system, the computing capability value of a computing node being

c = (v_{1} / \bar{v}_{1} + v_{2} / \bar{v}_{2} + \dots + v_{m} / \bar{v}_{m}) / m,

where v_1, ..., v_m denote the processing speeds of the computing tasks on the computing node, \bar{v}_1, ..., \bar{v}_m denote the average processing speeds of the jobs to which those computing tasks respectively belong, and m denotes the total number of computing tasks on the computing node;

The capability estimation unit 22 is configured to select the first M computing nodes in the slow node queue produced by the queue arrangement unit, each selected computing node serving as a target computing node, where M is a natural number; and to obtain, for each of the M target computing nodes, the computing capability value estimated for after the slow task to be scheduled is loaded, the estimated computing capability value being

\tilde{c} = (v_{1} / \bar{v}_{1} + v_{2} / \bar{v}_{2} + \dots + v_{m} / \bar{v}_{m} + v_{i} / \bar{v}_{i}) / (m + 1),

where v_i denotes the processing speed of the slow task to be scheduled on the target computing node, and \bar{v}_i denotes the average processing speed of the job to which that slow task belongs;

The queue arrangement unit 23 is configured to arrange the computing nodes into a slow node queue in descending order of computing capability value; and, according to the estimated computing capability value of the target computing node and the computing capability values of the computing nodes other than the target computing node, to arrange the target computing node and the other computing nodes into a new slow node queue in descending order of computing capability value;

The scheduling processing unit 24 is configured to preset the N-th computing node, counted forward from the tail of the slow node queue, as the evaluation reference node, where N is a natural number; and to schedule the slow task to be scheduled to the target computing node when the estimated computing capability value of the target computing node is greater than the computing capability value of the evaluation reference node.
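The cooperation of the four units can be sketched as follows. This is an illustrative Python sketch, not the patented implementation: the `Task` record, the function names, and the dictionary of nodes are hypothetical. It also assumes that every node runs at least one task, that the capability value is the mean of the per-task ratios of task speed to job average speed, and, for simplicity, that the slow task's expected processing speed is the same on every candidate node.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Task:
    speed: float          # v_k: processing speed of the task on its node
    job_avg_speed: float  # average processing speed of the task's job


def capability(tasks: List[Task]) -> float:
    """Unit 21: mean of v_k / vbar_k over the m tasks on a node (m >= 1)."""
    return sum(t.speed / t.job_avg_speed for t in tasks) / len(tasks)


def estimated_capability(tasks: List[Task], slow_task: Task) -> float:
    """Unit 22: capability value expected after loading slow_task,
    i.e. the same sum plus v_i / vbar_i, divided by m + 1."""
    total = sum(t.speed / t.job_avg_speed for t in tasks)
    return (total + slow_task.speed / slow_task.job_avg_speed) / (len(tasks) + 1)


def choose_node(nodes: Dict[str, List[Task]], slow_task: Task,
                m_first: int, n_from_tail: int) -> Optional[str]:
    """Return the name of the node that should receive slow_task, or None."""
    values = {name: capability(tasks) for name, tasks in nodes.items()}
    # Unit 23: slow node queue, largest capability value first.
    queue = sorted(values, key=values.get, reverse=True)
    for target in queue[:m_first]:  # Unit 22: the first M nodes are targets.
        estimate = estimated_capability(nodes[target], slow_task)
        # Unit 23: new queue with the target's estimate replacing its value.
        new_values = {**values, target: estimate}
        new_queue = sorted(new_values, key=new_values.get, reverse=True)
        # Unit 24: N-th node from the tail is the evaluation reference node.
        reference = new_queue[-n_from_tail]
        if estimate > new_values[reference]:
            return target
    return None


if __name__ == "__main__":
    cluster = {
        "a": [Task(2.0, 1.0)],
        "b": [Task(1.0, 1.0)],
        "c": [Task(0.5, 1.0)],
    }
    print(choose_node(cluster, Task(0.8, 1.0), m_first=1, n_from_tail=1))
```

Returning `None` models the case in which no candidate passes the check, so the slow task is not rescheduled in this round.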

Optionally, the parameter acquisition unit 21 may include a speed receiving subunit 211, an average processing subunit 212, and a capability acquisition subunit 213, wherein:

The speed receiving subunit 211 is configured to receive the processing speeds of computing tasks reported by the computing nodes in the MapReduce system, each computing task being a computing task processed on the reporting computing node;

The average processing subunit 212 is configured to obtain, according to the processing speeds, the average processing speed of the job to which the computing task belongs, the average processing speed being

\bar{v} = (v_{1} + v_{2} + \dots + v_{n}) / n,

where v_1, ..., v_n denote the processing speeds of the computing tasks into which the job is split, and n denotes the total number of computing nodes on which those computing tasks are located;

The capability acquisition subunit 213 is configured to send the average processing speed of the job to each computing node and to receive the computing capability value reported by each computing node, which that node obtains according to the average processing speed of the job; or, alternatively, to obtain the computing capability value of each computing node itself according to the average processing speed of the job.
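The second alternative of subunit 213, in which the capability values are computed centrally from the reported speeds, might look like the sketch below. It is only an illustration: the `(node, job, speed)` report format and both function names are hypothetical, the job average is assumed to be the arithmetic mean of the reported speeds, and each node is assumed to run at most one task of a given job so that the number of reports per job equals the number of nodes n.

```python
from collections import defaultdict


def job_average_speeds(reports):
    """Subunit 212: average processing speed of each job.

    reports: list of (node, job, speed) tuples, one per running task,
             as received by subunit 211 (hypothetical format).
    """
    speeds = defaultdict(list)
    for _node, job, speed in reports:
        speeds[job].append(speed)
    return {job: sum(values) / len(values) for job, values in speeds.items()}


def node_capabilities(reports, averages):
    """Subunit 213, central alternative: for each node, the mean over its
    tasks of (task speed / average speed of the task's job)."""
    ratios = defaultdict(list)
    for node, job, speed in reports:
        ratios[node].append(speed / averages[job])
    return {node: sum(values) / len(values) for node, values in ratios.items()}


if __name__ == "__main__":
    reports = [("n1", "j1", 2.0), ("n2", "j1", 4.0), ("n1", "j2", 3.0)]
    averages = job_average_speeds(reports)
    print(averages)
    print(node_capabilities(reports, averages))
```

In the first alternative the same ratio calculation would run on each computing node after it receives the job averages, and only the resulting capability value would be reported back.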

Optionally, the speed receiving subunit 211 is specifically configured to receive, when the type of the computing task is the Map type, the processing speed of the Map-type computing task as calculated by each computing node, the processing speed being v = p/t, where p is the amount of data of the computing task that the computing node has processed so far, and t is the time taken to process that amount of data.
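As a small illustration of v = p/t, a computing node could derive the speed it reports as shown below. The function name `map_task_speed` and the choice of units are assumptions for this example; the text does not fix them.

```python
def map_task_speed(processed_amount, elapsed_time):
    """Processing speed v = p / t of a Map-type computing task.

    processed_amount: p, the amount of input data processed so far
                      (unit assumed here, e.g. bytes).
    elapsed_time:     t, the time spent processing that amount of data
                      (unit assumed here, e.g. ticks).
    """
    if elapsed_time <= 0:
        # No meaningful speed can be reported before any time has elapsed.
        raise ValueError("elapsed_time must be positive")
    return processed_amount / elapsed_time


if __name__ == "__main__":
    print(map_task_speed(64.0, 4.0))
```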

Optionally, the device of this embodiment may further include a scheduling trigger unit 25, configured to receive a message requesting a computing task, sent by a computing node in the MapReduce system, before the computing capability value estimated for the target computing node in the MapReduce system after loading the slow task to be scheduled is obtained.

In the slow task scheduling device of the MapReduce system of this embodiment, before a slow task is scheduled to a target computing node, the computing capability value that the target computing node would have after loading the slow task to be scheduled is estimated, and the slow task is scheduled to the target computing node only when this estimated value is greater than the computing capability value of the evaluation reference node. This ensures that loading the slow task does not reduce the target computing node's computing capability too much, which avoids creating new slow tasks, prevents slow tasks from being scheduled multiple times, and effectively suppresses jitter during slow task scheduling.

Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks, or optical discs.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A slow task scheduling method for a MapReduce system, characterized by comprising:

obtaining the computing capability value of each computing node in the MapReduce system, the computing capability value of each computing node being

c = (v_{1} / \bar{v}_{1} + v_{2} / \bar{v}_{2} + \dots + v_{m} / \bar{v}_{m}) / m,

where v_1, ..., v_m denote the processing speeds of the computing tasks on the computing node, \bar{v}_1, ..., \bar{v}_m denote the average processing speeds of the jobs to which the computing tasks respectively belong, and m denotes the total number of computing tasks on the computing node;

arranging the computing nodes into a slow node queue in descending order of computing capability value, and selecting the first M computing nodes in the slow node queue, each selected computing node serving as a target computing node, where M is a natural number;

obtaining, for each of the M target computing nodes, the computing capability value estimated for after the slow task to be scheduled is loaded, the estimated computing capability value being

\tilde{c} = (v_{1} / \bar{v}_{1} + v_{2} / \bar{v}_{2} + \dots + v_{m} / \bar{v}_{m} + v_{i} / \bar{v}_{i}) / (m + 1),

where v_i denotes the processing speed of the slow task to be scheduled on the target computing node, and \bar{v}_i denotes the average processing speed of the job to which the slow task to be scheduled on the target computing node belongs;

arranging, according to the estimated computing capability value of the target computing node and the computing capability values of the computing nodes other than the target computing node, the target computing node and the computing nodes other than the target computing node into a new slow node queue in descending order of computing capability value, and presetting the N-th computing node, counted forward from the tail of the slow node queue, as an evaluation reference node, where N is a natural number; and

scheduling the slow task to be scheduled to the target computing node when the estimated computing capability value of the target computing node is greater than the computing capability value of the evaluation reference node.

2. The slow task scheduling method for a MapReduce system according to claim 1, characterized in that obtaining the computing capability value of each computing node in the MapReduce system comprises:

receiving the processing speeds of computing tasks reported by the computing nodes in the MapReduce system, each computing task being a computing task processed on the reporting computing node;

obtaining, according to the processing speeds, the average processing speed of the job to which the computing task belongs, the average processing speed being

\bar{v} = (v_{1} + v_{2} + \dots + v_{n}) / n,

where v_1, ..., v_n denote the processing speeds of the computing tasks into which the job is split, and n denotes the total number of computing nodes on which those computing tasks are located; and

sending the average processing speed of the job to each computing node, and receiving the computing capability value reported by each computing node, which that computing node obtains according to the average processing speed of the job.

3. The slow task scheduling method for a MapReduce system according to claim 1, characterized in that obtaining the computing capability value of each computing node in the MapReduce system comprises:

obtaining, according to the processing speeds, the average processing speed of the job to which the computing task belongs, the average processing speed being

\bar{v} = (v_{1} + v_{2} + \dots + v_{n}) / n,

where v_1, ..., v_n denote the processing speeds of the computing tasks into which the job is split, and n denotes the total number of computing nodes on which those computing tasks are located; and

obtaining the computing capability value of each computing node according to the average processing speed of the job.

4. The slow task scheduling method for a MapReduce system according to any one of claims 1 to 3, characterized in that, before obtaining the computing capability value estimated for the target computing node after the slow task to be scheduled is loaded, the method further comprises:

receiving a message requesting a computing task, sent by a computing node in the MapReduce system.

5. The slow task scheduling method for a MapReduce system according to claim 1, characterized in that the ratio of N to the total number of computing nodes in the MapReduce system is 10%.

6. A slow task scheduling device for a MapReduce system, characterized by comprising: a parameter acquisition unit, a capability estimation unit, a queue arrangement unit, and a scheduling processing unit;

the parameter acquisition unit being configured to obtain the computing capability value of each computing node in the MapReduce system, the computing capability value of each computing node being

c = (v_{1} / \bar{v}_{1} + v_{2} / \bar{v}_{2} + \dots + v_{m} / \bar{v}_{m}) / m,

where v_1, ..., v_m denote the processing speeds of the computing tasks on the computing node, \bar{v}_1, ..., \bar{v}_m denote the average processing speeds of the jobs to which the computing tasks respectively belong, and m denotes the total number of computing tasks on the computing node;

the capability estimation unit being configured to select the first M computing nodes in the slow node queue produced by the queue arrangement unit, each selected computing node serving as a target computing node, where M is a natural number; and to obtain, for each of the M target computing nodes, the computing capability value estimated for after the slow task to be scheduled is loaded, the estimated computing capability value being

\tilde{c} = (v_{1} / \bar{v}_{1} + v_{2} / \bar{v}_{2} + \dots + v_{m} / \bar{v}_{m} + v_{i} / \bar{v}_{i}) / (m + 1);

the queue arrangement unit being configured to arrange the computing nodes into a slow node queue in descending order of computing capability value; and, according to the estimated computing capability value of the target computing node and the computing capability values of the computing nodes other than the target computing node, to arrange the target computing node and the computing nodes other than the target computing node into a new slow node queue in descending order of computing capability value; and

the scheduling processing unit being configured to preset the N-th computing node, counted forward from the tail of the slow node queue, as an evaluation reference node, where N is a natural number; and to schedule the slow task to be scheduled to the target computing node when the estimated computing capability value of the target computing node is greater than the computing capability value of the evaluation reference node.

7. The slow task scheduling device for a MapReduce system according to claim 6, characterized in that the parameter acquisition unit comprises:

a speed receiving subunit, configured to receive the processing speeds of computing tasks reported by the computing nodes in the MapReduce system, each computing task being a computing task processed on the reporting computing node;

an average processing subunit, configured to obtain, according to the processing speeds, the average processing speed of the job to which the computing task belongs, the average processing speed being

\bar{v} = (v_{1} + v_{2} + \dots + v_{n}) / n,

where v_1, ..., v_n denote the processing speeds of the computing tasks into which the job is split, and n denotes the total number of computing nodes on which those computing tasks are located; and

a capability acquisition subunit, configured to send the average processing speed of the job to each computing node and to receive the computing capability value reported by each computing node, which that computing node obtains according to the average processing speed of the job; or configured to obtain the computing capability value of each computing node according to the average processing speed of the job.

8. The slow task scheduling device for a MapReduce system according to claim 6 or 7, characterized by further comprising:

a scheduling trigger unit, configured to receive a message requesting a computing task, sent by a computing node in the MapReduce system, before the computing capability value estimated for the target computing node in the MapReduce system after loading the slow task to be scheduled is obtained.