[go: up one dir, main page]

CN104935523B - The processing method and equipment of a kind of load balancing - Google Patents

The processing method and equipment of a kind of load balancing Download PDF

Info

Publication number
CN104935523B
CN104935523B CN201410108066.2A CN201410108066A CN104935523B CN 104935523 B CN104935523 B CN 104935523B CN 201410108066 A CN201410108066 A CN 201410108066A CN 104935523 B CN104935523 B CN 104935523B
Authority
CN
China
Prior art keywords
task
migration
migrated
time information
working node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410108066.2A
Other languages
Chinese (zh)
Other versions
CN104935523A (en
Inventor
邓超
郭磊涛
钱岭
孙少陵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201410108066.2A priority Critical patent/CN104935523B/en
Publication of CN104935523A publication Critical patent/CN104935523A/en
Application granted granted Critical
Publication of CN104935523B publication Critical patent/CN104935523B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

本发明公开了一种负载均衡的处理方法和设备,包括:获取一个工作节点的一个迭代周期内每一个任务的运行时间信息;根据获取的每一个任务的运行时间信息以及待迁移任务的分析策略,确定工作节点在迭代周期内的待迁移任务;当确定为待迁移任务的次数超过设定数值时,将待迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点,这样利用任务迁移方式,在工作点的多次迭代操作中执行一次任务迁移,有效地实现工作点间任务负载均衡,避免了Giraph中负载均衡策略存在时间消耗的问题,同时也避免了Hadoop中推测式执行策略存在增加消息通信网络开销的问题,提高了BSP模型中任务迁移的效率,提升了系统的负载均衡性。

The invention discloses a processing method and equipment for load balancing, comprising: obtaining the running time information of each task in an iteration cycle of a working node; according to the obtained running time information of each task and an analysis strategy for the task to be migrated , determine the task to be migrated by the working node within the iteration period; when the number of tasks determined to be migrated exceeds the set value, migrate the task to be migrated to a task other than the working node whose task processing volume is less than the set threshold In this way, the task migration method is used to perform a task migration in multiple iterations of the work point, effectively achieving task load balancing between the work points, avoiding the problem of time consumption in the load balancing strategy in Giraph, and also avoiding Hadoop The speculative execution strategy in the middle has the problem of increasing the network overhead of message communication, which improves the efficiency of task migration in the BSP model and improves the load balance of the system.

Description

一种负载均衡的处理方法和设备A processing method and device for load balancing

技术领域technical field

本发明涉及无线通信技术领域,尤其涉及一种负载均衡的处理方法和设备。The present invention relates to the technical field of wireless communication, in particular to a load balancing processing method and equipment.

背景技术Background technique

BSP(Bulk-Synchronous Parallel,大容量同步并行)模型是基于大数据迭代处理的模型,有别于Map Reduce模型。在BSP模型中数据处理被分为若干任务,每个任务经历着同样的迭代阶段:同步阶段、计算阶段和消息通信阶段。The BSP (Bulk-Synchronous Parallel) model is a model based on iterative processing of big data, which is different from the Map Reduce model. In the BSP model, data processing is divided into several tasks, and each task goes through the same iterative phase: synchronization phase, calculation phase and message communication phase.

对于BSP模型中并行执行的多任务,动态负载均衡是并行计算中的一个关键点。但是,BSP模型在并行环境中高频迭代场景下,不可避免地存在负载不均衡的情况。For the multitasks executed in parallel in the BSP model, dynamic load balancing is a key point in parallel computing. However, the BSP model inevitably has unbalanced load in the high-frequency iteration scenario in the parallel environment.

目前针对负载不均衡这一个问题提出了两种解决方案:一种是Giraph中的负载均衡策略;另一种是Hadoop中的推测式执行策略。At present, two solutions are proposed for the problem of load imbalance: one is the load balancing strategy in Giraph; the other is the speculative execution strategy in Hadoop.

其中,Giraph中的负载均衡策略,即为数据的动态划分策略。在进行数据的动态划分时采用三种处理方式:第一种是静态划分;第二种是根据边数平衡;第三种是根据顶点数平衡。Among them, the load balancing strategy in Giraph is the dynamic data partitioning strategy. Three processing methods are used in the dynamic division of data: the first is static division; the second is balanced according to the number of edges; the third is balanced according to the number of vertices.

具体地,使用一种简单的启发式算法实现工作点之间数据的动态划分均衡。包括:第一步,对所有的数据分片按照平衡值由大到小进行排序;第二步,根据各个工作点的总平衡值,将工作点进行排序形成工作点堆;第三步,将数据分片从大到小依次放入工作点堆的堆顶中,每放完一个数据分片后,并将放了数据分片的工作点再次放入工作点堆中,这样保证每次将数据分片放入已用容量最小的工作点中;第四步,返回平衡后的数据分片的信息列表;第五步,根据返回的平衡后的数据分片的信息列表,进行数据的动态划分迁移。Specifically, a simple heuristic algorithm is used to realize the dynamic division and balance of data among working points. Including: the first step, sort all the data fragments according to the balance value from large to small; the second step, sort the work points according to the total balance value of each work point to form a work point pile; the third step, the The data fragments are put into the top of the work point heap in order from large to small. After each data fragment is placed, the work point with the data fragment placed is put into the work point heap again, so as to ensure that each time the The data fragmentation is placed in the working point with the smallest used capacity; the fourth step is to return the information list of the balanced data fragmentation; the fifth step is to perform data dynamics according to the returned information list of the balanced data fragmentation Divide the migration.

Hadoop中的推测式执行策略,主要是Hadoop检测一次作业中存在的慢任务,并分别对检测到的慢任务启动Map任务进行推测式执行,最后从慢任务中选取执行最快的任务的输出作为Reduce任务的输入。The speculative execution strategy in Hadoop is mainly that Hadoop detects the slow tasks that exist in a job, and performs speculative execution on the detected slow tasks to start the Map task respectively, and finally selects the output of the fastest-executing task from the slow tasks as The input of the Reduce task.

综上所述,在BSP模型中采用Giraph中的负载均衡策略,每次迭代计算前都需要进行数据迁移,这样调整需要花费大量的时间作为负载均衡的代价;在BSP模型中采用Hadoop中的推测式执行策略,由于推测任务从0超步开始,与对应的慢任务属于不同超步,无法进行当前超步的消息通信,因此,需要增加消息通信网络开销。To sum up, the load balancing strategy in Giraph is used in the BSP model, and data migration is required before each iterative calculation, so that the adjustment takes a lot of time as the cost of load balancing; the speculation in Hadoop is used in the BSP model Since the speculative task starts from superstep 0, which is different from the corresponding slow task, the message communication of the current superstep cannot be carried out. Therefore, the network overhead of message communication needs to be increased.

发明内容Contents of the invention

有鉴于此,本发明实施例提供了一种负载均衡的处理方法和设备,用于解决目前在BSP模型中采用Giraph中的负载均衡策略存在消耗时间的问题,以及在在BSP模型中采用Hadoop中的推测式执行策略存在增加消息通信网络开销的问题。In view of this, the embodiment of the present invention provides a load balancing processing method and equipment, which are used to solve the time-consuming problem of using the load balancing strategy in Giraph in the BSP model at present, and the problem of using Hadoop in the BSP model The speculative execution strategy of , has the problem of increasing the network overhead of message communication.

一种负载均衡的处理方法,包括:A processing method for load balancing, comprising:

获取一个工作节点的一个迭代周期内每一个任务的运行时间信息;Obtain the running time information of each task in an iteration cycle of a worker node;

根据获取的所述每一个任务的运行时间信息以及待迁移任务的分析策略,确定所述工作节点在所述迭代周期内的待迁移任务;According to the obtained running time information of each task and the analysis strategy of the task to be migrated, determine the task to be migrated of the working node within the iteration period;

当确定为待迁移任务的次数超过设定数值时,将所述待迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点。When it is determined that the number of tasks to be migrated exceeds a set value, the task to be migrated is migrated to a working node other than the working node whose task processing volume is less than a set threshold.

根据获取的所述每一个任务的运行时间信息以及待迁移任务的分析策略,确定所述工作节点在所述迭代周期内的待迁移任务,包括:According to the obtained running time information of each task and the analysis strategy of the task to be migrated, determining the task to be migrated of the working node in the iteration cycle includes:

根据获取的所述每一个任务的运行时间信息,将所述每一个任务的运行时间信息进行排序;sorting the running time information of each task according to the acquired running time information of each task;

利用得到的排序信息,确定所述工作节点在所述迭代周期内的准迁移任务;并Using the obtained sorting information, determine the quasi-migration tasks of the worker nodes within the iteration period; and

利用迁移代价模型,计算确定的准迁移任务的迁移收益值;Using the migration cost model, calculate the migration benefit value of the determined quasi-migration task;

对于迁移收益值大于设定门限值的准迁移任务,当迁移收益值大于设定门限值的准迁移任务所在的工作节点在设定时间内的任务处理量大于设定阈值时,确定迁移收益值大于设定门限值的准迁移任务为所述工作节点在所述迭代周期内的待迁移任务。For quasi-migration tasks whose migration income value is greater than the set threshold value, when the task processing volume of the working node where the migration income value is greater than the set threshold value is greater than the set threshold value within the set time, the migration is determined. The quasi-migration tasks whose revenue value is greater than the set threshold value are tasks to be migrated by the working node within the iteration period.

利用得到的排序信息,确定所述工作节点在所述迭代周期内的准迁移任务,包括:Using the obtained sorting information, determine the quasi-migration tasks of the working nodes in the iteration period, including:

针对排序信息中的每一个运行时间信息,在确定运行时间信息大于设定运行时间信息时,确定大于设定运行时间信息对应的任务为所述工作节点在所述迭代周期内的准迁移任务;For each run time information in the sorting information, when it is determined that the run time information is greater than the set run time information, determine that the task corresponding to the set run time information is the quasi-migration task of the working node within the iteration period;

其中,所述设定运行时间信息通过以下方式确定:Wherein, the set running time information is determined in the following manner:

T=T2+(T2-T1)*1.5;T为设定运行时间信息,T2为根据得到的排序信息,确定排序信息中四分之三位处对应的时间信息;T1为根据得到的排序信息,确定排序信息中四分之一位处对应的运行时间信息。T=T 2 +(T 2 -T 1 )*1.5; T is the set running time information, T 2 is the time information corresponding to three-quarters of the sorting information determined according to the obtained sorting information; T 1 is According to the obtained sorting information, the running time information corresponding to a quarter position in the sorting information is determined.

利用迁移代价模型,计算确定的准迁移任务的迁移收益值,包括:Using the migration cost model, calculate the migration benefit value of the determined quasi-migration task, including:

通过以下方式计算确定的准迁移任务的迁移收益值:Calculate the migration benefit value for the identified quasi-migration tasks by:

G(T)=T.remainSuperStep*(T.runTime-avgRunTime)-T.migrateCostG(T)= T.remainSuperStep *( T.runTime -avgRunTime) -T.migrateCost ;

其中,G(T)为确定的准迁移任务的迁移收益值,T.remainSuperStep为确定的准迁移任务的剩余超步运行时间信息,T.runTime为确定的准迁移任务的运行时间信息,avgRunTime为所述工作节点在所述迭代周期内非准迁移任务的平均运行时间信息,T.migrateCost为确定的准迁移任务的迁移代价时间信息,等于数据加载的时间长度与消息读或者写的时间长度之和。Among them, G(T) is the migration benefit value of the determined quasi-migration task, T.remainSuperStep is the remaining superstep running time information of the determined quasi-migration task, T.runTime is the running time information of the determined quasi-migration task, and avgRunTime is The average running time information of the non-quasi-migration task of the working node in the iteration cycle, T. migrateCost is the migration cost time information of the determined quasi-migration task, which is equal to the time length of data loading and the time length of message reading or writing and.

当确定为待迁移任务的次数超过设定数值时,将所述待迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点,包括:When it is determined that the number of tasks to be migrated exceeds a set value, migrating the task to be migrated to a working node other than the working node whose task processing volume is less than a set threshold includes:

判断所述工作节点在连续N个迭代周期内是否存在确定为待迁移任务的次数超过设定数值的待迁移任务;Judging whether the working node has a task to be migrated that is determined to be a task to be migrated for a number of times exceeding a set value within N consecutive iteration cycles;

当存在确定为待迁移任务的次数超过设定数值的待迁移任务时,判定确定为待迁移任务的次数超过设定数值的待迁移任务为迁移任务,并将判定的所述迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点中。When there is a task to be migrated whose number of times is determined to be a task to be migrated exceeds a set value, it is determined that a task to be migrated whose number of times is determined to be a task to be migrated exceeds a set value is a migration task, and the determined migration task is migrated to a location other than In the working nodes other than the working nodes, the task processing volume is smaller than the set threshold.

一种负载均衡的处理设备,包括:A load balancing processing device comprising:

获取模块,用于获取一个工作节点的一个迭代周期内每一个任务的运行时间信息;An acquisition module, configured to acquire the running time information of each task in an iteration cycle of a working node;

确定模块,用于根据获取的所述每一个任务的运行时间信息以及待迁移任务的分析策略,确定所述工作节点在所述迭代周期内的待迁移任务;A determining module, configured to determine the task to be migrated of the working node within the iteration period according to the obtained running time information of each task and the analysis strategy of the task to be migrated;

迁移模块,用于当确定为待迁移任务的次数超过设定数值时,将所述待迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点。The migration module is configured to migrate the task to be migrated to a working node other than the working node whose task processing volume is less than a set threshold when it is determined that the number of times of the task to be migrated exceeds a set value.

所述确定模块,具体用于根据获取的所述每一个任务的运行时间信息,将所述每一个任务的运行时间信息进行排序;利用得到的排序信息,确定所述工作节点在所述迭代周期内的准迁移任务;并利用迁移代价模型,计算确定的准迁移任务的迁移收益值;对于迁移收益值大于设定门限值的准迁移任务,当迁移收益值大于设定门限值的准迁移任务所在的工作节点在设定时间内的任务处理量大于设定阈值时,确定迁移收益值大于设定门限值的准迁移任务为所述工作节点在所述迭代周期内的待迁移任务。The determining module is specifically configured to sort the running time information of each task according to the acquired running time information of each task; use the obtained sorting information to determine that the working node is in the iterative cycle and use the migration cost model to calculate the migration income value of the determined quasi-migration task; for the quasi-migration task whose migration income value is greater than the set threshold value, when the migration income value is greater than the set threshold value of the quasi-migration task When the task processing volume of the working node where the migration task is located is greater than the set threshold within the set time, it is determined that the quasi-migration task whose migration benefit value is greater than the set threshold is the task to be migrated by the working node within the iteration period .

所述确定模块,具体用于针对排序信息中的每一个运行时间信息,在确定运行时间信息大于设定运行时间信息时,确定大于设定运行时间信息对应的任务为所述工作节点在所述迭代周期内的准迁移任务;The determining module is specifically configured to, for each piece of running time information in the sorting information, when it is determined that the running time information is greater than the set running time information, determine that the task corresponding to the greater than the set running time information is the task of the working node in the quasi-migration tasks within the iterative cycle;

其中,所述设定运行时间信息通过以下方式确定:Wherein, the set running time information is determined in the following manner:

T=T2+(T2-T1)*1.5;T为设定运行时间信息,T2为根据得到的排序信息,确定排序信息中四分之三位处对应的时间信息;T1为根据得到的排序信息,确定排序信息中四分之一位处对应的运行时间信息。T=T 2 +(T 2 -T 1 )*1.5; T is the set running time information, T 2 is the time information corresponding to three-quarters of the sorting information determined according to the obtained sorting information; T 1 is According to the obtained sorting information, the running time information corresponding to a quarter position in the sorting information is determined.

所述确定模块,具体用于通过以下方式计算确定的准迁移任务的迁移收益值:The determination module is specifically used to calculate the migration benefit value of the determined quasi-migration task in the following manner:

G(T)=T.remainSuperStep*(T.runTime-avgRunTime)-T.migrateCostG(T)= T.remainSuperStep *( T.runTime -avgRunTime) -T.migrateCost ;

其中,G(T)为确定的准迁移任务的迁移收益值,T.remainSuperStep为确定的准迁移任务的剩余超步运行时间信息,T.runTime为确定的准迁移任务的运行时间信息,avgRunTime为所述工作节点在所述迭代周期内非准迁移任务的平均运行时间信息,T.migrateCost为确定的准迁移任务的迁移代价时间信息,等于数据加载的时间长度与消息读或者写的时间长度之和。Among them, G(T) is the migration benefit value of the determined quasi-migration task, T.remainSuperStep is the remaining superstep running time information of the determined quasi-migration task, T.runTime is the running time information of the determined quasi-migration task, and avgRunTime is The average running time information of the non-quasi-migration task of the working node in the iteration cycle, T. migrateCost is the migration cost time information of the determined quasi-migration task, which is equal to the time length of data loading and the time length of message reading or writing and.

所述迁移模块,具体用于判断所述工作节点在连续N个迭代周期内是否存在确定为待迁移任务的次数超过设定数值的待迁移任务;The migration module is specifically used to judge whether there are tasks to be migrated in the working node that are determined to be migrated tasks whose number of times exceeds a set value within N consecutive iteration cycles;

当存在确定为待迁移任务的次数超过设定数值的待迁移任务时,判定确定为待迁移任务的次数超过设定数值的待迁移任务为迁移任务,并将判定的所述迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点中。When there is a task to be migrated whose number of times is determined to be a task to be migrated exceeds a set value, it is determined that a task to be migrated whose number of times is determined to be a task to be migrated exceeds a set value is a migration task, and the determined migration task is migrated to a location other than In the working nodes other than the working nodes, the task processing volume is smaller than the set threshold.

本发明有益效果如下:The beneficial effects of the present invention are as follows:

本发明实施例获取一个工作节点的一个迭代周期内每一个任务的运行时间信息;根据获取的所述每一个任务的运行时间信息以及待迁移任务的分析策略,确定所述工作节点在所述迭代周期内的待迁移任务;当确定为待迁移任务的次数超过设定数值时,将所述待迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点,这样通过根据任务的运行时间信息和待迁移任务的分析策略,筛选出所述工作节点的待迁移任务,并当确定为待迁移任务的次数超过设定数值时,启动将该待迁移任务迁移至了所述工作节点之外的任务处理量小于设定阈值的工作节点,也就是说利用任务迁移方式,在工作点的多次迭代操作中执行一次任务迁移,有效地实现工作点间任务负载均衡,避免了采用Giraph中的负载均衡策略存在时间消耗的问题,以及采用实时监测任务运行时间信息的方式,避免了采用Hadoop中的推测式执行策略存在增加消息通信网络开销的问题,有效地提高了BSP模型中任务迁移的效率,提升了系统的负载均衡性。The embodiment of the present invention obtains the running time information of each task in an iteration cycle of a working node; according to the acquired running time information of each task and the analysis strategy of the task to be migrated, determine that the working node is in the iteration Tasks to be migrated within the period; when it is determined that the number of tasks to be migrated exceeds a set value, the task to be migrated is migrated to a working node whose task processing capacity is less than the set threshold except the working node, so that by According to the running time information of the task and the analysis strategy of the task to be migrated, the task to be migrated of the working node is screened out, and when the number of tasks determined to be migrated exceeds the set value, the migration of the task to be migrated to all The processing capacity of tasks other than the above-mentioned working nodes is less than the set threshold. That is to say, the task migration method is used to perform a task migration in multiple iterations of the working point, which can effectively achieve task load balancing among the working points and avoid The problem of time consumption in the load balancing strategy in Giraph is eliminated, and the method of real-time monitoring of task running time information avoids the problem of increasing message communication network overhead in the speculative execution strategy in Hadoop, effectively improving the BSP model The efficiency of medium task migration improves the load balance of the system.

附图说明Description of drawings

图1为本发明实施例一提供的一种负载均衡的方法的流程示意图;FIG. 1 is a schematic flowchart of a load balancing method provided by Embodiment 1 of the present invention;

图2为确定所述工作节点在所述迭代周期内的待迁移任务的流程示意图;Fig. 2 is a schematic flow chart of determining the task to be migrated of the working node within the iterative period;

图3为本发明实施例二提供的一种负载均衡的方法的流程示意图。FIG. 3 is a schematic flowchart of a load balancing method provided by Embodiment 2 of the present invention.

具体实施方式Detailed ways

为了实现本发明的目的,本发明实施例提供了一种负载均衡的方法和设备,通过获取一个工作节点的一个迭代周期内每一个任务的运行时间信息;根据获取的所述每一个任务的运行时间信息以及待迁移任务的分析策略,确定所述工作节点在所述迭代周期内的待迁移任务;当确定为待迁移任务的次数超过设定数值时,将所述待迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点,这样通过根据任务的运行时间信息和待迁移任务的分析策略,筛选出所述工作节点的待迁移任务,并当确定为待迁移任务的次数超过设定数值时,启动将该待迁移任务迁移至了所述工作节点之外的任务处理量小于设定阈值的工作节点,也就是说利用任务迁移方式,在工作点的多次迭代操作中执行一次任务迁移,有效地实现工作点间任务负载均衡,避免了采用Giraph中的负载均衡策略存在时间消耗的问题,以及采用实时监测任务运行时间信息的方式,避免了采用Hadoop中的推测式执行策略存在增加消息通信网络开销的问题,有效地提高了BSP模型中任务迁移的效率,提升了系统的负载均衡性。In order to achieve the purpose of the present invention, the embodiment of the present invention provides a load balancing method and device, by obtaining the running time information of each task in an iteration cycle of a working node; according to the acquired running time information of each task Time information and the analysis strategy of the tasks to be migrated, determine the tasks to be migrated of the working node in the iteration period; when the number of tasks determined to be migrated exceeds the set value, migrate the tasks to be migrated The task processing capacity of other than the above-mentioned working nodes is less than the set threshold. In this way, according to the running time information of the tasks and the analysis strategy of the tasks to be migrated, the tasks to be migrated of the working nodes are screened out, and when determined as the tasks to be migrated When the number of tasks exceeds the set value, start to migrate the task to be migrated to a working node other than the working node whose task processing volume is less than the set threshold, that is to say, using the task migration method, multiple Perform a task migration in the iterative operation, effectively realize task load balancing among work points, avoid the problem of time consumption in the load balancing strategy in Giraph, and use real-time monitoring of task running time information to avoid the use of Hadoop The speculative execution strategy has the problem of increasing the network overhead of message communication, which effectively improves the efficiency of task migration in the BSP model and improves the load balance of the system.

下面结合说明书附图对本发明各个实施例进行详细描述。Various embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

实施例一:Embodiment one:

如图1所示,为本发明实施例一提供的一种负载均衡的方法的流程示意图。所述方法可以如下所述。As shown in FIG. 1 , it is a schematic flowchart of a load balancing method provided by Embodiment 1 of the present invention. The method can be described as follows.

步骤101:获取一个工作节点的一个迭代周期内每一个任务的运行时间信息。Step 101: Obtain the running time information of each task in an iteration period of a worker node.

在步骤101中,在工作节点的任务开始运行时,统计当前迭代周期内每一个任务的运行时间信息、本地任务的数据量信息以及对本次迭代周期内产生的用于下一个迭代周期的消息量进行预估计得到的预估计量信息(还可以通过读/写检查点的时间信息表示)。In step 101, when the task of the working node starts to run, the running time information of each task in the current iteration cycle, the data volume information of the local task, and the messages for the next iteration cycle generated in this iteration cycle The estimated amount information obtained by pre-estimating the amount (it can also be represented by the time information of the read/write checkpoint).

需要说明的是,统计得到的本地任务的数据量信息以及对本次迭代周期内产生的用于下一个迭代周期的消息量进行预估计得到的预估计量信息,将涉及到计算任务迁移时的迁移代价。It should be noted that the data volume information of the local tasks obtained through statistics and the estimated volume information obtained by pre-estimating the amount of messages generated in this iteration cycle for the next iteration cycle will involve the calculation task migration. Migration costs.

当工作点在本次迭代周期内执行的任务有M个,那么统计得到的每一个任务的运行时间信息为:T1,T2,T3,……,TmWhen there are M tasks executed by the working point in this iteration period, then the statistically obtained running time information of each task is: T 1 , T 2 , T 3 , . . . , T m .

步骤102:根据获取的所述每一个任务的运行时间信息以及待迁移任务的分析策略,确定所述工作节点在所述迭代周期内的待迁移任务。Step 102: According to the obtained running time information of each task and the analysis policy of the task to be migrated, determine the task to be migrated of the working node within the iteration period.

在步骤102中,根据获取的所述每一个任务的运行时间信息以及待迁移任务的分析策略,确定所述工作节点在所述迭代周期内的待迁移任务的方式包括但不限于以下方式:In step 102, according to the obtained running time information of each task and the analysis strategy of the task to be migrated, the manner of determining the task to be migrated of the working node in the iteration cycle includes but not limited to the following manners:

如图2所示,为确定所述工作节点在所述迭代周期内的待迁移任务的流程示意图。As shown in FIG. 2 , it is a schematic flowchart of determining tasks to be migrated of the working nodes within the iteration period.

步骤201:根据获取的所述每一个任务的运行时间信息,将所述每一个任务的运行时间信息进行排序。Step 201: sort the running time information of each task according to the acquired running time information of each task.

在步骤201中,在获取到所述每一个任务的运行时间信息时,按照设定的排序规则,将所述每一个任务的运行时间信息进行排序。In step 201, when the running time information of each task is acquired, the running time information of each task is sorted according to a set sorting rule.

例如:设定的排序规则为按照运行时间信息从小到大的顺序;或者为按照运行时间信息从大到小的顺序;这里不做限定。For example: the set sorting rule is the order from small to large according to the running time information; or the order from large to small according to the running time information; there is no limitation here.

具体地,假设获取到所述每一个任务的运行时间信息(其中,下标表示为任务标识)为T1=1s,T2=2s,T3=4s,T4=5s,T5=7s,T6=3.5s,T7=1.5s,T8=9s,T9=3s,T10=0.5s,T11=4.5s,T12=6s,按照运行时间信息从小到大的顺序进行排序,得到的序列为:{T10,T1,T7,T2,T9,T6,T3,T11,T4,T12,T5,T8}。Specifically, it is assumed that the obtained running time information of each task (wherein, the subscript represents the task identifier) is T 1 =1s, T 2 =2s, T 3 =4s, T 4 =5s, T 5 =7s , T 6 =3.5s, T 7 =1.5s, T 8 =9s, T 9 =3s, T 10 =0.5s, T 11 =4.5s, T 12 =6s, according to the order of running time information from small to large Sorting, the obtained sequence is: {T 10 , T 1 , T 7 , T 2 , T 9 , T 6 , T 3 , T 11 , T 4 , T 12 , T 5 , T 8 }.

步骤202:利用得到的排序信息,确定所述工作节点在所述迭代周期内的准迁移任务。Step 202: Using the obtained sorting information, determine the quasi-migration tasks of the working nodes within the iteration period.

在步骤202中,针对排序信息中的每一个运行时间信息,在确定运行时间信息大于设定运行时间信息时,确定大于设定运行时间信息对应的任务为所述工作节点在所述迭代周期内的准迁移任务。In step 202, for each piece of running time information in the sorting information, when it is determined that the running time information is greater than the set running time information, it is determined that the task corresponding to the set running time information is greater than the set running time information as the task of the working node within the iteration cycle quasi-migration tasks.

其中,所述设定运行时间信息通过以下方式确定:Wherein, the set running time information is determined in the following manner:

T=T2+(T2-T1)*1.5;T为设定运行时间信息,T2为根据得到的排序信息,确定排序信息中四分之三位处对应的时间信息;T1为根据得到的排序信息,确定排序信息中四分之一位处对应的运行时间信息。T=T 2 +(T 2 -T 1 )*1.5; T is the set running time information, T 2 is the time information corresponding to three-quarters of the sorting information determined according to the obtained sorting information; T 1 is According to the obtained sorting information, the running time information corresponding to a quarter position in the sorting information is determined.

具体地,首先,根据得到的排序信息,确定排序信息中四分之一位处对应的第一运行时间信息和四分之三位处对应的第二运行时间信息。Specifically, firstly, according to the obtained sorting information, the first running time information corresponding to the 1/4 position and the second running time information corresponding to the 3/4 position in the sorting information are determined.

仍以上述事例为例,得到的序列为:{T10,T1,T7,T2,T9,T6,T3,T11,T4,T12,T5,T8},根据得到的序列信息,四分之一位处对应的第一运行时间信息为T7,即任务标识为7对应的运行时间信息,四分之三位处对应的第二运行时间信息为T4,即任务标识为4对应的运行时间信息。Still taking the above example as an example, the obtained sequence is: {T 10 , T 1 , T 7 , T 2 , T 9 , T 6 , T 3 , T 11 , T 4 , T 12 , T 5 , T 8 }, According to the obtained sequence information, the first running time information corresponding to the quarter position is T 7 , that is, the running time information corresponding to the task ID 7, and the second running time information corresponding to the three-quarter position is T 4 , that is, the running time information corresponding to task ID 4.

其次,针对排序信息中的每一个运行时间信息,判断运行时间信息是否大于设定的运行阈值,若运行时间信息大于设定的运行阈值,则确定该运行时间信息对应的任务为所述工作节点在所述迭代周期内的准迁移任务;否则,确定该运行时间信息对应的任务不是所述工作节点在所述迭代周期内的准迁移任务。Secondly, for each running time information in the sorting information, it is judged whether the running time information is greater than the set running threshold, and if the running time information is greater than the set running threshold, then it is determined that the task corresponding to the running time information is the working node A quasi-migration task within the iterative period; otherwise, it is determined that the task corresponding to the running time information is not a quasi-migration task of the working node within the iterative period.

其中,设定的运行阈值可以为第二运行时间信息与(第二运行时间信息-第一运行时间信息)*1.5之和,还可以是通过实验数据确定的,或者通过实际经验确定的,这里不做限定。Wherein, the set operating threshold may be the sum of the second operating time information and (second operating time information-first operating time information)*1.5, and may also be determined through experimental data or actual experience, where No limit.

具体地,在确定运行时间信息大于【第二运行时间信息+(第二运行时间信息-第一运行时间信息)*1.5】时,确定大于【第二运行时间信息+(第二运行时间信息-第一运行时间信息)*1.5】的运行时间信息对应的任务为所述工作节点在所述迭代周期内的准迁移任务。Specifically, when it is determined that the running time information is greater than [second running time information+(second running time information-first running time information)*1.5], it is determined that it is greater than [second running time information+(second running time information- The task corresponding to the running time information of the first running time information)*1.5] is the quasi-migration task of the working node in the iteration period.

例如:仍以上述事例为例,根据得到的序列信息,四分之一位处对应的第一运行时间信息为T7,即任务标识为7对应的运行时间信息,四分之三位处对应的第二运行时间信息为T4,即任务标识为4对应的运行时间信息,得到的设定的运行阈值为7.75s,因此,运行时间信息大于7.75s的任务为任务8,也就是说,在本事例中,确定的所述工作节点在所述迭代周期内的准迁移任务为任务8。For example: still taking the above example as an example, according to the obtained sequence information, the first running time information corresponding to the quarter position is T 7 , that is, the running time information corresponding to the task ID 7, and the three-quarter position corresponds to The second running time information of T 4 is T 4 , that is, the running time information corresponding to task ID 4, and the set running threshold value obtained is 7.75s. Therefore, the task whose running time information is greater than 7.75s is task 8, that is, In this example, the determined quasi-migration task of the working node within the iteration period is task 8 .

步骤203:利用迁移代价模型,计算确定的准迁移任务的迁移收益值。Step 203: Using the migration cost model, calculate the migration benefit value of the determined quasi-migration task.

在步骤203中,通过以下方式计算确定的准迁移任务的迁移收益值:In step 203, the migration benefit value of the determined quasi-migration task is calculated in the following manner:

G(T)=T.remainSuperStep*(T.runTime-avgRunTime)-T.migrateCostG(T)= T.remainSuperStep *( T.runTime -avgRunTime) -T.migrateCost ;

其中,G(T)为确定的准迁移任务的迁移收益值,T.remainSuperStep为确定的准迁移任务的剩余超步运行时间信息,T.runTime为确定的准迁移任务的运行时间信息,avgRunTime为所述工作节点在所述迭代周期内非准迁移任务的平均运行时间信息,T.migrateCost为确定的准迁移任务的迁移代价时间信息,等于数据加载的时间长度与消息读或者写的时间长度之和。Among them, G(T) is the migration benefit value of the determined quasi-migration task, T.remainSuperStep is the remaining superstep running time information of the determined quasi-migration task, T.runTime is the running time information of the determined quasi-migration task, and avgRunTime is The average running time information of the non-quasi-migration task of the working node in the iteration cycle, T. migrateCost is the migration cost time information of the determined quasi-migration task, which is equal to the time length of data loading and the time length of message reading or writing and.

需要说明的是,T.migrateCost是在步骤101中获取每一个任务的运行时间信息时得到的迁移代价信息,可以通过图数据与消息的读/写检查点的时间得到。其中,图数据的读写时间可以用读写检查点图数据的时间(或图数据加载时间的2倍)近似模拟;通过读取本次迭代所用的消息数据量大小来评估消息数据的读/写时间,计算公式为:读写消息数据时间=读写图数据时间/图数据大小(字节)*总消息数据量大小(字节)。It should be noted that T.migrateCost is the migration cost information obtained when the running time information of each task is obtained in step 101, which can be obtained from the graph data and message read/write checkpoint time. Among them, the reading and writing time of graph data can be approximated by the time of reading and writing checkpoint graph data (or twice the loading time of graph data); the read/write time of message data can be evaluated by reading the amount of message data used in this iteration. Write time, the calculation formula is: read and write message data time = read and write map data time / map data size (bytes) * total message data size (bytes).

步骤204:对于迁移收益值大于设定门限值的准迁移任务,当迁移收益值大于设定门限值的准迁移任务所在的工作节点在设定时间内的任务处理量大于设定阈值时,确定迁移收益值大于设定门限值的准迁移任务为所述工作节点在所述迭代周期内的待迁移任务。Step 204: For quasi-migration tasks whose migration income value is greater than the set threshold value, when the task processing volume of the working node where the migration income value is greater than the set threshold value is greater than the set threshold value within the set time , it is determined that the quasi-migration task whose migration benefit value is greater than the set threshold is the task to be migrated by the working node within the iterative period.

其中,设定门限值一般为0,也就意味着迁移收益值可能是负值,但是对于步骤204的研究对象是迁移收益值大于0对应的任务。Wherein, the threshold value is generally set to 0, which means that the transfer benefit value may be a negative value, but the research object of step 204 is a task corresponding to a transfer benefit value greater than 0.

在步骤204中,在得到迁移收益值大于设定门限值的准迁移任务之后,需要对准迁移任务所在的工作点在设定时间内的任务处理量进行判断,若准迁移任务所在的工作点在设定时间内的任务处理量不大,使得该工作点空闲资源比较多,此时确定的准迁移任务将不作为待迁移任务考虑;只有在准迁移任务所在的工作点在设定时间内的任务处理量大于设定阈值时,意味着设定时间内工作点的负载比较重,由必要迁移任务,减轻工作点的任务处理压力,所以此时确定的准迁移任务将被作为所述工作节点在所述迭代周期内的待迁移任务。In step 204, after obtaining the quasi-migration task whose migration benefit value is greater than the set threshold value, it is necessary to judge the task processing capacity of the working point where the quasi-migrating task is located within the set time. The task processing capacity of the point within the set time is not large, so that the work point has more idle resources. At this time, the quasi-migration task determined at this time will not be considered as the task to be migrated; only when the work point where the quasi-migration task is located is within the set time When the amount of task processing within is greater than the set threshold, it means that the load of the working point is relatively heavy within the set time, and the task processing pressure of the working point is reduced by the necessary migration tasks, so the quasi-migration tasks determined at this time will be used as the described Tasks to be migrated of the working nodes within the iteration period.

其中,设定时间可以为确定的准迁移任务的剩余超步运行时间与工作待点内任务平均运行时间之积(即avgRunTime*T.remainSuperStep)的20%的时间,也可以是实验数据或者通过实际需要确定,这里不做限定。Among them, the set time can be 20% of the product of the remaining superstep running time of the determined quasi-migration task and the average running time of the tasks in the work waiting point (that is, avgRunTime* T.remainSuperStep ), or it can be experimental data or through It needs to be determined actually, and there is no limitation here.

设定阈值可以是工作点任务处理量的二分之一,也可以是实验数据或者通过实际需要确定,这里不做限定。The set threshold can be one-half of the task processing capacity of the working point, or can be determined by experimental data or actual needs, and is not limited here.

这样有效避免了工作点中大量任务迁移的可能性,降低了系统资源的消耗,提升了任务迁移的准确度。This effectively avoids the possibility of a large number of task migration in the work point, reduces the consumption of system resources, and improves the accuracy of task migration.

步骤103:当确定为待迁移任务的次数超过设定数值时,将所述待迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点。Step 103: When it is determined that the number of tasks to be migrated exceeds a set value, migrate the task to be migrated to a working node other than the working node whose task processing volume is less than a set threshold.

在步骤103中,首先,判断所述工作节点在连续N个迭代周期内是否存在确定为待迁移任务的次数超过设定数值的待迁移任务。In step 103, firstly, it is judged whether there is a task to be migrated in the working node within N consecutive iteration periods, the number of times of which the task is determined to be migrated exceeds a set value.

其中,N为自然数,且小于待迁移任务所在工作点的总迭代周期数。Among them, N is a natural number and is less than the total number of iteration cycles of the working point where the task to be migrated is located.

这里,避免了一次迭代统计数据错误导致确定迁移任务存在误差,提升了任务迁移的精确度。Here, an error in determining the migration task due to an iterative statistical data error is avoided, and the accuracy of the task migration is improved.

其次,当存在确定为待迁移任务的次数超过设定数值的待迁移任务时,判定确定为待迁移任务的次数超过设定数值的待迁移任务为迁移任务,并将判定的所述迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点中。Secondly, when there is a task to be migrated whose number of tasks determined to be migrated exceeds a set value, it is determined that the task to be migrated whose number of times exceeds a set value is determined to be a migration task, and the determined migration task is migrated To the working nodes other than the working nodes whose task processing volume is less than the set threshold.

通过本发明实施例一的方案,获取一个工作节点的一个迭代周期内每一个任务的运行时间信息;根据获取的所述每一个任务的运行时间信息以及待迁移任务的分析策略,确定所述工作节点在所述迭代周期内的待迁移任务;当确定为待迁移任务的次数超过设定数值时,将所述待迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点,这样通过根据任务的运行时间信息和待迁移任务的分析策略,筛选出所述工作节点的待迁移任务,并当确定为待迁移任务的次数超过设定数值时,启动将该待迁移任务迁移至了所述工作节点之外的任务处理量小于设定阈值的工作节点,也就是说利用任务迁移方式,在工作点的多次迭代操作中执行一次任务迁移,有效地实现工作点间任务负载均衡,避免了采用Giraph中的负载均衡策略存在时间消耗的问题,以及采用实时监测任务运行时间信息的方式,避免了采用Hadoop中的推测式执行策略存在增加消息通信网络开销的问题,有效地提高了BSP模型中任务迁移的效率,提升了系统的负载均衡性。Through the solution of Embodiment 1 of the present invention, the running time information of each task in one iteration cycle of a working node is obtained; according to the obtained running time information of each task and the analysis strategy of the task to be migrated, the work is determined The task to be migrated by the node within the iterative period; when it is determined that the number of tasks to be migrated exceeds the set value, the task to be migrated is migrated to a task whose processing volume is less than the set threshold except the working node In this way, according to the running time information of the task and the analysis strategy of the task to be migrated, the task to be migrated of the working node is screened out, and when the number of tasks determined to be migrated exceeds the set value, the task to be migrated is started. The task is migrated to a working node other than the working node whose task processing volume is less than the set threshold, that is to say, the task migration method is used to perform a task migration in multiple iterations of the working point, effectively realizing the task between the working points. Task load balancing avoids the time consumption problem of using the load balancing strategy in Giraph, and adopts the method of real-time monitoring of task running time information to avoid the problem of increasing message communication network overhead in Hadoop's speculative execution strategy, which is effective It greatly improves the efficiency of task migration in the BSP model and improves the load balance of the system.

实施例二:Embodiment two:

如图3所示,为本发明实施例二提供的一种负载均衡的处理设备的结构示意图,本发明实施例二是与本发明实施例一属于同一发明构思下的发明,所述设备包括:获取模块11、确定模块12和迁移模块13,其中:As shown in Figure 3, it is a schematic structural diagram of a load balancing processing device provided by Embodiment 2 of the present invention. Embodiment 2 of the present invention is an invention under the same inventive concept as Embodiment 1 of the present invention. The device includes: Obtaining module 11, determining module 12 and migration module 13, wherein:

获取模块11,用于获取一个工作节点的一个迭代周期内每一个任务的运行时间信息;An acquisition module 11, configured to acquire the running time information of each task in an iteration cycle of a working node;

确定模块12,用于根据获取的所述每一个任务的运行时间信息以及待迁移任务的分析策略,确定所述工作节点在所述迭代周期内的待迁移任务;A determining module 12, configured to determine the task to be migrated of the working node within the iteration period according to the acquired running time information of each task and the analysis strategy of the task to be migrated;

迁移模块13,用于当确定为待迁移任务的次数超过设定数值时,将所述待迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点。The migration module 13 is configured to migrate the task to be migrated to a working node other than the working node whose task processing capacity is less than a set threshold when it is determined that the number of times of the task to be migrated exceeds a set value.

具体地,所述确定模块12,具体用于根据获取的所述每一个任务的运行时间信息,将所述每一个任务的运行时间信息进行排序;利用得到的排序信息,确定所述工作节点在所述迭代周期内的准迁移任务;并利用迁移代价模型,计算确定的准迁移任务的迁移收益值;对于迁移收益值大于设定门限值的准迁移任务,当迁移收益值大于设定门限值的准迁移任务所在的工作节点在设定时间内的任务处理量大于设定阈值时,确定迁移收益值大于设定门限值的准迁移任务为所述工作节点在所述迭代周期内的待迁移任务。Specifically, the determining module 12 is specifically configured to sort the running time information of each task according to the acquired running time information of each task; use the obtained sorting information to determine that the working node is quasi-migration tasks within the iterative cycle; and use the migration cost model to calculate the migration income value of the determined quasi-migration task; for quasi-migration tasks whose migration income value is greater than the set threshold value, when the migration income value is greater than the set threshold When the task processing capacity of the working node where the quasi-migration task of the limit value is located is greater than the set threshold within the set time, it is determined that the quasi-migration task whose migration benefit value is greater than the set threshold value is that the working node is within the iteration period. tasks to be migrated.

所述确定模块12,具体用于针对排序信息中的每一个运行时间信息,在确定运行时间信息大于设定运行时间信息时,确定大于设定运行时间信息对应的任务为所述工作节点在所述迭代周期内的准迁移任务;The determining module 12 is specifically configured to, for each piece of running time information in the sorting information, when it is determined that the running time information is greater than the set running time information, determine that the task corresponding to the set running time information is greater than the set running time information as the task of the working node. quasi-migration tasks within the above iteration cycle;

其中,所述设定运行时间信息通过以下方式确定:Wherein, the set running time information is determined in the following manner:

T=T2+(T2-T1)*1.5;T为设定运行时间信息,T2为根据得到的排序信息,确定排序信息中四分之三位处对应的时间信息;T1为根据得到的排序信息,确定排序信息中四分之一位处对应的运行时间信息。T=T 2 +(T 2 -T 1 )*1.5; T is the set running time information, T 2 is the time information corresponding to three-quarters of the sorting information determined according to the obtained sorting information; T 1 is According to the obtained sorting information, the running time information corresponding to a quarter position in the sorting information is determined.

所述确定模块12,具体用于通过以下方式计算确定的准迁移任务的迁移收益值:The determination module 12 is specifically configured to calculate the migration benefit value of the determined quasi-migration task in the following manner:

G(T)=T.remainSuperStep*(T.runTime-avgRunTime)-T.migrateCostG(T)= T.remainSuperStep *( T.runTime -avgRunT i me) -T.migrateCost ;

其中,G(T)为确定的准迁移任务的迁移收益值,T.remainSuperStep为确定的准迁移任务的剩余超步运行时间信息,T.runTime为确定的准迁移任务的运行时间信息,avgRunTime为所述工作节点在所述迭代周期内非准迁移任务的平均运行时间信息,T.migrateCost为确定的准迁移任务的迁移代价时间信息,等于数据加载的时间长度与消息读或者写的时间长度之和。Among them, G(T) is the migration benefit value of the determined quasi-migration task, T.remainSuperStep is the remaining superstep running time information of the determined quasi-migration task, T.runTime is the running time information of the determined quasi-migration task, and avgRunTime is The average running time information of the non-quasi-migration task of the working node in the iteration cycle, T. migrateCost is the migration cost time information of the determined quasi-migration task, which is equal to the time length of data loading and the time length of message reading or writing and.

所述迁移模块13,具体用于判断所述工作节点在连续N个迭代周期内是否存在确定为待迁移任务的次数超过设定数值的待迁移任务;The migration module 13 is specifically used to judge whether there are tasks to be migrated in the working node that are determined to be migration tasks whose number of times exceeds a set value within N consecutive iteration cycles;

当存在确定为待迁移任务的次数超过设定数值的待迁移任务时,判定确定为待迁移任务的次数超过设定数值的待迁移任务为迁移任务,并将判定的所述迁移任务迁移至除了所述工作节点之外的任务处理量小于设定阈值的工作节点中。When there is a task to be migrated whose number of times is determined to be a task to be migrated exceeds a set value, it is determined that a task to be migrated whose number of times is determined to be a task to be migrated exceeds a set value is a migration task, and the determined migration task is migrated to a location other than In the working nodes other than the working nodes, the task processing volume is smaller than the set threshold.

其中,N为自然数,且小于待迁移任务所在工作点的总迭代周期数。Among them, N is a natural number and is less than the total number of iteration cycles of the working point where the task to be migrated is located.

需要说明的是,本发明实施例所述的处理设备可以是通过硬件方式实现的,也可以是通过软件方式实现的,这里不做限定。It should be noted that the processing device described in the embodiment of the present invention may be implemented by means of hardware or software, which is not limited here.

本领域的技术人员应明白,本发明的实施例可提供为方法、装置(设备)、或计算机程序产品。因此,本发明可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本发明可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。Those skilled in the art should understand that the embodiments of the present invention may be provided as methods, devices (devices), or computer program products. Accordingly, the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

本发明是参照根据本发明实施例的方法、装置(设备)和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices) and computer program products according to embodiments of the invention. It should be understood that each procedure and/or block in the flowchart and/or block diagram, and a combination of procedures and/or blocks in the flowchart and/or block diagram can be realized by computer program instructions. These computer program instructions may be provided to a general purpose computer, special purpose computer, embedded processor, or processor of other programmable data processing equipment to produce a machine such that the instructions executed by the processor of the computer or other programmable data processing equipment produce a An apparatus for realizing the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.

这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instructions The device realizes the function specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.

这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。These computer program instructions can also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, thereby The instructions provide steps for implementing the functions specified in the flow chart or blocks of the flowchart and/or the block or blocks of the block diagrams.

尽管已描述了本发明的优选实施例,但本领域内的技术人员一旦得知了基本创造性概念,则可对这些实施例作出另外的变更和修改。所以,所附权利要求意欲解释为包括优选实施例以及落入本发明范围的所有变更和修改。While preferred embodiments of the invention have been described, additional changes and modifications to these embodiments can be made by those skilled in the art once the basic inventive concept is appreciated. Therefore, it is intended that the appended claims be construed to cover the preferred embodiment as well as all changes and modifications which fall within the scope of the invention.

显然,本领域的技术人员可以对本发明进行各种改动和变型而不脱离本发明的精神和范围。这样,倘若本发明的这些修改和变型属于本发明权利要求及其等同技术的范围之内,则本发明也意图包含这些改动和变型在内。Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the present invention. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalent technologies, the present invention also intends to include these modifications and variations.

Claims (8)

1. a kind of processing method of load balancing, which is characterized in that including:
Obtain the run time information of each task in an iteration cycle of a working node;
The analysis strategy of the run time information of each task and task to be migrated according to acquisition, determines the work Make to be migrated task of the node in the iteration cycle;
When the number for being determined as task to be migrated is more than setting numerical value, by the task immigration to be migrated in addition to the work Task treating capacity except node is less than the working node of given threshold;
When the number for being determined as task to be migrated is more than setting numerical value, by the task immigration to be migrated in addition to the work Task treating capacity except node is less than the working node of given threshold, including:
Judge that the working node whether there is the number for being determined as task to be migrated more than setting in continuous N number of iteration cycle The task to be migrated of numerical value;
When there is the task to be migrated that the number for being determined as task to be migrated is more than setting numerical value, judgement is determined as to be migrated The number of business is more than to set the task to be migrated of numerical value as migration task, and by the migration task immigration of judgement in addition to institute The task treating capacity except working node is stated to be less than in the working node of given threshold.
2. processing method as described in claim 1, which is characterized in that the run time of each task according to acquisition The analysis strategy of information and task to be migrated determines to be migrated task of the working node in the iteration cycle, packet It includes:
The run time information of each task according to acquisition carries out the run time information of each task Sequence;
Using obtained sequencing information, quasi- migration task of the working node in the iteration cycle is determined;And
Using Cost Model is migrated, the migration financial value of determining quasi- migration task is calculated;
It is more than the quasi- migration task of setting threshold value for migration financial value, when the standard that migration financial value is more than setting threshold value is moved When task treating capacity of the working node in setting time where shifting task is more than given threshold, determine that migration financial value is more than The task to be migrated that the quasi- migration task for setting threshold value is the working node in the iteration cycle.
3. processing method as claimed in claim 2, which is characterized in that using obtained sequencing information, determine the work section Quasi- migration task of the point in the iteration cycle, including:
For each run time information in sequencing information, run time information is set determining that run time information is more than When, it determines to be more than set that the corresponding task of run time information be the working node in the iteration cycle accurate and migrates times Business;
Wherein, the setting run time information determines in the following manner:
T=T2+(T2-T1)*1.5;T is sets run time information, T2According to obtained sequencing information, determine in sequencing information Corresponding temporal information at 3/4ths;T1According to obtained sequencing information, determine in sequencing information at a quarter position Corresponding run time information.
4. processing method as claimed in claim 2 or claim 3, which is characterized in that using Cost Model is migrated, calculate determining standard and move The migration financial value of shifting task, including:
The migration financial value of determining quasi- migration task is calculated in the following manner:
G (T)=T.remainSuperStep*(T.runTime-avgRunTime)-T.migrateCost
Wherein, G (T) is the migration financial value of determining quasi- migration task, T.remainSuperStepFor determining quasi- migration task Remaining superledge run time information, T.runTimeFor the run time information of determining quasi- migration task, avgRunTime is described The average operating time information of working node non-quasi- migration task in the iteration cycle, T.migrateCostIt is moved for determining standard The migration cost temporal information of shifting task, equal to time span and the sum of the message reading or the time span write of data loading.
5. a kind of processing equipment of load balancing, which is characterized in that including:
Acquisition module, for obtaining the run time information of each task in an iteration cycle of a working node;
Determining module, for the analysis plan of the run time information of each task and task to be migrated according to acquisition Slightly, to be migrated task of the working node in the iteration cycle is determined;
Transferring module, for when be determined as task to be migrated number be more than setting numerical value when, by the task immigration to be migrated It is less than the working node of given threshold to the task treating capacity other than the working node;
The transferring module is determined as treating specifically for judging that the working node whether there is in continuous N number of iteration cycle The number of migration task is more than the task to be migrated of setting numerical value;
When there is the task to be migrated that the number for being determined as task to be migrated is more than setting numerical value, judgement is determined as to be migrated The number of business is more than to set the task to be migrated of numerical value as migration task, and by the migration task immigration of judgement in addition to institute The task treating capacity except working node is stated to be less than in the working node of given threshold.
6. processing equipment as claimed in claim 5, which is characterized in that
The determining module, specifically for the run time information of each task according to acquisition, will it is described each The run time information of task is ranked up;Using obtained sequencing information, determine the working node in the iteration cycle Interior quasi- migration task;And using Cost Model is migrated, calculate the migration financial value of determining quasi- migration task;Migration is received Benefit value is more than the quasi- migration task of setting threshold value, the work where migration financial value is more than the quasi- migration task of setting threshold value When making task treating capacity of the node in setting time more than given threshold, determine that migration financial value is more than the standard of setting threshold value Migration task is to be migrated task of the working node in the iteration cycle.
7. processing equipment as claimed in claim 6, which is characterized in that
The determining module specifically for each the run time information being directed in sequencing information, is determining run time letter When breath is more than setting run time information, determine to be more than that set the corresponding task of run time information be the working node in institute State the quasi- migration task in iteration cycle;
Wherein, the setting run time information determines in the following manner:
T=T2+(T2-T1)*1.5;T is sets run time information, T2According to obtained sequencing information, determine in sequencing information Corresponding temporal information at 3/4ths;T1According to obtained sequencing information, determine in sequencing information at a quarter position Corresponding run time information.
8. processing equipment as claimed in claims 6 or 7, which is characterized in that
The determining module, specifically for calculating the migration financial value of determining quasi- migration task in the following manner:
G (T)=T.remainSuperStep*(T.runTime-avgRunTime)-T.migrateCost
Wherein, G (T) is the migration financial value of determining quasi- migration task, T.remainSuperStepFor determining quasi- migration task Remaining superledge run time information, T.runTimeFor the run time information of determining quasi- migration task, avgRunTime is described The average operating time information of working node non-quasi- migration task in the iteration cycle, T.migrateCostIt is moved for determining standard The migration cost temporal information of shifting task, equal to time span and the sum of the message reading or the time span write of data loading.
CN201410108066.2A 2014-03-21 2014-03-21 The processing method and equipment of a kind of load balancing Active CN104935523B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410108066.2A CN104935523B (en) 2014-03-21 2014-03-21 The processing method and equipment of a kind of load balancing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410108066.2A CN104935523B (en) 2014-03-21 2014-03-21 The processing method and equipment of a kind of load balancing

Publications (2)

Publication Number Publication Date
CN104935523A CN104935523A (en) 2015-09-23
CN104935523B true CN104935523B (en) 2018-06-15

Family

ID=54122497

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410108066.2A Active CN104935523B (en) 2014-03-21 2014-03-21 The processing method and equipment of a kind of load balancing

Country Status (1)

Country Link
CN (1) CN104935523B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108073446B (en) * 2016-11-10 2020-11-17 华为技术有限公司 Timeout prejudging method and device
US10628233B2 (en) * 2016-12-30 2020-04-21 Samsung Electronics Co., Ltd. Rack-level scheduling for reducing the long tail latency using high performance SSDS
CN107346262B (en) * 2017-06-06 2020-12-15 华为技术有限公司 A method and controller for task migration
CN107517273B (en) * 2017-09-21 2020-10-02 郑州云海信息技术有限公司 Data migration method, system, computer readable storage medium and server
CN111104209B (en) * 2019-11-25 2023-07-11 华为技术有限公司 A method for processing tasks and related equipment
CN111176814B (en) * 2019-12-29 2022-06-17 山东英信计算机技术有限公司 A task execution method and related device
CN114708480B (en) * 2022-03-04 2024-11-01 深圳海星智驾科技有限公司 Load balancing control method and device and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101446910A (en) * 2008-12-08 2009-06-03 哈尔滨工程大学 AEDF task scheduling method based on SMP
CN101986272A (en) * 2010-11-05 2011-03-16 北京大学 Task scheduling method under cloud computing environment
CN103336808A (en) * 2013-06-25 2013-10-02 中国科学院信息工程研究所 System and method for real-time graph data processing based on BSP (Board Support Package) model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7751327B2 (en) * 2004-02-25 2010-07-06 Nec Corporation Communication processing system, packet processing load balancing device and packet processing load balancing method therefor

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101446910A (en) * 2008-12-08 2009-06-03 哈尔滨工程大学 AEDF task scheduling method based on SMP
CN101986272A (en) * 2010-11-05 2011-03-16 北京大学 Task scheduling method under cloud computing environment
CN103336808A (en) * 2013-06-25 2013-10-02 中国科学院信息工程研究所 System and method for real-time graph data processing based on BSP (Board Support Package) model

Also Published As

Publication number Publication date
CN104935523A (en) 2015-09-23

Similar Documents

Publication Publication Date Title
CN104935523B (en) The processing method and equipment of a kind of load balancing
CN104123171B (en) Virtual machine migrating method and system based on NUMA architecture
CN104636187B (en) Dispatching method of virtual machine in NUMA architecture based on load estimation
CN106502791A (en) A kind of method for allocating tasks and device
CN103150259A (en) Memory recovery method and device
CN103369042A (en) Data processing method and data processing device
US20130227244A1 (en) Workload-aware distributed data processing apparatus and method for processing large data based on hardware acceleration
CN105491117B (en) Streaming diagram data processing system and method towards real-time data analysis
CN104216784A (en) Hotspot balance control method and related device
CN103918239A (en) Load balancing method, device, system and computer readable medium
Huang et al. Novel heuristic speculative execution strategies in heterogeneous distributed environments
Hu et al. Improved heuristic job scheduling method to enhance throughput for big data analytics
CN107704310A (en) A kind of method, apparatus and equipment for realizing container cluster management
CN117971906A (en) A multi-card collaborative database query method, device, equipment and storage medium
CN105893155A (en) Virtual machine load balancing control method and device
CN104866375B (en) A kind of method and device for migrating virtual machine
CN103106112A (en) Method and device based on maximum load and used for load balancing scheduling
Fan et al. A heterogeneity-aware data distribution and rebalance method in Hadoop cluster
CN109840151B (en) Load balancing method and device for multi-core processor
CN109766190A (en) Cloud resource dispatching method, device, equipment and storage medium
CN106325812A (en) Processing method and device for multiplication and accumulation operation
CN116302307A (en) A multi-virtual machine migration method, device, equipment and medium
CN115309502A (en) A container scheduling method and device
CN108429704A (en) A node resource allocation method and device
CN103713953A (en) Device and method for transferring data in memory

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant