Haruna Ahmed Abba, Nordin B. Zakaria, Syed Nasir Mehmood Shah, and Anindya J. Pal
High Performance Computing Service Center (HPCC),
Universiti Teknologi PETRONAS, Seri Iskandar, 31750 Tronoh, Perak, Malaysia
ahmadydee@gmail.com, nordinzakaria@petronas.com.my, nasirsyed.utp@gmail.com, anindyajp@gmail.com
I. INTRODUCTION
The term "grid" was coined in the mid-1990s to denote a proposed distributed computing infrastructure for advanced science and engineering projects [1]. The objective of grid computing is to combine the computing power of widely distributed resources and to deliver non-trivial services to users [2]. Grid computing has stood out for several years as a paradigm that concentrates on virtual organizations [3] in order to share large-scale resources, enable innovative applications and achieve high performance. The continual growth of communications, in both quality and availability, is increasing interest in the grid computing paradigm [4], in which geographically distributed computing resources can be logically coupled to operate as a single computational unit. The Grid [5] is a new generation of technologies that combines physical resources with applications to provide more efficient solutions to sophisticated problems (e.g., scientific, engineering and business problems).
There are three main phases of scheduling on a grid. Phase one is resource discovery, which generates a list of potential resources. Phase two consists of gathering information about those resources and selecting the best set to match the application requirements. In phase three the job is executed, which includes file staging and cleanup. Scheduling problems are typically NP-hard [6]. The main consideration in scheduling is to achieve high performance in grid computing [7]. In recent years, many researchers have proposed different approaches to dynamic job scheduling based on different notions. Ours is based on the concept of software project management, in which a project consists of modules and modules are divided into tasks, referred to here as jobs. The execution of a job is based on its expected or actual completion time.
In this paper, we propose a new scheduling algorithm, the Prioritized Deadline based Scheduling Algorithm (PDSA), which considers the job deadline as the prime attribute for job execution. Grid users are highly interested in executing their jobs in a timely manner under deadline constraints. Most scheduling algorithms do not consider the deadline perspective for job execution. PDSA is proposed to meet the deadline constraints as per the users' requirements. Moreover, the system perspective (i.e., minimizing the average turnaround time) has also been considered in the design of this algorithm. An extensive performance comparison is presented using synthetic workload traces to evaluate the efficiency and robustness of the grid scheduling algorithms.
The rest of this paper is organized as follows. Section II gives an overview of previous research in resource scheduling. Section III describes the baseline scheduling algorithms. Section IV presents the design of the proposed scheduling algorithm. Section V describes the experimental results, and Section VI concludes the paper.
II. RELATED RESEARCH
In recent years, many researchers have proposed different approaches to dynamic job scheduling based on different notions.
In a related development, the authors of [8] used Fuzzy C-Mean and Genetic Algorithms for dynamic job scheduling. Their model classifies jobs using the Fuzzy C-Mean algorithm and maps the jobs to appropriate resources using a Genetic Algorithm. This approach separates workload data into three classes based on historical job run-time data, which is shown to be effective. However, the submission time of a job should also be considered, because scheduling is more efficient when the user knows both the submission time of a job and its finishing time, which helps to avoid delays in execution. In related work [9], a static job scheduling algorithm based on Fuzzy C-Mean and Genetic Algorithms was applied. That model presents strategies for allocating jobs to distinct nodes, using the Fuzzy C-Mean algorithm to predict the characteristics of jobs that run in a Grid environment and a Genetic Algorithm to allocate jobs to a large pool of shared resources. Similarly, [10] presented simulation results for a Grid environment with regard to the allocation of jobs to distinct nodes. The results support the model, which uses the Fuzzy C-Mean clustering approach to predict job characteristics and to optimize job scheduling in a Grid environment.
This prediction and optimization engine schedules jobs based on historical information. In another study, [11] presented a fault-tolerant scheduling framework, DIOGENES ("DIstributed Optimal GENEtic algorithm for grid application Scheduling"), which is mapped onto the architecture of MedioGRID, a real-time satellite image processing system operating in a Grid environment. The proposed solution provides a fault-tolerant mechanism for mapping image processing applications onto the available resources in MedioGRID clusters with uniform access. The authors of [12] improved the particle swarm optimization (PSO) algorithm with a discrete coding rule in order to optimize both grid task scheduling and grid resource allocation. Similarly, [13] implemented a new approach based on the PSO algorithm to solve task scheduling problems in grids. The new algorithm generates an optimal schedule that completes the tasks within a minimum time while utilizing the resources efficiently. The authors of [14] proposed a novel approach based on a hybrid PSO and GELS (GPSO) algorithm to solve the grid scheduling problem by reducing the makespan and the number of missed tasks. Furthermore, [15] evaluated a proposed GA-based scheduling approach against existing traditional algorithms. The simulation results show that the proposed approach can discover an optimized solution.
In [16], a cost-based workflow scheduling algorithm was presented that minimizes the cost of execution while meeting the deadline. A Markov Decision Process approach is used to schedule workflow task execution step by step, so that it can find the optimal path among services to execute tasks and transfer input or output data. However, to be more efficient, some additional priorities need to be considered, such as maximum turnaround time and the time delay incurred when rescheduling unexecuted jobs. The work in [17] addresses the fairness problem by reducing the service time error. The algorithm assigns to each task sufficient computational power to complete it within its deadline. The resources that each user gets are proportional to the user's weight or share. Here, scheduling of tasks is based on an error called the service time error, which is used to achieve fairness among users. However, the result would be better optimized if priority were given based on the minimum execution time of a job rather than on individual demand. In another work [18], a new job scheduling policy based on backfilling (JR-backfilling) was proposed. The main goals of this policy were to decrease the workload execution time, job waiting time, job response time and average bounded slowdown, and to optimize resource utilization. The approach in [19] aims to reduce the processing time of jobs and to maximize resource utilization. Its grid resource selection is based on a Max Heap Tree (MHT), which suits large-scale applications; the root node of the MHT is selected for job submission.
Project management is a well-known area of operations research. Here we propose a new Prioritized Deadline based Scheduling Algorithm (PDSA) using project management techniques. PDSA is a direct application of project management to grid computing.
III. BASELINE APPROACH
We simulate two traditional algorithms, the Earliest Deadline First (EDF) scheduling algorithm and the Round Robin (RR) scheduling algorithm, as baselines against which to compare the performance of our newly developed Prioritized Deadline based Scheduling Algorithm (PDSA), and we analyze the results.
A. Round Robin Scheduling Algorithm (RR): In this approach the ready queue is maintained as a FIFO queue. The process control block (PCB) of a process submitted to the system is linked to the tail of the queue. The algorithm dispatches processes from the head of the ready queue for execution by the CPU. A process being executed is preempted when its time quantum, a system-defined variable, expires. The PCB of a preempted process is linked to the tail of the ready queue. When a process completes its task, i.e. before the expiry of the time quantum, it terminates and is deleted from the system. The next process is then dispatched from the head of the ready queue.
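The Round Robin behavior described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the simulator used in this paper: the function name `round_robin`, the `(pid, arrival, burst)` tuple format and the rule that jobs arriving during a time slice are queued ahead of the preempted job are all assumptions made for the example.

```python
from collections import deque


def round_robin(jobs, quantum):
    """Simulate Round Robin on a single CPU.

    jobs: list of (pid, arrival, burst) tuples (hypothetical input format).
    quantum: the time quantum after which a running job is preempted.
    Returns a dict mapping pid to completion time.
    """
    pending = deque(sorted(jobs, key=lambda j: j[1]))  # not yet arrived
    ready = deque()                                    # FIFO ready queue
    clock = 0
    completion = {}
    while pending or ready:
        # Link newly arrived jobs to the tail of the ready queue.
        while pending and pending[0][1] <= clock:
            pid, _, burst = pending.popleft()
            ready.append([pid, burst])
        if not ready:
            clock = pending[0][1]  # CPU idles until the next arrival
            continue
        # Dispatch from the head of the ready queue.
        pid, remaining = ready.popleft()
        run = min(quantum, remaining)
        clock += run
        # Assumption: jobs arriving during this slice are queued first.
        while pending and pending[0][1] <= clock:
            p, _, b = pending.popleft()
            ready.append([p, b])
        if remaining > run:
            ready.append([pid, remaining - run])  # preempted, back to tail
        else:
            completion[pid] = clock               # finished, leaves system
    return completion


print(round_robin([("A", 0, 5), ("B", 0, 3), ("C", 0, 1)], quantum=2))
```

With a quantum of 2, the short job C finishes before A and B even though it was submitted last, which is the fairness property that makes RR a useful baseline.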
B. Earliest Deadline First (EDF) Scheduling Algorithm: EDF is a simple and well-known scheduling algorithm in which the earlier the deadline, the higher the priority. Processes are dispatched from the ready queue in order of minimum deadline. When a process has completed its task it is terminated, and the job with the next minimum deadline is dispatched from the ready queue. EDF has a drawback: if two tasks have the same absolute deadline, it chooses one of the two at random (ties can be broken arbitrarily), which results in less fairness between jobs.
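A minimal Python sketch of the EDF baseline as described above is given below. It assumes non-preemptive execution on a single CPU and a hypothetical `(pid, arrival, burst, deadline)` tuple format; the function name `edf` and the seeded random generator are illustrative choices, not part of the simulator used in this paper.

```python
import random


def edf(jobs, rng=None):
    """Non-preemptive Earliest Deadline First on a single CPU.

    jobs: list of (pid, arrival, burst, deadline) tuples (hypothetical format).
    rng: random.Random used only to break ties between equal deadlines.
    Returns (execution order, dict mapping pid to completion time).
    """
    rng = rng or random.Random(0)
    waiting = list(jobs)
    clock = 0
    order, completion = [], {}
    while waiting:
        ready = [j for j in waiting if j[1] <= clock]
        if not ready:
            clock = min(j[1] for j in waiting)  # idle until next arrival
            continue
        earliest = min(j[3] for j in ready)
        tied = [j for j in ready if j[3] == earliest]
        job = rng.choice(tied)  # arbitrary tie-break, as noted in the text
        waiting.remove(job)
        clock += job[2]         # run the job to completion
        order.append(job[0])
        completion[job[0]] = clock
    return order, completion


print(edf([("A", 0, 4, 20), ("B", 0, 2, 5), ("C", 1, 3, 10)]))
```

The `rng.choice` call makes the arbitrary tie-break explicit; replacing it with a deterministic rule such as earliest arrival would remove the source of unfairness mentioned above.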
IV. PROPOSED SCHEDULING ALGORITHM
Prioritized Deadline based Scheduling Algorithm (PDSA): This algorithm executes first the process with the smallest deadline time delay. Jobs are allocated to a single processor according to a deadline criterion based on the minimum time delay of job execution, and the resulting schedule is evaluated by turnaround time and maximum tardiness.
Basic definitions of the aforementioned criteria:
Let $J_i$ be the $i$th job; $n$: the number of jobs; $T_i$: arrival time of job $i$; $d_i$: deadline of job $i$; $\alpha_i$: burst time of job $i$; $C_i$: completion time of job $i$; $T_{TR_i}$: turnaround time of job $i$; $T_{TD_i}$: time delay of job $i$; $T_{TRD_i}$: tardiness of job $i$; $T_{Max\_TRD}$: maximum tardiness; S-list: sorted list.
I. Time delay: the time difference between the deadline and the burst time of a job.
Time delay, $T_{TD_i} = d_i - \alpha_i$    (1)
II. Turnaround time: the total time taken between the submission of a job for execution and the return of the completed result.
Turnaround time, $T_{TR_i} = C_i - T_i$    (2)
Average turnaround time $= \frac{1}{n} \sum_{i=1}^{n} T_{TR_i}$    (3)
III. Maximum tardiness: the maximum time delay between turnaround time and deadline time.
Tardiness, $T_{TRD_i} = d_i - T_{TR_i}$    (4)
Therefore,
Maximum tardiness, $T_{Max\_TRD} = \max(T_{TRD_1}, T_{TRD_2}, \ldots, T_{TRD_n})$    (5)
The algorithm takes its input from users, where each job is described by its processID, arrival time, burst time and deadline. It computes the time delay of each job, sorts the jobs in ascending order of time delay, and selects the job with the minimum time delay for execution. If multiple jobs have the same time delay, the tie is broken by selecting a job on an FCFS basis. The selected job is executed at CPU level for its given burst time (i.e. demand) in a non-preemptive way. The algorithm then computes the turnaround time and tardiness of each job, the average turnaround time over all user jobs, and finally the maximum tardiness, which identifies the maximum time delay in job execution.
The compact algorithm is presented below:
Algorithm PDSA:
Input: pool of jobs with processID, arrival time, CPU time and deadline
BEGIN
  For all processes in the pool
    Compute the time delay of all processes using Eq. (1)
  Arrange the job list in ascending order based on computed time delay (S-list)
  if ($T_{TD_i} = T_{TD_j}$)
    Arrange $J_i$, $J_j$ based on FCFS
  while (S-list is not empty)
  do {
    Execute the job at CPU level based on demand
    Compute the value of turnaround time using Eq. (2)
    Compute the value of tardiness using Eq. (4)
  }
  Compute the average turnaround time using Eq. (3)
  Compute the value of maximum tardiness using Eq. (5)
END
V. RESULTS AND DISCUSSION
Our simulator has been used to carry out extensive experimentation using the Windows 7 operating system on an Intel Core Duo. We used the Grid Workloads Archive LCG data traces provided by the e-Science Group of HEP at Imperial College London for process set generation in our experiments. The simulations of the algorithms have generated useful data that has been analyzed. To check the performance of the algorithms, i.e. the PDSA, EDF and RR scheduling algorithms, we used burst time values on the order of 10, 100 and 1000 to represent the heterogeneous demands of users' jobs, each with different characteristics, and ran them through the simulator. Each process is specified by its CPU burst length, arrival time and priority number. Each process set has been given a time quantum for simulation. The performance metrics for the CPU scheduling algorithms are Average Turnaround Time and Maximum Tardiness.
Below are the graphs derived from the PDSA, EDF and RR scheduling algorithms, followed by a discussion. Fig. 1 shows the Average Turnaround Time and Fig. 2 the Maximum Tardiness.
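As a rough illustration of the PDSA steps and of the two metrics used in the comparison, the following Python sketch runs the algorithm on a small hand-made job set. It is a sketch under stated assumptions rather than the Java simulator used in this paper: the `Job` class and the `pdsa` function are hypothetical names, the job list is sorted once before execution, and the CPU is assumed to stay idle until the selected job has arrived.

```python
from dataclasses import dataclass


@dataclass
class Job:
    pid: str
    arrival: float   # T_i
    burst: float     # alpha_i
    deadline: float  # d_i


def pdsa(jobs):
    """Run PDSA on one CPU.

    Returns (rows, average turnaround, maximum tardiness), where each
    row is (pid, turnaround, tardiness).
    """
    if not jobs:
        raise ValueError("job pool is empty")
    # Time delay = deadline - burst; ascending order, ties broken FCFS.
    s_list = sorted(jobs, key=lambda j: (j.deadline - j.burst, j.arrival))
    clock = 0.0
    rows = []
    for job in s_list:
        # Assumption: the CPU waits if the selected job has not arrived.
        start = max(clock, job.arrival)
        clock = start + job.burst             # non-preemptive execution
        turnaround = clock - job.arrival      # C_i - T_i
        tardiness = job.deadline - turnaround # d_i - turnaround, as defined
        rows.append((job.pid, turnaround, tardiness))
    avg_turnaround = sum(r[1] for r in rows) / len(rows)
    max_tardiness = max(r[2] for r in rows)
    return rows, avg_turnaround, max_tardiness


jobs = [Job("A", 0, 4, 20), Job("B", 0, 2, 5), Job("C", 1, 3, 10)]
rows, avg_tr, max_trd = pdsa(jobs)
print(rows, avg_tr, max_trd)
```

For this job set the time delays are 16, 3 and 7 for A, B and C, so the execution order is B, C, A. The tardiness value follows the definition given in this paper (deadline minus turnaround time), which differs from the more common completion-minus-deadline convention.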
Fig.1 Average Turnaround Time
The experiment was performed by varying the workload, increasing the number of processes from 500 to 2000. The results show that performance is maintained in a dynamic environment. Fig.1 presents the comparative performance analysis of our proposed PDSA against the EDF and RR scheduling algorithms for a variety of synthetic workload traces. The figure shows that PDSA has the best performance compared to the EDF and RR scheduling algorithms under variable and scalable workloads.
Fig.2 Maximum Tardiness
The experiment was performed by varying the workload, increasing the number of processes from 500 to 2000. Again the results show that performance is maintained in a dynamic environment. Fig.2 presents the comparative performance analysis of our proposed PDSA against the EDF and RR scheduling algorithms for a variety of synthetic workload traces. The figure shows that PDSA has the best performance compared to the EDF and RR scheduling algorithms under variable and scalable workloads.
The overall comparative performance analysis has shown that our proposed PDSA is more efficient than the EDF and RR scheduling algorithms for a variety of synthetic workload traces. PDSA has the best performance in terms of Average Turnaround Time and Maximum Tardiness compared to the EDF and RR scheduling algorithms under variable and scalable workloads.
VI. CONCLUSION AND FUTURE WORK
In this paper, a scheduling algorithm for executing jobs on grid systems is proposed. As in real-life scenarios, we have considered the dynamic arrival of jobs as well as the deadline requirement of each job to be processed. The experiments were performed by varying the workload, increasing the number of processes from 500 to 2000. The results show that performance is maintained in a dynamic environment. Based on the comparative performance analysis, PDSA has shown the best performance compared to the EDF and RR scheduling algorithms under variable and scalable workloads.
We have developed a new simulator in the Java language to facilitate this research. It has been exercised through extensive experimentation. Various possible input patterns were tried with all the CPU scheduling algorithms, and the overall response of the system was monitored accordingly. The behavior of the system and the experimental results have been in conformity with the established facts and principles of the science of process scheduling; therefore we believe that the simulator is a valuable contribution to the understanding of modern operating systems.
In future, we will propose and evaluate a computational scheduling algorithm for grids based on multiple processors and perform a detailed comparative performance analysis with other scheduling approaches.
ACKNOWLEDGMENT
We want to express our gratitude to Dr. Nordin B. Zakaria and all HPCC members from Universiti Teknologi PETRONAS for their help during the research.
We thank the HEP e-Science Group at Imperial College London, who provided the LCG data. We also thank Hui Li, the Parallel Workload Archive and the Grid Workloads Archive for their contribution in making the data publicly available.
REFERENCES
[1] I. Foster and C. Kesselman, "Globus: a metacomputing infrastructure toolkit," International Journal of High Performance Computing Applications, Vol. 2, pp. 115-128, 1997.
[2] F. Dong and S. G. Akl, "Scheduling algorithm for grid computing: state of the art and open problems," Technical Report of the Open Issues in Grid Scheduling Workshop, School of Computing, Queen's University, Kingston, Ontario, January 2006.
[3] I. Foster, C. Kesselman and S. Tuecke, "The anatomy of the Grid: Enabling scalable virtual organizations," International Journal of Supercomputer Applications, 2001.
[4] H. Topcuoglu, S. Hariri and M. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, 13(3): 260-274, March 2002. Middleware for Grid Computing.
[5] I. Foster and C. Kesselman, Eds., The Grid 2: Blueprint for a New Computing Infrastructure. San Francisco, CA: Morgan Kaufmann, 2004.
[6] J. Blazewicz, W. Domschke and E. Pesch, "The job shop-scheduling problem: Conventional and new solution techniques," European Journal of Operational Research, 93:1-30, 1996.
[7] R. Buyya, D. Abramson and J. Giddy, "Nimrod/G: An Architecture for a Resource Management and Scheduling System in a Global Computational Grid," Proc. Fourth Int'l Conf. High Performance Computing in Asia-Pacific Region, 2000.
[8] Siriluck Lorpunmanee, Mohd Noor Md Sap and Abdul Hanan Abdullah, "Fuzzy c-mean and genetic algorithms based scheduling for independent jobs in computational grid," Jurnal Teknologi Maklumat, Jilid 18, Bil. 2 (December 2006).
[9] Siriluck Lorpunmanee, Mohd Noor Md Sap, Abdul Hanan Abdullah and Surat Srinoy, "A static jobs scheduling for independent jobs in Grid Environment by using Fuzzy C-Mean and Genetic algorithms," Proceedings of the Postgraduate Annual Research Seminar 2006.
[10] Siriluck Lorpunmanee, Mohd Noor Md Sap and Abdul Hanan Abdullah, "Optimalisation of a Job Scheduler in the Grid Environment by Using Fuzzy C-Mean," J. J. Appl. Sci., Vol. 9, No. 2 (2007).
[11] Florin Pop, Dacian Tudor, Valentin Cristea and Vladimir Cretu, "Fault-Tolerant Scheduling Framework for MedioGRID System," EUROCON 2007 The International Conference on "Computer as a Tool", 1-4244-0813-X/07/$20.00 2007 IEEE, Warsaw, September 9-12.
[12] Bu Yan-ping, Zhou Wei and Yu Jin-shou, "An Improved PSO Algorithm and Its Application to Grid Scheduling Problem," 2008 International Symposium on Computer Science and Computational Technology, 978-0-7695-3498-5/08 © 2008 IEEE.
[13] P. Mathiyalagan, U. R. Dhepthie and S. N. Sivanandam, "Grid scheduling using Enhanced PSO algorithm," (IJCSE) International Journal on Computer Science and Engineering, Vol. 02, No. 02, 2010, 140-145.
[14] Z. Pooranian, A. Harounabadi, M. Shojafar and J. Mirabedini, "Hybrid PSO for Independent Task scheduling in Grid Computing to Decrease Makespan," 2011 International Conference on Future Information Technology, IPCSIT, vol. 13 (2011) © (2011) IACSIT Press, Singapore.
[15] Snehal Kamalapur and Neeta Deshpande, "Efficient CPU Scheduling: A Genetic Algorithm based Approach," Ad Hoc and Ubiquitous Computing, 2006. ISAUHC '06. International Symposium on, page(s): 206-207, 1-4244-0731-1/06/©2006 IEEE.
[16] Jia Yu, Rajkumar Buyya and Chen Khong Tham, "Cost-based Scheduling of Scientific Workflow Applications on Utility Grids," Proceedings of the First International Conference on e-Science and Grid Computing (e-Science'05), 0-7695-2448-6/05 $20.00 © 2005 IEEE.
[17] Daphne Lopez and S. V. Kasmir Raja, "A Dynamic Error Based Fair Scheduling Algorithm For A Computational Grid," Journal of Theoretical and Applied Information Technology © 2005 - 2009 JATIT. All rights reserved.
[18] Ivan Rodero, Francesc Guim and Julita Corbalan, "Evaluation of Coordinated Grid Scheduling Strategies," High Performance Computing and Communications, 2009. HPCC '09. 11th IEEE International Conference on, page(s): 1-10, 978-0-7695-3738-2.
[19] Raksha Sharma, Vishnu Kant Soni, Manoj Kumar Mishra, Prachet Bhuyan and Utpal Chandra Dey, "An Agent Based Dynamic Resource Scheduling Model with FCFS-Job Grouping Strategy in Grid Computing,"
World Academy of Science, Engineering and Tech