Cloud computing now plays a major role in storing data and processing large tasks with scalability options. Deadline-based scheduling is the main focus when processing tasks on available resources. A private cloud is owned by an organization and its resources are free for users, whereas public clouds charge users on a pay-as-you-go model. When the private cloud is not sufficient to process user tasks, resources can be acquired from a public cloud; the combination of a public cloud and a private cloud gives rise to a hybrid cloud. In hybrid clouds, task scheduling is a complex process, as tasks can be allocated resources of either the private cloud or the public cloud. This paper presents an algorithm that decides which resources should be leased from the public cloud to complete workflow execution within the deadline and at minimum monetary cost to the user. The proposed hybrid scheduling algorithm uses a new concept of sub-deadlines for rescheduling and for allocating resources in the public cloud, which helps in finding the best public-cloud resources for cost saving while completing workflow execution within deadlines. Three rescheduling policies are evaluated. For performance analysis, the HEFT (Heterogeneous Earliest Finish Time) based hybrid scheduling algorithm is compared with greedy and min-min approaches. Results show that the proposed algorithm saves a large amount of cost compared to the greedy and min-min approaches and completes all tasks within the deadline.
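The sub-deadline idea can be illustrated with a minimal sketch: distribute the overall workflow deadline over individual tasks in proportion to their earliest finish times, so each task gets a deadline of its own that a scheduler can check before leasing a public-cloud resource. This is an assumed, simplified formulation for illustration only; the function and field names are hypothetical and not taken from the paper.

```python
# Hypothetical sketch: derive per-task sub-deadlines from a workflow deadline
# by scaling each task's earliest finish time (ignoring resource contention).
# All names (assign_subdeadlines, deps, runtime) are illustrative.

def assign_subdeadlines(tasks, deps, runtime, deadline):
    """tasks: task ids in topological order.
    deps: dict task -> list of predecessor tasks.
    runtime: dict task -> estimated runtime on a reference resource.
    deadline: overall workflow deadline (same time unit as runtime)."""
    eft = {}  # earliest finish time of each task
    for t in tasks:
        start = max((eft[p] for p in deps.get(t, [])), default=0.0)
        eft[t] = start + runtime[t]
    makespan = max(eft.values())
    # Scale so the final task's sub-deadline equals the workflow deadline;
    # slack is spread proportionally along every path.
    scale = deadline / makespan
    return {t: eft[t] * scale for t in tasks}

subs = assign_subdeadlines(
    tasks=["a", "b", "c"],
    deps={"b": ["a"], "c": ["b"]},
    runtime={"a": 2.0, "b": 4.0, "c": 2.0},
    deadline=16.0,
)
print(subs)  # {'a': 4.0, 'b': 12.0, 'c': 16.0}
```

A task that cannot meet its sub-deadline on the private cloud would then become a candidate for rescheduling onto a leased public-cloud resource.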
Cloud computing provides on-demand resources for compute and storage requirements. A private cloud is a cost-effective option for executing workflow applications, but when its resources are not enough to meet an application's storage and compute requirements, public clouds are the remaining option. While public clouds charge users on a pay-per-use basis, private clouds are owned by users and can be used at no charge. When a public cloud and a private cloud are merged, we get a hybrid cloud. In a hybrid cloud, task scheduling is a complex process, as jobs can be allocated resources either from the private cloud or from the public cloud. Deadline-based scheduling is the main focus in many workflow applications. The proposed algorithm optimizes cost by deciding which resources should be leased from the public cloud to complete workflow execution within the deadline. In the proposed work, we have developed a level-based scheduling algorithm that executes tasks level by level and uses the concept of sub-deadlines, which helps in finding the best public-cloud resources for cost saving while completing workflow execution within deadlines. A performance analysis and a comparison of the proposed algorithm with the min-min approach are also presented.
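The min-min heuristic used as the comparison baseline can be sketched for independent tasks as follows: repeatedly pick the task whose minimum completion time across all resources is smallest, and assign it to that resource. This is the standard textbook formulation, shown here as a minimal illustration; the names are ours, not the paper's.

```python
# Minimal illustrative min-min scheduler for independent tasks.

def min_min(etc, num_resources):
    """etc[t][r]: estimated time to compute task t on resource r."""
    ready = {r: 0.0 for r in range(num_resources)}  # resource ready times
    unscheduled = set(range(len(etc)))
    schedule = {}
    while unscheduled:
        # For each task, its best (completion time, resource) pair.
        best = {
            t: min((ready[r] + etc[t][r], r) for r in ready)
            for t in unscheduled
        }
        # Min-min: the task with the smallest best completion time goes first.
        t = min(unscheduled, key=lambda u: best[u][0])
        finish, r = best[t]
        schedule[t] = r
        ready[r] = finish
        unscheduled.remove(t)
    return schedule, max(ready.values())

sched, makespan = min_min([[3, 5], [2, 4], [6, 1]], num_resources=2)
print(sched, makespan)  # task 2 is placed first on the fast resource 1
```

In a hybrid-cloud setting the resource set would mix free private-cloud machines with priced public-cloud ones, which is where a cost-aware, sub-deadline-driven algorithm can improve on plain min-min.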
Cloud computing refers to applications and services offered over the Internet using the pay-as-you-go model. The services are offered from data centers all over the world, which jointly are referred to as the "Cloud". The data centers use scheduling techniques to effectively allocate virtual machines to cloud applications. Cloud applications in areas such as business enterprises, bioinformatics, and astronomy need workflow processing, in which tasks are executed based on data dependencies. Cloud users impose QoS constraints while executing their workflow applications on the cloud. The QoS parameters are defined in an SLA (Service Level Agreement) document signed between the cloud user and the cloud provider. In this paper, a genetic algorithm is proposed that schedules workflow applications in an unreliable cloud environment and meets user-defined QoS constraints. The budget-constrained, time-minimizing genetic algorithm reduces the failure rate and makespan of workflow applications by allocating to the workflow application those resources that are reliable and whose cost of execution stays within the user's budget. The performance of the genetic algorithm is compared with the max-min and min-min scheduling algorithms in an unreliable cloud environment.
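The core of such a GA is its fitness function. A minimal sketch of one possible fitness evaluation for a chromosome (a task-to-resource mapping) follows: it rejects allocations that exceed the user budget and otherwise scores by makespan, penalizing unreliable resources. The exponential failure model and all names here are assumptions for illustration, not the paper's exact formulation.

```python
# Illustrative fitness for a budget-constrained, time-minimizing GA chromosome.
# Assumed model: resources fail as Poisson processes; lower fitness is better.
import math

def fitness(mapping, runtime, price, fail_rate, budget):
    """mapping[t] = resource chosen for task t (tasks independent here).
    runtime[t][r]: execution time; price[r]: cost per time unit;
    fail_rate[r]: failure rate of resource r per time unit."""
    finish = {}
    cost = 0.0
    reliability = 1.0
    for t, r in mapping.items():
        rt = runtime[t][r]
        finish[r] = finish.get(r, 0.0) + rt          # serialize per resource
        cost += rt * price[r]
        reliability *= math.exp(-fail_rate[r] * rt)  # P(no failure during t)
    if cost > budget:
        return float("inf")  # infeasible: violates the user budget
    # Minimize makespan; divide by reliability to favor dependable resources.
    return max(finish.values()) / reliability

score = fitness(
    mapping={0: 0, 1: 1},
    runtime={0: [4.0, 2.0], 1: [3.0, 1.0]},
    price=[1.0, 3.0],
    fail_rate=[0.01, 0.02],
    budget=10.0,
)
print(round(score, 3))
```

A GA would then evolve mappings by crossover and mutation, selecting chromosomes with the lowest fitness, so the search naturally drifts toward cheap, fast, reliable allocations.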
Distributed computing systems such as peer-to-peer, Web, and cloud systems are becoming more popular every day and are used in a wide variety of data-intensive applications. Data replication is an important concept that increases the availability and reliability of data in these systems. This paper presents the design and performance analysis of a simulated implementation of a data replication strategy on a distributed file system using the GridSim toolkit. The parameters taken for the performance analysis are aggregate bandwidth, successful execution rate, and system byte effective rate. The results indicate that, with respect to these parameters, a distributed file system using the data replication strategy performs better than one that does not. Integrating a data replication scheme with a distributed file system can greatly improve the availability and reliability of data.
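One reason replication improves aggregate bandwidth is that a reader can fetch from whichever replica offers the best expected transfer time. A tiny sketch of that selection step, under an assumed linear bandwidth model (names and model are illustrative, not from the paper or GridSim's API):

```python
# Illustrative replica selection: fetch the file from the replica site with
# the best expected transfer time. The bandwidth model is an assumption.

def best_replica(replicas, file_size_mb):
    """replicas: list of (site, bandwidth_mbps) pairs holding a copy.
    Returns (chosen site, expected transfer time in seconds)."""
    site, bw = max(replicas, key=lambda r: r[1])
    return site, file_size_mb * 8 / bw  # MB -> megabits, / (megabits/s)

site, t = best_replica([("siteA", 100.0), ("siteB", 400.0)], file_size_mb=500)
print(site, t)  # siteB 10.0  (500 MB at 400 Mbps)
```

With several replicas in place, concurrent readers can also spread across different sites instead of queuing on one, which is what drives the aggregate-bandwidth gains measured in the simulation.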
This paper presents the design and performance analysis of a distributed implementation of the Apriori algorithm in a grid environment. Apriori is a very important algorithm in data mining that enables organizations to mine the large amounts of historical data they gather over time and discover hidden patterns in it; data mining techniques let organizations analyze market trends and user behavior. If the data set to be mined is very large, adapting the basic algorithm for execution in a distributed environment makes sense, because distributed technologies generally offer performance benefits. Grids have gained wide popularity for executing tasks in a distributed fashion. In this paper we implement a distributed version of the basic Apriori algorithm in a grid environment constructed using the Globus® Toolkit. Experimental results show that the distributed version offers performance benefits over the basic version of the Apriori algorithm and hence is a good implementation choice when the data to be mined is very large and distributed.
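For reference, the basic (single-node) Apriori algorithm that the paper distributes can be sketched compactly: count candidate itemsets level by level, and generate (k+1)-item candidates only from frequent k-itemsets, pruning any candidate with an infrequent subset. This sketch uses absolute support counts for simplicity.

```python
# Minimal single-node Apriori sketch: returns all itemsets whose support
# count meets min_support, using the classic join-and-prune candidate step.
from itertools import combinations

def apriori(transactions, min_support):
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    freq = {}
    current = {frozenset([i]) for i in items}  # size-1 candidates
    k = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        freq.update(level)
        k += 1
        # Join frequent k-itemsets into (k+1)-candidates; Apriori prune:
        # every k-subset of a candidate must itself be frequent.
        current = {
            a | b
            for a in level for b in level
            if len(a | b) == k
            and all(frozenset(s) in level for s in combinations(a | b, k - 1))
        }
    return freq

freq = apriori(
    [{"milk", "bread"}, {"milk", "eggs"}, {"milk", "bread", "eggs"}],
    min_support=2,
)
print({tuple(sorted(s)): n for s, n in freq.items()})
```

A distributed variant typically partitions the transactions across nodes, counts candidates locally, and merges the counts each level, which is why it scales to data sets that one machine cannot hold.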
Privacy is becoming an important concern in service-oriented environments such as the grid and the Web. Service providers and service requesters both have complex sets of privacy policies to better protect their interests. Both parties need assurance that the facts/information they reveal ...