WO2014073949A1

WO2014073949A1 - A system and method for virtual machine reservation for delay sensitive service applications

Info

Publication number: WO2014073949A1
Application number: PCT/MY2013/000191
Authority: WO
Inventors: Ping LIM BOON; Karuppiah ETTIKAN KANDASAMY; Kit CHONG POH
Original assignee: Mimos Berhad
Priority date: 2012-11-12
Filing date: 2013-11-11
Publication date: 2014-05-15

Abstract

The invention provides a system (100) including a servicing module (110) that is adapted to receive and manage a service request (120) from a client network. A scheduling module (130) is in communication with the servicing module (110) and facilitates the provision and deployment of virtual machines in order to fulfil the service requests (120). A prediction module (140) is provided and is configured to predict service latency of unmeasured virtual machine resources and communicate estimated service delay to the servicing module (110). A measurement module (150) is also provided. This is configured to measure service latency of virtual machines which emulate the cloud computing service. The scheduling module (130) is configured to provide and deploy said virtual machines to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines.

Description

A SYSTEM AND METHOD FOR VIRTUAL MACHINE RESERVATION FOR DELAY SENSITIVE SERVICE APPLICATIONS

FIELD OF INVENTION

The present invention relates to a system and method for virtual machine reservation for delay sensitive service applications. In particular, the invention relates to systems and methods that leverage service latency to predict the number of virtual machines that must be deployed to assure a specified service response time.

BACKGROUND ART

Current systems and methods implement reservation of virtual machines solely on the basis of cloud resource availability. Generally, virtual machine resources are maximised to assure service response time, which results in ongoing wasted resources. More particularly, in current systems cloud software as a service (SaaS) providers need to place maximum cloud infrastructure in advance to ensure smooth service operation. SaaS providers are therefore generally required to provide expected virtual machine requests to ensure infrastructures are deployed and placed on standby for service. Lease requests are generally made on the basis of number of hosts, number of CPUs, amount of memory required and time. For example, a client may advise that they need 10 nodes, each with 2 CPUs and 4GB of memory, from 2pm to 4pm every day.

The lease request method is unsuitable for cloud service providers that provide delay sensitive services. For example, a request for bond price predictions may require intensive computation and the results may be required by a customer anywhere and at any time within seconds of the request. As such, the SaaS provider may not be able to predict the requirement of computer resources, as the SaaS provider may not know when, where or how much computing resource is needed. In order to fulfil service level agreements, maximum numbers of virtual machines need to be deployed at dispersed locations, which generally results in wasted resources. In particular, this may result in virtual machines standing idle when no service request is received.

United States Patent Publication No. 2011/0231899 describes a system that provides a cloud-computing service from a cloud-computing environment comprising a plurality of cloud-computing resources. In certain embodiments, the system comprises a management module configured to manage a cloud-computing resource of the plurality of cloud-computing resources as a cloud-computing service. Generally, the cloud-computing service performs a computer workload. The system also comprises an adapter configured to connect the cloud-computing resource to the system and translate a management instruction received from the management module into a proprietary cloud application program interface call for the cloud-computing resource. A cloud service bus is provided that is configured to route the management instruction from the management module to the adapter, and a consumption module is provided that is configured to allow a user to subscribe to the cloud-computing service. Finally, a planning module is provided that is configured to plan the cloud-computing service, and a build module is provided that is configured to build the cloud-computing service from the cloud-computing resource and publish the cloud-computing service to the consumption module.

This publication exemplifies a system that involves reactive service provisioning in which the allocation of virtual machines is based on the computational workload assigned, and in which planning, management and spawning of virtual machines is based on service requests. The scheduling policy described is not application aware, but focuses on virtual machine allocation based on hardware resource availability.

United States Patent Publication No. 2008/0304421 describes a prediction tree for estimating values of a network performance measure. Leaf nodes of the prediction tree are associated with networked computing devices and interior nodes are not necessarily representative of physical network connections. Values are assigned to edges in the prediction tree, and the network performance measure relative to two computing devices represented by two nodes of the tree is estimated by aggregating the values assigned to the edges in the path in the prediction tree joining the two nodes. Mechanisms for adding nodes representing computing devices to the prediction tree, for identifying a closest node representing a computing device in the prediction tree, for identifying a cluster of devices represented by nodes of the tree, and for rebalancing the prediction tree are provided.

This publication exemplifies systems based on the well-known Euclidean Steiner Tree Model in combinatorial optimisation. The input as exemplified in this publication relates to inter-nodal network performance measurements (i.e. network latency only). Optimisation of the prediction tree does not refer to the mechanism behind network node selection. The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one exemplary technology area where some embodiments described herein may be practiced.

SUMMARY OF INVENTION

One aspect of the present invention provides a system for virtual machine reservation for delay sensitive service applications. The system comprises at least one servicing module configured to manage cloud computing service requests; at least one scheduling module configured to provide and deploy virtual machines for cloud computing services to fulfil the requests; at least one prediction module configured to predict service latency of unmeasured virtual machine resources; and at least one measurement module configured to trigger measurement of service latency of virtual machines which emulate the cloud computing service, characterised in that the scheduling module is configured to provide and deploy the virtual machines to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines. The scheduling module further comprises a scheduler configured to deploy virtual machines for service emulation, to reserve virtual machines according to a given policy defined by a policy making module, and to shut down virtual machines for resource optimisation.

In another aspect the invention provides a system wherein the servicing module further comprises a service request handler configured to input a service configuration of a service type and/or a range of tolerable service response times for the service request; a planning module configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation; and a policy making module configured to receive the predicted service latency and the measured service latency, estimate total service response time and define policy to optimise virtual machine resources to be reserved.

In yet another aspect of the invention there is provided a system wherein the planning module further comprises a task categorisation module configured to classify tasks required to satisfy the service request; and a task provisioning module configured to identify at least one virtual machine available and required to satisfy the service by forming a service performance zone based on the pre-defined service response time.

In still another aspect of the invention there is provided a system wherein the policy making module further comprises at least one appointed host; a plurality of virtual machines to be reserved at the appointed host; a plurality of CPU resources to be reserved at the appointed host; and a plurality of memory resources to be reserved at the appointed host.

In a further aspect of the invention there is provided a system wherein the prediction module further comprises an estimation module configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s); and a tree construction module configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources.

In another aspect of the invention there is provided a system wherein the measurement module comprises a controller module configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feedback to the prediction module; and a repository handler module configured to retrieve historical service latency data for selected virtual machines and feedback to the prediction module.

In another aspect the invention provides a method for virtual machine reservation for delay sensitive service applications comprising receiving a service request from a client network (410); providing cloud computing resources requested (420); instantiating virtual machines service (430); emulating the service request and collecting machine service latency (440); predicting service latency of unmeasured virtual machines (450); forming at least one servicing performance zone based on the service latency of virtual machines (460); and determining virtual machine resources to be reserved (470), characterised in that the virtual machines are reserved to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines, and in that determining the virtual machine resources to be reserved further comprises the steps of: identifying the number of virtual machines available in each servicing performance zone (471); determining service response times for each servicing performance zone (472); calculating the virtual resource required to fulfil the pre-defined service response time (473); deploying additional virtual machines if the pre-defined service response time is not fulfilled by the service response time within the servicing performance zone (474); and shutting down virtual machine(s) if the service response time within the servicing performance zone outperforms the pre-defined service response time (475).

In a further aspect of the invention there is provided a method wherein providing cloud computing resources comprises identifying service type and/or required range of service response time (421); determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422); receiving predicted service latency and measured service latency information (423); estimating total service response time (424); and providing a virtual machine resource to be reserved (425).

In still a further aspect of the invention there is provided a method wherein emulating the service request comprises triggering a set of cloud computing services on at least one selected virtual machine (441); measuring service latency from the selected virtual machine(s) (442); and feedback of the service latency (443).

In yet another aspect of the invention there is provided a method wherein predicting service latency of unmeasured virtual machines comprises selecting at least two virtual machines (451); receiving at least one service latency measurement (452); constructing at least one prediction tree (453); and predicting service latency of unmeasured virtual machines (454).

In another aspect of the invention there is provided a method wherein forming at least one servicing performance zone comprises retrieving service latency information (461); identifying a range of service response time (462); identifying response intervals for servicing performance zone(s) (463); and forming the performance service zone(s) on the prediction tree based on the range of service response time (464).

In yet another aspect the invention provides a method wherein calculating the virtual resource required to fulfil the pre-defined service response time comprises determining at least one appointed host (476); determining the number of virtual machines to be reserved at the appointed host (477); determining the number of CPU resources to be reserved at the appointed host (478); and determining the number of memory resources to be reserved at the appointed host (479).

The present invention consists of features and a combination of parts hereinafter fully described and illustrated in the accompanying drawings, it being understood that various changes in the details may be made without departing from the scope of the invention or sacrificing any of the advantages of the present invention.

BRIEF DESCRIPTION OF ACCOMPANYING DRAWINGS

To further clarify various aspects of some embodiments of the present invention, a more particular description of the invention will be rendered by references to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail through the accompanying drawings in which:

FIG. 1.0 illustrates the system of an embodiment of the invention.

FIG. 2.0 illustrates the servicing module of the system of Figure 1.0 in more detail.

FIG. 3.0 illustrates the prediction module, measurement module and scheduling module of the system of Figure 1.0 in more detail.

FIG. 4.0 illustrates a flow diagram of the method of an embodiment of the invention.

FIG. 5.0 illustrates step 2 of the flow diagram of Figure 4 in more detail.

FIG. 6.0 illustrates step 4 of the flow diagram of Figure 4 in more detail.

FIG. 7.0 illustrates step 5 of the flow diagram of Figure 4 in more detail.

FIG. 8.0 illustrates step 6 of the flow diagram of Figure 4 in more detail.

FIG. 9.0 illustrates step 7 of the flow diagram of Figure 4 in more detail.

FIG. 10.0 illustrates step 7.3 of the flow diagram of Figure 9 in more detail.

FIG. 11.0 illustrates diagrammatically instantiation of the virtual machine service.

FIG. 12 illustrates diagrammatically emulation and execution of the service request and prediction of service latency.

Figure 13 illustrates diagrammatically the formation of servicing performance zones.

Figure 14 illustrates diagrammatically the determination and optimisation of resource reservation.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention provides a system and method for virtual machine reservation for delay sensitive service applications. In particular, the invention relates to systems and methods that leverage service latency to predict the number of virtual machines that must be deployed to assure a specified service response time.

Hereinafter, this specification will describe the present invention according to the preferred embodiments. It is to be understood that limiting the description to the preferred embodiments of the invention is merely to facilitate discussion of the present invention, and it is envisioned that modifications may be made without departing from the scope of the appended claims.

Referring to Figure 1.0, the system (100) according to an embodiment of the invention is illustrated. The system (100) includes a servicing module (110) that is adapted to receive and manage a service request (120) from a client network. A scheduling module (130) is in communication with the servicing module (110) and facilitates the provision and deployment of virtual machines in order to fulfil the service requests (120). A prediction module (140) is provided and is configured to predict service latency of unmeasured virtual machine resources and communicate estimated service delay to the servicing module (110). A measurement module (150) is also provided. This is configured to trigger measurement of service latency of virtual machines which emulate the cloud computing service.

A more detailed description of the servicing module (110) may be seen with reference to Figure 2.0. More particularly, the servicing module (110) includes a service request handler (112) configured to input a service configuration for the service request. The service configuration may comprise a service type and/or a range of tolerable service response times. In addition, the servicing module includes a planning module (114) which is configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation. A policy making module (116) is also included and is configured to receive the predicted service latency and measured service latency. The policy making module (116) also estimates total service response time and defines policy to optimise virtual machine resources to be reserved.

The planning module (114) includes a task categorisation module (118) that is configured to classify tasks required to satisfy the service request that has been made by the network client. It also includes a task provisioning module (119) that is configured to identify at least one virtual machine available and required to satisfy the service by forming a service performance zone based on a pre-defined service response time. This will be discussed in more detail below.

Referring to Figure 3, the scheduling module (130) comprises a scheduler (132) that is configured to deploy virtual machines for service emulation. The scheduler (132) is adapted to reserve virtual machines according to a given policy defined by the policy making module (116), and to shut down virtual machines for resource optimisation.

The prediction module (140) generally includes an estimation module (142) and a tree construction module (144). The estimation module (142) is configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s), while the tree construction module (144) is configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources.

The measurement module (150) includes a controller module (152) and a repository handler module (154). The controller module (152) is configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feedback to the prediction module (140), while the repository handler module (154) is configured to retrieve historical service latency data for selected virtual machines and feedback to the prediction module (140).

Referring to Figures 4 through 8, an embodiment of the method (400) of the invention is illustrated. Generally, the invention includes the steps of receiving a service request from a client network (410), providing cloud computing resources requested (420), instantiating virtual machines service (430), emulating the service request and collecting machine service latency (440), predicting service latency of unmeasured virtual machines (450), forming at least one servicing performance zone based on the service latency of virtual machines (460) and determining virtual machine resources to be reserved (470).

The step of providing cloud computing resources (420) includes the steps of identifying service type and/or required range of service response time (421), determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422), receiving predicted service latency and measured service latency information (423), estimating total service response time (424) and providing a virtual machine resource to be reserved (425).

The step of emulating the service request (440) includes triggering a set of cloud computing services on at least one selected virtual machine (441), measuring service latency from the selected virtual machine(s) (442), and feedback of the service latency (443).
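By way of illustration only, steps 441 to 443 may be sketched as follows. The sketch assumes that each selected virtual machine is represented by a local callable that runs the emulated service; the names `measure_service_latency` and `collect_latencies`, the use of the median and the number of repeats are hypothetical and are not taken from the specification.

```python
import time
from typing import Callable, Dict


def measure_service_latency(emulate: Callable[[], object], repeats: int = 3) -> float:
    """Return the median wall-clock time, in seconds, of calling `emulate`."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        emulate()  # step 441: trigger the emulated service
        samples.append(time.perf_counter() - start)  # step 442: measure latency
    samples.sort()
    return samples[len(samples) // 2]


def collect_latencies(
    selected_vms: Dict[str, Callable[[], object]], repeats: int = 3
) -> Dict[str, float]:
    """Step 443: return measured latency per VM as feedback for prediction."""
    return {
        vm_id: measure_service_latency(emulate, repeats)
        for vm_id, emulate in selected_vms.items()
    }
```

In a deployed system the callable would issue a request to the virtual machine over the network rather than run locally, so the measured value would include network delay as well as computation time.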

The step of predicting service latency of unmeasured virtual machines (450) includes selecting at least two virtual machines (451), receiving at least one service latency measurement (452), constructing at least one prediction tree (453) and predicting service latency of unmeasured virtual machines (454).
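By way of illustration only, step 454 may be sketched as follows, assuming a prediction tree of the general kind summarised in the Background, in which the estimate for two nodes is the sum of the edge values on the path joining them. The specification does not state how the tree is constructed (453), so the tree below is built by hand; the node names, edge values and function names are hypothetical.

```python
from typing import Dict, Optional

# node -> {neighbour: edge value in milliseconds}
Tree = Dict[str, Dict[str, float]]


def add_edge(tree: Tree, a: str, b: str, value_ms: float) -> None:
    """Add an undirected edge with the given value to the tree."""
    tree.setdefault(a, {})[b] = value_ms
    tree.setdefault(b, {})[a] = value_ms


def predict_latency(tree: Tree, src: str, dst: str) -> Optional[float]:
    """Sum the edge values on the unique tree path from src to dst.

    Returns None when either node is missing or the nodes are not connected.
    """
    if src not in tree or dst not in tree:
        return None
    stack = [(src, None, 0.0)]  # (node, parent, accumulated value)
    while stack:
        node, parent, total = stack.pop()
        if node == dst:
            return total
        for neighbour, value in tree[node].items():
            if neighbour != parent:  # a tree has no cycles, so this suffices
                stack.append((neighbour, node, total + value))
    return None


# Hypothetical tree: t1 and t2 are interior nodes, vm_c has not been measured directly.
tree: Tree = {}
add_edge(tree, "client", "t1", 5.0)
add_edge(tree, "t1", "vm_a", 10.0)
add_edge(tree, "t1", "t2", 20.0)
add_edge(tree, "t2", "vm_b", 15.0)
add_edge(tree, "t2", "vm_c", 30.0)

print(predict_latency(tree, "client", "vm_c"))  # 55.0
```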

The step of forming at least one servicing performance zone (460) includes retrieving service latency information (461), identifying a range of service response time (462), identifying response intervals for servicing performance zone(s) (463) and forming the performance service zone(s) on the prediction tree based on the range of service response time (464).
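By way of illustration only, steps 461 to 464 may be sketched as a grouping of virtual machines by latency interval. The sketch assumes the response intervals are given as ascending upper bounds in milliseconds and that a virtual machine slower than the last bound belongs to no zone; both assumptions, and the name `form_performance_zones`, are hypothetical. For brevity the zones are formed over a flat latency table rather than on the prediction tree itself.

```python
from bisect import bisect_left
from typing import Dict, List, Sequence


def form_performance_zones(
    latency_ms: Dict[str, float], interval_bounds_ms: Sequence[float]
) -> List[List[str]]:
    """Group VMs into zones by measured or predicted latency.

    Zone i holds the VMs whose latency is above the previous bound and at or
    below interval_bounds_ms[i]. Bounds must be sorted in ascending order.
    """
    zones: List[List[str]] = [[] for _ in interval_bounds_ms]
    for vm_id, latency in sorted(latency_ms.items()):
        index = bisect_left(interval_bounds_ms, latency)
        if index < len(zones):  # VMs beyond the last bound are left out
            zones[index].append(vm_id)
    return zones


latencies = {"vm_a": 15.0, "vm_b": 40.0, "vm_c": 55.0, "vm_d": 120.0, "vm_e": 50.0}
print(form_performance_zones(latencies, [20.0, 50.0, 100.0]))
# [['vm_a'], ['vm_b', 'vm_e'], ['vm_c']]
```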

The step of determining virtual machine resources to be reserved (470) includes identifying the number of virtual machines available in each servicing performance zone (471), determining service response times for each servicing performance zone (472), calculating the virtual resource required to fulfil the pre-defined service response time (473) and, if the pre-defined service response time is not fulfilled by the service response time within the servicing performance zone, deploying additional virtual machines (474), or, if the service response time within the servicing performance zone outperforms the pre-defined service response time, shutting down virtual machine(s) (475).

Calculating the virtual resource required to fulfil the pre-defined service response time (473) includes determining at least one appointed host (476), determining the number of virtual machines to be reserved at the appointed host (477), determining the number of CPU resources to be reserved at the appointed host (478), and determining the number of memory resources to be reserved at the appointed host (479).
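By way of illustration only, the decision in steps 474 and 475 and the reservation record of steps 476 to 479 may be sketched as follows. The specification does not give a decision rule or a sizing formula, so the margin, the rule that the last virtual machine in a zone is never shut down, the per-VM sizing and all names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ZoneStatus:
    name: str
    vm_count: int            # step 471: VMs available in the zone
    response_time_ms: float  # step 472: service response time of the zone


@dataclass
class Reservation:
    host: str       # step 476: appointed host
    vm_count: int   # step 477: virtual machines reserved at the host
    cpus: int       # step 478: CPU resources reserved at the host
    memory_gb: int  # step 479: memory resources reserved at the host


def reservation_action(zone: ZoneStatus, target_ms: float, margin_ms: float = 0.0) -> str:
    """Return 'deploy', 'shutdown' or 'keep' for one servicing performance zone."""
    if zone.response_time_ms > target_ms:
        return "deploy"  # step 474: target not fulfilled
    if zone.response_time_ms < target_ms - margin_ms and zone.vm_count > 1:
        return "shutdown"  # step 475: zone outperforms the target
    return "keep"


def reserve(host: str, vm_count: int, cpus_per_vm: int, memory_gb_per_vm: int) -> Reservation:
    """Build a reservation assuming every VM has the same CPU and memory size."""
    return Reservation(host, vm_count, vm_count * cpus_per_vm, vm_count * memory_gb_per_vm)
```

For example, with a target of 100 ms and a margin of 20 ms, a zone responding in 120 ms yields "deploy", a zone of three virtual machines responding in 60 ms yields "shutdown", and a zone responding in 90 ms yields "keep".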

Turning to Figures 11 through 14, alternative representations of various steps of an embodiment of the method of the invention are provided. In particular, Figure 11 illustrates diagrammatically instantiation of the virtual machine service (430). As noted, this step involves identification of an application server, request for virtual machines and subsequent instantiation of the virtual machines. Emulation and execution of the service request (440) and prediction of service latency (450) are diagrammatically illustrated in Figure 12, while Figure 13 provides a diagrammatic representation of the formation of servicing performance zones (460). Finally, Figure 14 illustrates determination and optimisation of resource reservation (470).

Unless the context requires otherwise or it is specifically stated to the contrary, integers, steps or elements of the invention recited herein as singular integers, steps or elements clearly encompass both singular and plural forms of the recited integers, steps or elements.

Throughout this specification, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated step or element or integer or group of steps or elements or integers, but not the exclusion of any other step or element or integer or group of steps, elements or integers. Thus, in the context of this specification, the term "comprising" is used in an inclusive sense and thus should be understood as meaning "including principally, but not necessarily solely".

It will be appreciated that the foregoing description has been given by way of illustrative example of the invention and that all such modifications and variations thereto as would be apparent to persons of skill in the art are deemed to fall within the broad scope and ambit of the invention as herein set forth.

Claims

1. A system for virtual machine reservation for delay sensitive service applications, the system comprising:

at least one servicing module configured to manage cloud computing service requests;

at least one scheduling module configured to provide and deploy virtual machines for cloud computing services to fulfil said requests;

at least one prediction module configured to predict service latency of unmeasured virtual machine resources; and

at least one measurement module configured to measure service latency of virtual machines which emulate said cloud computing service, characterised in that said scheduling module is configured to provide and deploy said virtual machines to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines; said scheduling module further comprises a scheduler configured to deploy virtual machines for service emulation, to reserve virtual machines according to a given policy defined by a policy making module, and to shut down virtual machines for resource optimisation.

2. A system according to claim 1, wherein said servicing module further comprises:

a service request handler configured to input a service configuration of a service type and/or a range of tolerable service response times for said service request;

a planning module configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation; and

a policy making module configured to receive said predicted service latency and said measured service latency, estimate total service response time and define policy to optimise virtual machine resources to be reserved.

3. A system according to claim 2, wherein said planning module further comprises:

a task categorisation module configured to classify tasks required to satisfy said service request; and

a task provisioning module configured to identify at least one virtual machine available and required to satisfy said service by forming a service performance zone based on said pre-defined service response time.

4. A system according to claim 2, wherein said policy making module further comprises:

at least one appointed host;

a plurality of virtual machines to be reserved at said appointed host;

a plurality of CPU resources to be reserved at said appointed host; and

a plurality of memory resources to be reserved at said appointed host.

5. A system according to claim 1, wherein said prediction module further comprises:

an estimation module configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s); and

a tree construction module configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources.

6. A system according to claim 1, wherein said measurement module comprises:

a controller module configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feedback to said prediction module; and

a repository handler module configured to retrieve historical service latency data for selected virtual machines and feedback to said prediction module.

7. A method for virtual machine reservation for delay sensitive service applications, the method comprising steps of:

receiving service request from client network (410);

providing cloud computing resources requested (420) by identifying service type and/or required range of service response time (421);

determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422);

receiving predicted service latency and measured service latency information (423);

estimating total service response time (424); and

providing a virtual machine resource to be reserved (425);

instantiating virtual machines service (430);

emulating said service request and collecting machine service latency (440) by triggering a set of cloud computing services on at least one selected virtual machine (441);

measuring service latency from said selected virtual machine(s) (442); and

providing feedback of said service latency (443);

predicting service latency of unmeasured virtual machines (450);

forming at least one servicing performance zone based on said service latency of virtual machines (460); and

determining virtual machine resources to be reserved (470) characterised in that determining virtual machine resources to be reserved to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines further comprises steps of:

identifying number of virtual machines available in each servicing performance zone (471);

determining service response times for each servicing performance zone (472);

calculating virtual resource required to fulfil said pre-defined service response time (473);

deploying additional virtual machines if said pre-defined service response time is not fulfilled by said service response time within said servicing performance zone (474); and

shutting down virtual machine(s) if said service response time within said servicing performance zone outperforms said pre-defined service response time (475).

8. A method according to claim 7, wherein predicting service latency of unmeasured virtual machines further comprises steps of:

selecting at least two virtual machines (451);

receiving at least one service latency measurement (452);

constructing at least one prediction tree (453); and

predicting service latency of unmeasured virtual machines (454).

9. A method according to claim 7, wherein forming at least one servicing performance zone further comprises steps of:

retrieving service latency information (461);

identifying a range of service response time (462);

identifying response intervals for servicing performance zone(s) (463); and

forming said performance service zone(s) on said prediction tree based on said range of service response time (464).

10. A method according to claim 7, wherein calculating the virtual resource required to fulfil said pre-defined service response time further comprises steps of:

determining at least one appointed host (476);

determining the number of virtual machines to be reserved at said appointed host (477);

determining the number of CPU resources to be reserved at said appointed host (478); and

determining the number of memory resources to be reserved at said appointed host (479).