WO2014073949A1 - A system and method for virtual machine reservation for delay sensitive service applications - Google Patents
- Publication number
- WO2014073949A1 (PCT/MY2013/000191)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- service
- virtual machines
- latency
- virtual machine
- module
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
Definitions
- calculating the virtual resource required to fulfil the pre-defined service response time comprises determining at least one appointed host (476); determining the number of virtual machines to be reserved at the appointed host (477); determining the number of CPU resources to be reserved at the appointed host (478); and determining the number of memory resources to be reserved at the appointed host (479).
- FIG. 1.0 illustrates the system of an embodiment of the invention.
- FIG. 2.0 illustrates the servicing module of the system of Figure 1.0 in more detail.
- FIG. 3.0 illustrates the prediction module, measurement module and scheduling module of the system of Figure 1.0 in more detail.
- FIG. 4.0 illustrates a flow diagram of the method of an embodiment of the invention.
- FIG. 5.0 illustrates step 2 of the flow diagram of Figure 4 in more detail.
- FIG. 6.0 illustrates step 4 of the flow diagram of Figure 4 in more detail.
- FIG. 7.0 illustrates step 5 of the flow diagram of Figure 4 in more detail.
- FIG. 8.0 illustrates step 6 of the flow diagram of Figure 4 in more detail.
- FIG. 9.0 illustrates step 7 of the flow diagram of Figure 4 in more detail.
- FIG. 10.0 illustrates step 7.3 of the flow diagram of Figure 9 in more detail.
- FIG. 11.0 illustrates diagrammatically instantiation of the virtual machine service.
- FIG. 12 illustrates diagrammatically emulation and execution of the service request and prediction of service latency.
- Figure 13 illustrates diagrammatically the formation of servicing performance zones.
- Figure 14 illustrates diagrammatically the determination and optimisation of resource reservation.
- the present invention provides a system and method for virtual machine reservation for delay sensitive service applications.
- the invention relates to systems and methods that leverage on service latency to predict requisite numbers of virtual machines requiring deployment to assure a specified service response time.
- the system (100) includes a servicing module (110) that is adapted to receive and manage a service request (120) from a client network.
- a scheduling module (130) is in communication with the servicing module (110) and facilitates the provision and deployment of virtual machines in order to fulfil the service requests (120).
- a prediction module (140) is provided and is configured to predict service latency of unmeasured virtual machine resources and communicate estimated service delay to the servicing module (110).
- a measurement module (150) is also provided. This is configured to trigger measurement of service latency of virtual machines which emulate the cloud computing service.
- the servicing module (110) includes a service request handler (112) configured to input a service configuration for the service request.
- the service configuration may comprise a service type and/or a range of tolerable service response times.
- the servicing module includes a planning module (114) which is configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation.
- a policy making module (116) is also included and is configured to receive the predicted service latency and measured service latency. The policy making module (116) also estimates total service response time and defines policy to optimise virtual machine resources to be reserved.
- the planning module (114) includes a task categorisation module (118) that is configured to classify tasks required to satisfy the service request that has been made by the network client. It also includes a task provisioning module (119) that is configured to identify at least one virtual machine available and required to satisfy the service by forming a service performance zone based on a pre-defined service response time. This will be discussed in more detail below.
- the scheduling module (130) comprises a scheduler (132) that is configured to deploy virtual machines for service emulation.
- the scheduler (132) is adapted to reserve virtual machines according to a given policy defined by the policy making module (116), and to shut down virtual machines for resource optimisation.
- the prediction module (140) generally includes an estimation module (142) and a tree construction module (144).
- the estimation module (142) is configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s), while the tree construction module (144) is configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources.
- the measurement module (150) includes a controller module (152) and a repository handler module (154).
- the controller module (152) is configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feed it back to the prediction module (140), while the repository handler module (154) is configured to retrieve historical service latency data for selected virtual machines and feed it back to the prediction module (140).
- the invention includes the steps of receiving a service request from a client network (410), providing cloud computing resources requested (420), instantiating virtual machines service (430), emulating the service request and collecting machine service latency (440), predicting service latency of unmeasured virtual machines (450), forming at least one servicing performance zone based on the service latency of virtual machines (460) and determining virtual machine resources to be reserved (470).
- the step of providing cloud computing resources (420) includes the steps of identifying service type and/or required range of service response time (421), determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422), receiving predicted service latency and measured service latency information (423), estimating total service response time (424) and providing a virtual machine resource to be reserved (425).
- the step of emulating the service request (440) includes triggering a set of cloud computing services on at least one selected virtual machine (441), measuring service latency from the selected virtual machine(s) (442), and feedback of the service latency (443).
- the step of predicting service latency of unmeasured virtual machines (450) includes selecting at least two virtual machines (451), receiving at least one service latency measurement (452), constructing at least one prediction tree (453) and predicting service latency of unmeasured virtual machines (454).
- the step of forming at least one servicing performance zone (460) includes retrieving service latency information (461), identifying a range of service response time (462), identifying response intervals for servicing performance zone(s) (463) and forming the performance service zone(s) on the prediction tree based on the range of service response time (464).
- the step of determining virtual machine resources to be reserved includes identifying the number of virtual machines available in each servicing performance zone (471), determining service response times for each servicing performance zone (472), calculating the virtual resource required to fulfil the pre-defined service response time (473) and, if the pre-defined service response time is not fulfilled by the service response time within the servicing performance zone, deploying additional virtual machines (474), or, if the service response time within the servicing performance zone outperforms the pre-defined service response time, shutting down virtual machine(s) (475).
- Calculating the virtual resource required to fulfil the pre-defined service response time includes determining at least one appointed host (476), determining the number of virtual machines to be reserved at the appointed host (477), determining the number of CPU resources to be reserved at the appointed host (478), and determining the number of memory resources to be reserved at the appointed host (479).
- Figure 11 illustrates diagrammatically instantiation of the virtual machine service (430). As noted, this step involves identification of an application server, request for virtual machines and subsequent instantiation of the virtual machines. Emulation and execution of the service request (440) and prediction of service latency (450) are diagrammatically illustrated in Figure 12, while Figure 13 provides a diagrammatic representation of the formation of servicing performance zones (460). Finally, Figure 14 illustrates determination and optimisation of resource reservation (470).
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention provides a system (100) including a servicing module (110) that is adapted to receive and manage a service request (120) from a client network. A scheduling module (130) is in communication with the servicing module (110) and facilitates the provision and deployment of virtual machines in order to fulfil the service requests (120). A prediction module (140) is provided and is configured to predict service latency of unmeasured virtual machine resources and communicate estimated service delay to the servicing module (110). A measurement module (150) is also provided. This is configured to measure service latency of virtual machines which emulate the cloud computing service. The scheduling module (130) is configured to provide and deploy said virtual machines to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines.
Description
A SYSTEM AND METHOD FOR VIRTUAL MACHINE RESERVATION FOR DELAY SENSITIVE SERVICE APPLICATIONS
FIELD OF INVENTION
The present invention relates to a system and method for virtual machine reservation for delay sensitive service applications. In particular, the invention relates to systems and methods that leverage on service latency to predict requisite numbers of virtual machines requiring deployment to assure a specified service response time.
BACKGROUND ART
Current systems and methods reserve virtual machines based solely on cloud resource availability. Generally, virtual machine resources are maximised to assure service response time, which results in ongoing wasted resources. More particularly, in current systems cloud software as a service (SaaS) providers need to place maximum cloud infrastructure in advance to ensure smooth service operation. SaaS providers are therefore generally required to provide expected virtual machine requests to ensure infrastructures are deployed and placed on standby for service. Lease requests are generally made on the basis of number of hosts, number of CPUs, amount of memory required and time. For example, a client may advise they need 10 nodes, each with 2 CPUs and 4GB of memory, from 2pm to 4pm every day. The lease request method is unsuitable for cloud service providers that provide delay sensitive services. For example, a request for bond price predictions may require intensive computation and the results may be required by a customer anywhere and at any time within seconds of the request. As such, the SaaS provider may not be able to predict the requirement of computer resources, as the SaaS provider may not know when, where or how much computing resources are needed. In order to fulfil service level agreements, maximum numbers of virtual machines need to be deployed at dispersed locations, which generally results in wasted resources. In particular, this may result in virtual machines standing idle when no service request is received.
United States Patent Publication No. 2011/0231899 describes a system that provides a cloud-computing service from a cloud-computing environment comprising a plurality of cloud-computing resources. In certain embodiments, the system comprises a management module configured to manage a cloud-computing resource of the plurality of cloud-computing resources as a cloud-computing service. Generally, the cloud-computing service performs a computer workload. The system also comprises an adapter configured to connect the cloud-computing resource to the system and translate a management instruction received from the management module into a proprietary cloud application program interface call for the cloud-computing resource. A cloud service bus is provided that is configured to route the management instruction from the management module to the adapter, and a consumption module is provided that is configured to allow a user to subscribe to the cloud-computing service. Finally, a planning module is provided that is configured to plan the cloud-computing service, and a build module is provided that is configured to build the cloud-computing service from the cloud-computing resource and publish the cloud-computing service to the consumption module.
This publication exemplifies a system that involves reactive service provisioning in which the allocation of virtual machines is based on the computational workload assigned, and in which planning, management and spawning of virtual machines is based on service requests. The scheduling policy described is not application aware, but focuses on virtual machine allocation based on hardware resource availability.
United States Patent Publication No. 2008/0304421 describes a prediction tree for estimating values of a network performance measure. Leaf nodes of the prediction tree are associated with networked computing devices and interior nodes are not necessarily representative of physical network connections. Values are assigned to edges in the prediction tree, and the network performance measure relative to two computing devices represented by two nodes of the tree is estimated by aggregating the values assigned to the edges in the path in the prediction tree joining the two nodes. Mechanisms for adding nodes representing computing devices to the prediction tree, for identifying a closest node representing a computing device in the prediction tree, for identifying a cluster of devices represented by nodes of the tree, and for rebalancing the prediction tree are provided.
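The path-aggregation idea described for the cited prediction tree can be sketched as follows. This is only an illustration, not the implementation from the cited publication: the tree shape, the node names (`vm_a`, `inner_1` and so on), the edge values, and the `estimate` helper are all hypothetical, and simple summation is assumed as the aggregation.

```python
from collections import deque

# Hypothetical prediction tree: "vm_*" leaves stand for devices, "inner_*"
# nodes are virtual. Each undirected edge carries an assumed value in ms.
EDGES = {
    ("vm_a", "inner_1"): 2.0,
    ("vm_b", "inner_1"): 3.0,
    ("inner_1", "inner_2"): 5.0,
    ("vm_c", "inner_2"): 1.5,
}


def build_adjacency(edges):
    """Turn the undirected edge map into an adjacency list."""
    adjacency = {}
    for (u, v), weight in edges.items():
        adjacency.setdefault(u, []).append((v, weight))
        adjacency.setdefault(v, []).append((u, weight))
    return adjacency


def estimate(edges, source, target):
    """Sum the edge values along the unique tree path from source to target."""
    adjacency = build_adjacency(edges)
    queue = deque([(source, 0.0)])
    visited = {source}
    while queue:
        node, total = queue.popleft()
        if node == target:
            return total
        for neighbour, weight in adjacency.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, total + weight))
    raise ValueError(f"no path between {source} and {target}")
```

With these assumed values, `estimate(EDGES, "vm_a", "vm_c")` walks `vm_a`, `inner_1`, `inner_2`, `vm_c` and adds 2.0, 5.0 and 1.5, so a pair that was never measured directly still receives an estimate.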
This publication exemplifies systems based on the well-known Euclidean Steiner Tree Model in combinatorial optimisation. The input as exemplified in this publication relates to inter-nodal network performance measurements (i.e. network latency only). Optimisation of the prediction tree does not refer to the mechanism behind network node selection. The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one exemplary technology area where some embodiments described herein may be practiced.
SUMMARY OF INVENTION
The present invention relates to a system and method for virtual machine reservation for delay sensitive service applications. In particular, the invention relates to systems and methods that leverage on service latency to predict requisite numbers of virtual machines requiring deployment to assure a specified service response time.
One aspect of the present invention provides a system for virtual machine reservation for delay sensitive service applications. The system comprises at least one servicing module configured to manage cloud computing service requests; at least one scheduling module configured to provide and deploy virtual machines for cloud computing services to fulfil the requests; at least one prediction module configured to predict service latency of unmeasured virtual machine resources; and at least one measurement module configured to trigger measurement of service latency of virtual machines which emulate the cloud computing service, characterised in that the scheduling module is configured to provide and deploy the virtual machines to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines. The said scheduling module further comprises a scheduler configured to deploy virtual machines for service emulation, to reserve virtual machines according to a given policy defined by the policy making module, and to shut down virtual machines for resource optimisation.
In another aspect the invention provides a system wherein the servicing module further comprises a service request handler configured to input a service configuration of a service type and/or a range of tolerable service response times for the service request; a planning module configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation; and a policy making module configured to receive the predicted service latency and the measured service latency, estimate total service response time and define policy to optimise virtual machine resources to be reserved.
In yet another aspect of the invention there is provided a system wherein the planning module further comprises a task categorisation module configured to classify tasks required to satisfy the service request; and a task provisioning module configured to identify at least one virtual machine available and required to satisfy the service by forming a service performance zone based on the pre-defined service response time.
In still another aspect of the invention there is provided a system wherein the policy making module further comprises at least one appointed host; a plurality of virtual machines to be reserved at the appointed host; a plurality of CPU resources to be reserved at the appointed host; and a plurality of memory resources to be reserved at the appointed host.
In a further aspect of the invention there is provided a system wherein the prediction module further comprises an estimation module configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s); and a tree construction module configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources.
In another aspect of the invention there is provided a system wherein the measurement module comprises a controller module configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feedback to the prediction module; and a repository handler module configured to retrieve historical service latency data for selected virtual machines and feedback to the prediction module.
In another aspect the invention provides a method for virtual machine reservation for delay sensitive service applications comprising receiving a service request from a client network (410); providing cloud computing resources requested (420); instantiating virtual machines service (430); emulating the service request and collecting machine service latency (440); predicting service latency of unmeasured virtual machines (450); forming at least one servicing performance zone based on the service latency of virtual machines (460); and determining virtual machine resources to be reserved (470), characterised in that the virtual machines are reserved to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines, and in that the method further comprises the steps of: identifying the number of virtual machines available in each servicing performance zone (471); determining service response times for each servicing performance zone (472); calculating the virtual resource required to fulfil the pre-defined service response time (473); deploying additional virtual machines if the pre-defined service response time is not fulfilled by the service response time within the servicing performance zone (474); and shutting down virtual machine(s) if the service response time within the servicing performance zone outperforms the pre-defined service response time (475).
In a further aspect of the invention there is provided a method wherein providing cloud computing resources comprises identifying service type and/or required range of service response time (421); determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422); receiving predicted service latency and measured service latency information (423); estimating total service response time (424); and providing a virtual machine resource to be reserved (425).
In still a further aspect of the invention there is provided a method wherein emulating the service request comprises triggering a set of cloud computing services on at least one selected virtual machine (441); measuring service latency from the selected virtual machine(s) (442); and feedback of the service latency (443).
In yet another aspect of the invention there is provided a method wherein predicting service latency of unmeasured virtual machines comprises selecting at least two virtual machines (451); receiving at least one service latency measurement (452); constructing at least one prediction tree (453); and predicting service latency of unmeasured virtual machines (454).
In another aspect of the invention there is provided a method wherein forming at least one servicing performance zone comprises retrieving service latency information (461); identifying a range of service response time (462); identifying response intervals for servicing performance zone(s) (463); and forming the performance service zone(s) on the prediction tree based on the range of service response time (464).
In yet another aspect the invention provides a method wherein calculating the virtual resource required to fulfil the pre-defined service response time comprises determining at least one appointed host (476); determining the number of virtual machines to be reserved at the appointed host (477); determining the number of CPU resources to be reserved at the appointed host (478); and determining the number of memory resources to be reserved at the appointed host (479).
The present invention consists of features and a combination of parts hereinafter fully described and illustrated in the accompanying drawings, it being understood that various changes in the details may be made without departing from the scope of the invention or sacrificing any of the advantages of the present invention.
BRIEF DESCRIPTION OF ACCOMPANYING DRAWINGS
To further clarify various aspects of some embodiments of the present invention, a more particular description of the invention will be rendered by references to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail through the accompanying drawings in which:
FIG. 1.0 illustrates the system of an embodiment of the invention.
FIG. 2.0 illustrates the servicing module of the system of Figure 1.0 in more detail.
FIG. 3.0 illustrates the prediction module, measurement module and scheduling module of the system of Figure 1.0 in more detail.
FIG. 4.0 illustrates a flow diagram of the method of an embodiment of the invention.
FIG. 5.0 illustrates step 2 of the flow diagram of Figure 4 in more detail.
FIG. 6.0 illustrates step 4 of the flow diagram of Figure 4 in more detail.
FIG. 7.0 illustrates step 5 of the flow diagram of Figure 4 in more detail.
FIG. 8.0 illustrates step 6 of the flow diagram of Figure 4 in more detail.
FIG. 9.0 illustrates step 7 of the flow diagram of Figure 4 in more detail.
FIG. 10.0 illustrates step 7.3 of the flow diagram of Figure 9 in more detail.
FIG. 11.0 illustrates diagrammatically instantiation of the virtual machine service.
FIG. 12 illustrates diagrammatically emulation and execution of the service request and prediction of service latency.
Figure 13 illustrates diagrammatically the formation of servicing performance zones.
Figure 14 illustrates diagrammatically the determination and optimisation of resource reservation.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention provides a system and method for virtual machine reservation for delay sensitive service applications. In particular, the invention relates to systems and methods that use service latency to predict the number of virtual machines that must be deployed to assure a specified service response time.
Hereinafter, this specification will describe the present invention according to the preferred embodiments. It is to be understood that limiting the description to the preferred embodiments of the invention is merely to facilitate discussion of the present invention, and it is envisioned that those skilled in the art may devise various modifications without departing from the scope of the appended claims.
Referring to Figure 1.0, the system (100) according to an embodiment of the invention is illustrated. The system (100) includes a servicing module (110) that is adapted to receive and manage a service request (120) from a client network. A scheduling module (130) is in communication with the servicing module (110) and facilitates the provision and deployment of virtual machines in order to fulfil the service requests (120). A prediction module (140) is provided and is configured to predict service latency of unmeasured virtual machine resources and communicate estimated service delay to the servicing module (110). A measurement module (150) is also provided. This is configured to trigger measurement of service latency of virtual machines which emulate the cloud computing service.
A more detailed description of the servicing module (110) may be seen with reference to Figure 2.0. More particularly, the servicing module (110) includes a service request handler (112) configured to input a service configuration for the service request. The service configuration may comprise a service type and/or a range of tolerable service response times. In addition, the servicing module includes a planning module (114) which is configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation. A policy making module (116) is also included and is configured to receive the predicted service latency and measured service latency. The policy making module (116) also estimates total service response time and defines policy to optimise virtual machine resources to be reserved. The planning module (114) includes a task categorisation module (118) that is configured to classify tasks required to satisfy the service request that has been made by the network client. It also includes a task provisioning module (119) that is configured to identify at least one virtual machine available and required to satisfy the service by forming a service performance zone based on a pre-defined service response time. This will be discussed in more detail below.
Referring to Figure 3, the scheduling module (130) comprises a scheduler (132) that is configured to deploy virtual machines for service emulation. The scheduler (132) is adapted to reserve virtual machines according to a given policy defined by the policy making module (116), and to shut down virtual machines for resource optimisation.
The prediction module (140) generally includes an estimation module (142) and a tree construction module (144). The estimation module (142) is configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s), while the tree construction module (144) is configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources. The measurement module (150) includes a controller module (152) and a repository handler module (154). The controller module (152) is configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feedback to the prediction module (140), while the repository handler module (154) is configured to retrieve historical service latency data for selected virtual machines and feedback to the prediction module (140).
Referring to Figures 4 through 8, an embodiment of the method (400) of the invention is illustrated. Generally, the invention includes the steps of receiving a service request from a client network (410), providing cloud computing resources requested (420), instantiating virtual machines service (430), emulating the service request and collecting machine service latency (440), predicting service latency of unmeasured virtual machines (450), forming at least one servicing performance zone based on the service latency of virtual machines (460) and determining virtual machine resources to be reserved (470).
The step of providing cloud computing resources (420) includes the steps of identifying service type and/or required range of service response time (421), determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422), receiving predicted service latency and measured service latency information (423), estimating total service response time (424) and providing a virtual machine resource to be reserved (425).
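As an illustration of steps (423) and (424), the following minimal Python sketch combines measured and predicted latencies into a single estimate. The specification does not state how the total is computed; the sketch assumes, purely for illustration, that the tasks of a request run one after another so that their latencies add up. The function name and the millisecond unit are hypothetical.

```python
from typing import Dict, List


def estimate_total_response_ms(
    task_vms: List[str],
    measured_ms: Dict[str, float],
    predicted_ms: Dict[str, float],
) -> float:
    """Estimate total response time for tasks that run in sequence.

    Assumption (not from the specification): latencies are additive.
    A measured latency is preferred; the predicted value is used only
    for virtual machines that have not been measured.
    """
    total = 0.0
    for vm_id in task_vms:
        if vm_id in measured_ms:
            total += measured_ms[vm_id]
        else:
            # Raises KeyError if the VM has neither a measurement nor a prediction.
            total += predicted_ms[vm_id]
    return total
```

A caller would pass the virtual machines assigned to each task, for example `estimate_total_response_ms(["vm1", "vm2"], measured, predicted)`, and compare the result with the tolerable range identified in step (421).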
The step of emulating the service request (440) includes triggering a set of cloud computing services on at least one selected virtual machine (441), measuring service latency from the selected virtual machine(s) (442), and feedback of the service latency (443).
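Steps (441) to (443) can be sketched as a small timing loop. This is an illustrative sketch only: `run_service` stands in for whatever mechanism triggers the emulated cloud service on a virtual machine, and the repeat count and use of the median are assumptions that do not come from the specification.

```python
import time
from statistics import median
from typing import Callable, Dict, Iterable


def measure_service_latency(
    vm_ids: Iterable[str],
    run_service: Callable[[str], None],
    repeats: int = 3,
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Trigger the emulated service on each selected VM and time it.

    Returns a mapping of VM id to median latency in seconds, which is
    the value fed back to the prediction module in step (443).
    """
    latencies: Dict[str, float] = {}
    for vm_id in vm_ids:
        samples = []
        for _ in range(repeats):
            start = clock()          # step (441): trigger the service
            run_service(vm_id)
            samples.append(clock() - start)  # step (442): measure latency
        latencies[vm_id] = median(samples)
    return latencies
```

The `clock` parameter exists so that the loop can be exercised with a deterministic clock; in normal use the default monotonic timer is sufficient.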
The step of predicting service latency of unmeasured virtual machines (450) includes selecting at least two virtual machines (451), receiving at least one service latency measurement (452), constructing at least one prediction tree (453) and predicting service latency of unmeasured virtual machines (454).
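The specification does not define the structure of the prediction tree. One possible reading, assumed here only for illustration, is a weighted tree in which the predicted latency between two nodes is the sum of the edge weights on the path joining them. The sketch below shows the query step (454) on such a tree; the node names, the edge weights and the way an unmeasured virtual machine is attached to the tree are all hypothetical.

```python
from typing import Dict, Optional

# node -> {neighbour: edge latency in milliseconds}
Tree = Dict[str, Dict[str, float]]


def add_edge(tree: Tree, a: str, b: str, latency_ms: float) -> None:
    """Add an undirected weighted edge to the tree."""
    tree.setdefault(a, {})[b] = latency_ms
    tree.setdefault(b, {})[a] = latency_ms


def predict_latency(tree: Tree, src: str, dst: str) -> Optional[float]:
    """Sum the edge weights along the unique tree path from src to dst.

    Returns None when dst cannot be reached from src.
    """
    stack = [(src, None, 0.0)]
    while stack:
        node, parent, dist = stack.pop()
        if node == dst:
            return dist
        for neighbour, weight in tree.get(node, {}).items():
            if neighbour != parent:  # never walk back up the edge just used
                stack.append((neighbour, node, dist + weight))
    return None
```

For example, if a client reaches an inner node `r1` in 2 ms and an unmeasured virtual machine is attached to `r1` with an assumed 4 ms edge, the predicted client-to-VM latency is the 6 ms path length.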
The step of forming at least one servicing performance zone (460) includes retrieving service latency information (461), identifying a range of service response time (462), identifying response intervals for servicing performance zone(s) (463) and forming the performance service zone(s) on the prediction tree based on the range of service response time (464).
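Steps (461) to (464) amount to grouping virtual machines by latency interval. The following sketch assumes that the response intervals are given as a sorted list of upper bounds and that a virtual machine whose latency exceeds the last bound belongs to no zone; both choices are illustrative assumptions rather than details taken from the specification.

```python
from bisect import bisect_left
from typing import Dict, List


def form_zones(latency_ms: Dict[str, float], boundaries_ms: List[float]) -> List[List[str]]:
    """Group VM ids into servicing performance zones.

    boundaries_ms must be sorted in ascending order. Zone i holds the VMs
    whose latency is above boundaries_ms[i - 1] and at most boundaries_ms[i].
    VMs slower than the last boundary are left out of every zone.
    """
    zones: List[List[str]] = [[] for _ in boundaries_ms]
    for vm_id, latency in sorted(latency_ms.items()):
        index = bisect_left(boundaries_ms, latency)
        if index < len(boundaries_ms):
            zones[index].append(vm_id)
    return zones
```

The input dictionary can mix measured latencies with predicted ones, so that unmeasured virtual machines are also placed in a zone.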
The step of determining virtual machine resources to be reserved (470) includes identifying the number of virtual machines available in each servicing performance zone (471), determining service response times for each servicing performance zone (472), calculating the virtual resource required to fulfil the pre-defined service response time (473) and, if the pre-defined service response time is not fulfilled by the service response time within the servicing performance zone, deploying additional virtual machines (474), or, if the service response time within the servicing performance zone outperforms the pre-defined service response time, shutting down virtual machine(s) (475). Calculating the virtual resource required to fulfil the pre-defined service response time (473) includes determining at least one appointed host (476), determining the number of virtual machines to be reserved at the appointed host (477), determining the number of CPU resources to be reserved at the appointed host (478), and determining the number of memory resources to be reserved at the appointed host (479).
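The decision in steps (474) and (475) and the policy items of steps (476) to (479) can be sketched as follows. The `slack_ms` margin, the per-VM CPU and memory figures and all names are hypothetical; the specification states only that virtual machines are added when the target is missed and shut down when the zone outperforms it.

```python
from dataclasses import dataclass


@dataclass
class ReservationPolicy:
    host: str        # appointed host (476)
    vm_count: int    # virtual machines to reserve (477)
    cpu_count: int   # CPU resources to reserve (478)
    memory_mb: int   # memory resources to reserve (479)


def plan_zone_action(zone_response_ms: float, target_ms: float, slack_ms: float = 0.0) -> str:
    """Decide what to do with a zone given its response time and the target."""
    if zone_response_ms > target_ms:
        return "deploy"      # step (474): target not fulfilled
    if zone_response_ms < target_ms - slack_ms:
        return "shutdown"    # step (475): zone outperforms the target
    return "hold"


def build_policy(host: str, vm_count: int, cpus_per_vm: int, memory_mb_per_vm: int) -> ReservationPolicy:
    """Derive CPU and memory totals from an assumed fixed per-VM size."""
    return ReservationPolicy(host, vm_count, vm_count * cpus_per_vm, vm_count * memory_mb_per_vm)
```

A non-zero `slack_ms` keeps the scheduler from repeatedly starting and stopping machines when the zone response time sits just below the target.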
Turning to Figures 11 through 14, alternative representations of various steps of an embodiment of the method of the invention are provided. In particular, Figure 11 illustrates diagrammatically instantiation of the virtual machine service (430). As noted, this step involves identification of an application server, request for virtual machines and subsequent instantiation of the virtual machines. Emulation and execution of the service request (440) and prediction of service latency (450) are diagrammatically illustrated in Figure 12, while Figure 13 provides a diagrammatic representation of the formation of servicing performance zones (460). Finally, Figure 14 illustrates determination and optimisation of resource reservation (470).
Unless the context requires otherwise or it is specifically stated to the contrary, integers, steps or elements of the invention recited herein as singular integers, steps or elements clearly encompass both singular and plural forms of the recited integers, steps or elements.
Throughout this specification, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated step or element or integer or group of steps or elements or integers, but not the exclusion of any other step or element or integer or group of steps, elements or integers. Thus, in the context of this specification, the term "comprising" is used in an inclusive sense and thus should be understood as meaning "including principally, but not necessarily solely".
It will be appreciated that the foregoing description has been given by way of illustrative example of the invention and that all such modifications and variations thereto as would be apparent to persons of skill in the art are deemed to fall within the broad scope and ambit of the invention as herein set forth.
Claims
1. A system for virtual machine reservation for delay sensitive service applications, the system comprising:
at least one servicing module configured to manage cloud computing service requests;
at least one scheduling module configured to provide and deploy virtual machines for cloud computing services to fulfil said requests;
at least one prediction module configured to predict service latency of unmeasured virtual machine resources; and
at least one measurement module configured to measure service latency of virtual machines which emulate said cloud computing service, characterised in that said scheduling module is configured to provide and deploy said virtual machines to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines; said scheduling module further comprises a scheduler configured to deploy virtual machines for service emulation, to reserve virtual machines according to a given policy defined by a policy making module, and to shut down virtual machines for resource optimisation.
2. A system according to claim 1, wherein said servicing module further comprises:
a service request handler configured to input a service configuration of a service type and/or a range of tolerable service response times for said service request;
a planning module configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation; and
a policy making module configured to receive said predicted service latency and said measured service latency, estimate total service response time and define policy to optimise virtual machine resources to be reserved.
3. A system according to claim 2, wherein said planning module further comprises:
a task categorisation module configured to classify tasks required to satisfy said service request; and
a task provisioning module configured to identify at least one virtual machine available and required to satisfy said service by forming a service performance zone based on said pre-defined service response time.
4. A system according to claim 2, wherein said policy making module further comprises:
at least one appointed host;
a plurality of virtual machines to be reserved at said appointed host;
a plurality of CPU resources to be reserved at said appointed host; and
a plurality of memory resources to be reserved at said appointed host.
5. A system according to claim 1, wherein said prediction module further comprises:
an estimation module configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s); and
a tree construction module configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources.
6. A system according to claim 1, wherein said measurement module comprises:
a controller module configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feedback to said prediction module; and
a repository handler module configured to retrieve historical service latency data for selected virtual machines and feedback to said prediction module.
7. A method for virtual machine reservation for delay sensitive service applications, the method comprising steps of:
receiving service request from client network (410);
providing cloud computing resources requested (420) by identifying service type and/or required range of service response time (421);
determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422); receiving predicted service latency and measured service latency information (423); estimating total service response time (424); and providing a virtual machine resource to be reserved (425);
instantiating virtual machines service (430);
emulating said service request and collecting machine service latency (440) by triggering a set of cloud computing services on at least one selected virtual machine (441); measuring service latency from said selected virtual machine(s) (442); and
providing feedback of said service latency (443);
predicting service latency of unmeasured virtual machines (450);
forming at least one servicing performance zone based on said service latency of virtual machines (460); and
determining virtual machine resources to be reserved (470) characterised in that determining virtual machine resources to be reserved to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines further comprises steps of:
identifying number of virtual machines available in each servicing performance zone (471);
determining service response times for each servicing performance zone (472);
calculating virtual resource required to fulfil said pre-defined service response time (473);
deploying additional virtual machines if said pre-defined service response time is not fulfilled by said service response time within said servicing performance zone (474); and
shutting down virtual machine(s) if said service response time within said servicing performance zone outperforms said pre-defined service response time (475).
8. A method according to claim 7, wherein predicting service latency of unmeasured virtual machines further comprises steps of:
selecting at least two virtual machines (451);
receiving at least one service latency measurement (452);
constructing at least one prediction tree (453); and
predicting service latency of unmeasured virtual machines (454).
9. A method according to claim 7, wherein forming at least one servicing performance zone further comprises steps of:
retrieving service latency information (461);
identifying a range of service response time (462);
identifying response intervals for servicing performance zone(s) (463); and
forming said performance service zone(s) on said prediction tree based on said range of service response time (464).
10. A method according to claim 7, wherein calculating the virtual resource required to fulfil said pre-defined service response time further comprises steps of:
determining at least one appointed host (476);
determining the number of virtual machines to be reserved at said appointed host (477);
determining the number of CPU resources to be reserved at said appointed host (478); and
determining the number of memory resources to be reserved at said appointed host (479).
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MYPI2012004922 | 2012-11-12 | ||
MYPI2012004922 | 2012-11-12 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014073949A1 (en) | 2014-05-15 |
Family
ID=49765632
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/MY2013/000191 WO2014073949A1 (en) | 2012-11-12 | 2013-11-11 | A system and method for virtual machine reservation for delay sensitive service applications |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2014073949A1 (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1508855A2 (en) * | 2003-08-20 | 2005-02-23 | Katana Technology, Inc. | Method and apparatus for providing virtual computing services |
US20070226449A1 (en) * | 2006-03-22 | 2007-09-27 | Nec Corporation | Virtual computer system, and physical resource reconfiguration method and program thereof |
US20080304421A1 (en) | 2007-06-07 | 2008-12-11 | Microsoft Corporation | Internet Latencies Through Prediction Trees |
US20110231899A1 (en) | 2009-06-19 | 2011-09-22 | ServiceMesh Corporation | System and method for a cloud computing abstraction layer |
US20110307889A1 (en) * | 2010-06-11 | 2011-12-15 | Hitachi, Ltd. | Virtual machine system, networking device and monitoring method of virtual machine system |
WO2012125144A1 (en) * | 2011-03-11 | 2012-09-20 | Joyent, Inc. | Systems and methods for sizing resources in a cloud-based environment |
Non-Patent Citations (2)
Title |
---|
APOSTOL, BALUTA, GORGOI, CRISTEA: "Efficient manager for virtualized resource provisioning in cloud systems", INTELLIGENT COMPUTER COMMUNICATION AND PROCESSING (ICCP), 25 August 2011 (2011-08-25), Bucharest, pages 511 - 517, XP032063552 * |
GARG, SRINIVASA, GOPALAIYENGAR, BUYYA: "SLA-based resource provisioning for heterogeneous workloads in a virtualized cloud datacenter", ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, 1 January 2011 (2011-01-01), Melbourne, XP019168277 * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105045667B (en) * | 2015-07-13 | 2018-11-30 | 中国科学院计算技术研究所 | A kind of resource pool management method for virtual machine vCPU scheduling |
CN105045667A (en) * | 2015-07-13 | 2015-11-11 | 中国科学院计算技术研究所 | Resource pool management method for vCPU scheduling of virtual machines |
US10270711B2 (en) | 2017-03-16 | 2019-04-23 | Red Hat, Inc. | Efficient cloud service capacity scaling |
CN109002342A (en) * | 2017-06-07 | 2018-12-14 | 中国科学院信息工程研究所 | A kind of computing resource orientation dispatching method and system based on OpenStack |
CN109002342B (en) * | 2017-06-07 | 2022-09-23 | 中国科学院信息工程研究所 | A method and system for oriented scheduling of computing resources based on OpenStack |
CN111782355B (en) * | 2020-06-03 | 2024-05-28 | 上海交通大学 | Cloud computing task scheduling method and system based on mixed load |
CN111782355A (en) * | 2020-06-03 | 2020-10-16 | 上海交通大学 | A cloud computing task scheduling method and system based on mixed load |
US11720425B1 (en) | 2021-05-20 | 2023-08-08 | Amazon Technologies, Inc. | Multi-tenant radio-based application pipeline processing system |
US11800404B1 (en) | 2021-05-20 | 2023-10-24 | Amazon Technologies, Inc. | Multi-tenant radio-based application pipeline processing server |
US12260271B2 (en) | 2021-05-20 | 2025-03-25 | Amazon Technologies, Inc. | Multi-tenant radio-based application pipeline processing system |
US11916999B1 (en) | 2021-06-30 | 2024-02-27 | Amazon Technologies, Inc. | Network traffic management at radio-based application pipeline processing servers |
US12236248B1 (en) | 2021-06-30 | 2025-02-25 | Amazon Technologies, Inc. | Transparent migration of radio-based applications |
WO2023192776A1 (en) * | 2022-03-31 | 2023-10-05 | Amazon Technologies, Inc. | Cloud-based orchestration of network functions |
US11985065B2 (en) | 2022-06-16 | 2024-05-14 | Amazon Technologies, Inc. | Enabling isolated virtual network configuration options for network function accelerators |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2014073949A1 (en) | A system and method for virtual machine reservation for delay sensitive service applications | |
Bauer et al. | Chameleon: A hybrid, proactive auto-scaling mechanism on a level-playing field | |
Gunasekaran et al. | Fifer: Tackling resource underutilization in the serverless era | |
Al-Ayyoub et al. | Multi-agent based dynamic resource provisioning and monitoring for cloud computing systems infrastructure | |
Maenhaut et al. | Resource management in a containerized cloud: Status and challenges | |
US9916135B2 (en) | Scaling a cloud infrastructure | |
Singh et al. | STAR: SLA-aware autonomic management of cloud resources | |
EP2615803B1 (en) | Performance interference model for managing consolidated workloads in QoS-aware clouds | |
Han et al. | Enabling cost-aware and adaptive elasticity of multi-tier cloud applications | |
KR101977726B1 (en) | APPARATUS AND METHOD FOR Virtual Desktop Services | |
EP3108619B1 (en) | Orchestration and management of services to deployed devices | |
Mahmoudi et al. | Performance modeling of metric-based serverless computing platforms | |
Lloyd et al. | Demystifying the clouds: Harnessing resource utilization models for cost effective infrastructure alternatives | |
Beltrán | BECloud: A new approach to analyse elasticity enablers of cloud services | |
Leena Sri et al. | An empirical model of adaptive cloud resource provisioning with speculation | |
Kumar et al. | Resource provisioning in cloud computing using prediction models: A survey | |
Jiang et al. | Resource allocation in contending virtualized environments through VM performance modeling and feedback | |
KR101295515B1 (en) | System and method for providing u-city service | |
Burakowski et al. | Traffic Management for Cloud Federation. | |
Wu et al. | Adaptive processing rate based container provisioning for meshed micro-services in kubernetes clouds | |
Behera et al. | Leveraging towards dynamic allocations of mist nodes for IoT-Mist-Fog-Cloud system using M/E r/1 queueing model | |
Jiang et al. | Resource allocation in contending virtualized environments through stochastic virtual machine performance modeling and feedback | |
Kübler et al. | Towards Cross-layer Monitoring of Cloud Workflows. | |
Barlaskar et al. | Using Docker Swarm with a user-centric decision-making framework for cloud application migration | |
Åsberg | Optimized Autoscaling of Cloud Native Applications |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 13805619; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 13805619; Country of ref document: EP; Kind code of ref document: A1 |