WO2014073949A1 - A system and method for virtual machine reservation for delay sensitive service applications - Google Patents

Info

Publication number
WO2014073949A1
Authority
WO
WIPO (PCT)
Prior art keywords
service
virtual machines
latency
virtual machine
module
Prior art date
Application number
PCT/MY2013/000191
Other languages
French (fr)
Inventor
Ping LIM BOON
Karuppiah ETTIKAN KANDASAMY
Kit CHONG POH
Original Assignee
Mimos Berhad
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mimos Berhad
Publication of WO2014073949A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5077 Logical partitioning of resources; Management or configuration of virtualized resources

Definitions

  • the present invention relates to a system and method for virtual machine reservation for delay sensitive service applications.
  • the invention relates to systems and methods that leverage service latency to predict the number of virtual machines that must be deployed to assure a specified service response time.
  • a request for bond price predictions may require intensive computation and the results may be required by a customer anywhere and at any time within seconds of the request.
  • the SaaS provider may not be able to predict the requirement of computer resources as the SaaS provider may not know when, where or how much computing resources are needed.
  • maximum numbers of virtual machines need to be deployed at dispersed locations which generally results in wasted resources. In particular, this may result in virtual machines standing idle when no service request is received.
  • United States Patent Publication No. 2011/0231899 describes a system that provides a cloud-computing service from a cloud-computing environment comprising a plurality of cloud-computing resources.
  • the system comprises a management module configured to manage a cloud-computing resource of the plurality of cloud-computing resources as a cloud-computing service.
  • the cloud-computing service performs a computer workload.
  • the system also comprises an adapter configured to connect the cloud-computing resource to the system and translate a management instruction received from the management module into a proprietary cloud application program interface call for the cloud-computing resource.
  • a cloud service bus is provided that is configured to route the management instruction from the management module to the adapter and a consumption module is provided that is configured to allow a user to subscribe to the cloud-computing service.
  • a planning module is provided that is configured to plan the cloud-computing service and a build module is provided that is configured to build the cloud-computing service from the cloud-computing resource and publish the cloud-computing service to the consumption module.
  • This publication exemplifies a system that involves reactive service provisioning in which the allocation of virtual machines is based on the computational workload assigned, and in which planning, management and spawning of virtual machines are based on service requests.
  • the scheduling policy described is not application aware, but focuses on virtual machine allocation based on hardware resource availability.
  • United States Patent Publication No. 2008/0304421 describes a prediction tree for estimating values of a network performance measure.
  • Leaf nodes of the prediction tree are associated with networked computing devices and interior nodes are not necessarily representative of physical network connections.
  • Values are assigned to edges in the prediction tree and the network performance measure relative to two computing devices represented by two nodes of the tree is estimated by aggregating the values assigned to the edges in the path in the prediction tree joining the two nodes.
  • Mechanisms for adding nodes representing computing devices to the prediction tree, for identifying a closest node representing a computing device in the prediction tree, for identifying a cluster of devices represented by nodes of the tree, and for rebalancing the prediction tree are provided.
  • This publication exemplifies systems based on the well-known Euclidean Steiner Tree Model in combinatorial optimisation.
  • the input as exemplified in this publication relates to inter-nodal network performance measurements (i.e. network latency only).
  • Optimisation of the prediction tree does not refer to the mechanism behind network node selection.
  • the subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one exemplary technology area where some embodiments described herein may be practiced.
  • the present invention relates to a system and method for virtual machine reservation for delay sensitive service applications.
  • the invention relates to systems and methods that leverage service latency to predict the number of virtual machines that must be deployed to assure a specified service response time.
  • the present invention provides a system for virtual machine reservation for delay sensitive service applications.
  • the system comprises at least one servicing module configured to manage cloud computing service requests; at least one scheduling module configured to provide and deploy virtual machines for cloud computing services to fulfil the requests; at least one prediction module configured to predict service latency of unmeasured virtual machine resources; and at least one measurement module configured to trigger measurement of service latency of virtual machines which emulate the cloud computing service, characterised in that the scheduling module is configured to provide and deploy the virtual machines to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines.
  • the scheduling module further comprises a scheduler configured to deploy virtual machines for service emulation, to reserve virtual machines according to a given policy defined by the policy making module, and to shut down virtual machines for resource optimisation.
  • the servicing module further comprises a service request handler configured to input a service configuration of a service type and/or a range of tolerable service response times for the service request; a planning module configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation; and a policy making module configured to receive the predicted service latency and the measured service latency, estimate total service response time and define policy to optimise virtual machine resources to be reserved.
  • the planning module further comprises a task categorisation module configured to classify tasks required to satisfy the service request; and a task provisioning module configured to identify at least one virtual machine available and required to satisfy the service by forming a service performance zone based on the pre-defined service response time.
  • the policy making module further comprises at least one appointed host; a plurality of virtual machines to be reserved at the appointed host; a plurality of CPU resources to be reserved at the appointed host; and a plurality of memory resources to be reserved at the appointed host.
  • the prediction module further comprises an estimation module configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s); and a tree construction module configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources.
  • the measurement module comprises a controller module configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feedback to the prediction module; and a repository handler module configured to retrieve historical service latency data for selected virtual machines and feedback to the prediction module.
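  • the invention provides a method for virtual machine reservation for delay sensitive service applications comprising receiving a service request from a client network (410); providing cloud computing resources requested (420); instantiating virtual machines service (430); emulating the service request and collecting machine service latency (440); predicting service latency of unmeasured virtual machines (450); forming at least one servicing performance zone based on the service latency of virtual machines (460); and determining virtual machine resources to be reserved (470), characterised in that the virtual machines are reserved to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines, the method further comprising the steps of identifying the number of virtual machines available in each servicing performance zone (471); determining service response times for each servicing performance zone (472); calculating the virtual resource required to fulfil the pre-defined service response time (473); deploying additional virtual machines if the pre-defined service response time is not fulfilled by the service response time within the servicing performance zone (474); and shutting down virtual machine(s) if the service response time within the servicing performance zone outperforms the pre-defined service response time (475).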
  • providing cloud computing resources comprises identifying service type and/or required range of service response time (421); determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422); receiving predicted service latency and measured service latency information (423); estimating total service response time (424); and providing a virtual machine resource to be reserved (425).
  • emulating the service request comprises triggering a set of cloud computing services on at least one selected virtual machine (441); measuring service latency from the selected virtual machine(s) (442); and feedback of the service latency (443).
  • predicting service latency of unmeasured virtual machines comprises selecting at least two virtual machines (451); receiving at least one service latency measurement (452); constructing at least one prediction tree (453); and predicting service latency of unmeasured virtual machines (454).
  • forming at least one servicing performance zone comprises retrieving service latency information (461); identifying a range of service response time (462); identifying response intervals for servicing performance zone(s) (463); and forming the performance service zone(s) on the prediction tree based on the range of service response time (464).
  • calculating the virtual resource required to fulfil the pre-defined service response time comprises determining at least one appointed host (476); determining the number of virtual machines to be reserved at the appointed host (477); determining the number of CPU resources to be reserved at the appointed host (478); and determining the number of memory resources to be reserved at the appointed host (479).
  • FIG. 1.0 illustrates the system of an embodiment of the invention.
  • FIG. 2.0 illustrates the servicing module of the system of Figure 1.0 in more detail.
  • FIG. 3.0 illustrates the prediction module, measurement module and scheduling module of the system of Figure 1.0 in more detail.
  • FIG. 4.0 illustrates a flow diagram of the method of an embodiment of the invention.
  • FIG. 5.0 illustrates step 2 of the flow diagram of Figure 4 in more detail.
  • FIG. 6.0 illustrates step 4 of the flow diagram of Figure 4 in more detail.
  • FIG. 7.0 illustrates step 5 of the flow diagram of Figure 4 in more detail.
  • FIG. 8.0 illustrates step 6 of the flow diagram of Figure 4 in more detail.
  • FIG. 9.0 illustrates step 7 of the flow diagram of Figure 4 in more detail.
  • FIG. 10.0 illustrates step 7.3 of the flow diagram of Figure 9 in more detail.
  • FIG. 11.0 illustrates diagrammatically instantiation of the virtual machine service.
  • FIG. 12 illustrates diagrammatically emulation and execution of the service request and prediction of service latency.
  • Figure 13 illustrates diagrammatically the formation of servicing performance zones.
  • Figure 14 illustrates diagrammatically the determination and optimisation of resource reservation.
  • the present invention provides a system and method for virtual machine reservation for delay sensitive service applications.
  • the invention relates to systems and methods that leverage service latency to predict the number of virtual machines that must be deployed to assure a specified service response time.
  • the system (100) includes a servicing module (110) that is adapted to receive and manage a service request (120) from a client network.
  • a scheduling module (130) is in communication with the servicing module (110) and facilitates the provision and deployment of virtual machines in order to fulfil the service requests (120).
  • a prediction module (140) is provided and is configured to predict service latency of unmeasured virtual machine resources and communicate estimated service delay to the servicing module (110).
  • a measurement module (150) is also provided. This is configured to trigger measurement of service latency of virtual machines which emulate the cloud computing service.
  • the servicing module (110) includes a service request handler (112) configured to input a service configuration for the service request.
  • the service configuration may comprise a service type and/or a range of tolerable service response times.
  • the servicing module includes a planning module (114) which is configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation.
  • a policy making module (116) is also included and is configured to receive the predicted service latency and measured service latency. The policy making module (116) also estimates total service response time and defines policy to optimise virtual machine resources to be reserved.
  • the planning module (114) includes a task categorisation module (118) that is configured to classify tasks required to satisfy the service request that has been made by the network client. It also includes a task provisioning module (119) that is configured to identify at least one virtual machine available and required to satisfy the service by forming a service performance zone based on a pre-defined service response time. This will be discussed in more detail below.
  • the scheduling module (130) comprises a scheduler (132) that is configured to deploy virtual machines for service emulation.
  • the scheduler (132) is adapted to reserve virtual machines according to a given policy defined by the policy making module (116), and to shut down virtual machines for resource optimisation.
  • the prediction module (140) generally includes an estimation module (142) and a tree construction module (144).
  • the estimation module (142) is configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s), while the tree construction module (144) is configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources.
  • the measurement module (150) includes a controller module (152) and a repository handler module (154).
  • the controller module (152) is configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feedback to the prediction module (140), while the repository handler module (154) is configured to retrieve historical service latency data for selected virtual machines and feedback to the prediction module (140).
  • the invention includes the steps of receiving a service request from a client network (410), providing cloud computing resources requested (420), instantiating virtual machines service (430), emulating the service request and collecting machine service latency (440), predicting service latency of unmeasured virtual machines (450), forming at least one servicing performance zone based on the service latency of virtual machines (460) and determining virtual machine resources to be reserved (470).
  • the step of providing cloud computing resources (420) includes the steps of identifying service type and/or required range of service response time (421), determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422), receiving predicted service latency and measured service latency information (423), estimating total service response time (424) and providing a virtual machine resource to be reserved (425).
  • the step of emulating the service request (440) includes triggering a set of cloud computing services on at least one selected virtual machine (441), measuring service latency from the selected virtual machine(s) (442), and feedback of the service latency (443).
  • the step of predicting service latency of unmeasured virtual machines (450) includes selecting at least two virtual machines (451), receiving at least one service latency measurement (452), constructing at least one prediction tree (453) and predicting service latency of unmeasured virtual machines (454).
  • the step of forming at least one servicing performance zone (460) includes retrieving service latency information (461), identifying a range of service response time (462), identifying response intervals for servicing performance zone(s) (463) and forming the performance service zone(s) on the prediction tree based on the range of service response time (464).
  • the step of determining virtual machine resources to be reserved (470) includes identifying the number of virtual machines available in each servicing performance zone (471), determining service response times for each servicing performance zone (472), calculating the virtual resource required to fulfil the pre-defined service response time (473) and, if the pre-defined service response time is not fulfilled by the service response time within the servicing performance zone, deploying additional virtual machines (474), or, if the service response time within the servicing performance zone outperforms the pre-defined service response time, shutting down virtual machine(s) (475).
  • Calculating the virtual resource required to fulfil the pre-defined service response time includes determining at least one appointed host (476), determining the number of virtual machines to be reserved at the appointed host (477), determining the number of CPU resources to be reserved at the appointed host (478), and determining the number of memory resources to be reserved at the appointed host (479).
  • Figure 11 illustrates diagrammatically instantiation of the virtual machine service (430). As noted, this step involves identification of an application server, request for virtual machines and subsequent instantiation of the virtual machines. Emulation and execution of the service request (440) and prediction of service latency (450) are diagrammatically illustrated in Figure 12, while Figure 13 provides a diagrammatic representation of the formation of servicing performance zones (460). Finally, Figure 14 illustrates determination and optimisation of resource reservation (470).

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention provides a system (100) including a servicing module (110) that is adapted to receive and manage a service request (120) from a client network. A scheduling module (130) is in communication with the servicing module (110) and facilitates the provision and deployment of virtual machines in order to fulfil the service requests (120). A prediction module (140) is provided and is configured to predict service latency of unmeasured virtual machine resources and communicate estimated service delay to the servicing module (110). A measurement module (150) is also provided. This is configured to measure service latency of virtual machines which emulate the cloud computing service. The scheduling module (130) is configured to provide and deploy said virtual machines to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines.

Description

A SYSTEM AND METHOD FOR VIRTUAL MACHINE RESERVATION FOR DELAY SENSITIVE SERVICE APPLICATIONS
FIELD OF INVENTION
The present invention relates to a system and method for virtual machine reservation for delay sensitive service applications. In particular, the invention relates to systems and methods that leverage service latency to predict the number of virtual machines that must be deployed to assure a specified service response time.
BACKGROUND ART
Current systems and methods implement reservation of virtual machines solely on the basis of cloud resource availability. Generally, virtual machine resources are maximised to assure service response time, which results in ongoing wasted resources. More particularly, in current systems cloud software as a service (SaaS) providers need to place maximum cloud infrastructure in advance to ensure smooth service operation. SaaS providers are therefore generally required to provide expected virtual machine requests to ensure infrastructures are deployed and placed on standby for service. Lease requests are generally made on the basis of number of hosts, number of CPUs, amount of memory required and time. For example, a client may advise they need 10 nodes, each with 2 CPUs and 4GB of memory, from 2pm to 4pm every day.
The lease request method is unsuitable for cloud service providers that provide delay sensitive services. For example, a request for bond price predictions may require intensive computation and the results may be required by a customer anywhere and at any time within seconds of the request. As such, the SaaS provider may not be able to predict the requirement of computer resources as the SaaS provider may not know when, where or how much computing resources are needed. In order to fulfil service level agreements, maximum numbers of virtual machines need to be deployed at dispersed locations, which generally results in wasted resources. In particular, this may result in virtual machines standing idle when no service request is received.
United States Patent Publication No. 2011/0231899 describes a system that provides a cloud-computing service from a cloud-computing environment comprising a plurality of cloud-computing resources. In certain embodiments, the system comprises a management module configured to manage a cloud-computing resource of the plurality of cloud-computing resources as a cloud-computing service. Generally, the cloud-computing service performs a computer workload. The system also comprises an adapter configured to connect the cloud-computing resource to the system and translate a management instruction received from the management module into a proprietary cloud application program interface call for the cloud-computing resource. A cloud service bus is provided that is configured to route the management instruction from the management module to the adapter and a consumption module is provided that is configured to allow a user to subscribe to the cloud-computing service. Finally, a planning module is provided that is configured to plan the cloud-computing service and a build module is provided that is configured to build the cloud-computing service from the cloud-computing resource and publish the cloud-computing service to the consumption module.
This publication exemplifies a system that involves reactive service provisioning in which the allocation of virtual machines is based on the computational workload assigned, and in which planning, management and spawning of virtual machines are based on service requests. The scheduling policy described is not application aware, but focuses on virtual machine allocation based on hardware resource availability.
United States Patent Publication No. 2008/0304421 describes a prediction tree for estimating values of a network performance measure. Leaf nodes of the prediction tree are associated with networked computing devices and interior nodes are not necessarily representative of physical network connections. Values are assigned to edges in the prediction tree and the network performance measure relative to two computing devices represented by two nodes of the tree is estimated by aggregating the values assigned to the edges in the path in the prediction tree joining the two nodes. Mechanisms for adding nodes representing computing devices to the prediction tree, for identifying a closest node representing a computing device in the prediction tree, for identifying a cluster of devices represented by nodes of the tree, and for rebalancing the prediction tree are provided.
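The path aggregation described above can be illustrated with a minimal Python sketch. It assumes the aggregation is a plain sum of edge values and uses a small hand-built tree; the node names (`vm_a`, `i1`, and so on), the edge values and the helper names `build_tree` and `estimate` are hypothetical and are not taken from the cited publication.

```python
from collections import deque


def build_tree(edges):
    """Build an undirected adjacency map from (node_a, node_b, value) triples."""
    adj = {}
    for a, b, value in edges:
        adj.setdefault(a, {})[b] = value
        adj.setdefault(b, {})[a] = value
    return adj


def estimate(adj, src, dst):
    """Sum the edge values along the tree path joining src and dst."""
    # Breadth-first search: a tree has exactly one simple path between two nodes.
    total = {src: 0.0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return total[node]
        for nxt, value in adj.get(node, {}).items():
            if nxt not in total:
                total[nxt] = total[node] + value
                queue.append(nxt)
    raise ValueError(f"no path between {src!r} and {dst!r}")


# Hypothetical tree: leaves are devices (vm_*), interior nodes (i*) are virtual.
EDGES = [
    ("vm_a", "i1", 2.0),
    ("vm_b", "i1", 3.0),
    ("i1", "i2", 1.5),
    ("vm_c", "i2", 4.0),
]
TREE = build_tree(EDGES)
print(estimate(TREE, "vm_a", "vm_c"))  # sums the edges vm_a-i1, i1-i2, i2-vm_c
```

Only a few pairwise measurements are needed to assign edge values; the estimate for any other pair is then read off the tree rather than measured directly.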
This publication exemplifies systems based on the well-known Euclidean Steiner Tree Model in combinatorial optimisation. The input as exemplified in this publication relates to inter-nodal network performance measurements (i.e. network latency only). Optimisation of the prediction tree does not refer to the mechanism behind network node selection.
The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one exemplary technology area where some embodiments described herein may be practiced.
SUMMARY OF INVENTION
The present invention relates to a system and method for virtual machine reservation for delay sensitive service applications. In particular, the invention relates to systems and methods that leverage service latency to predict the number of virtual machines that must be deployed to assure a specified service response time.
One aspect of the present invention provides a system for virtual machine reservation for delay sensitive service applications. The system comprises at least one servicing module configured to manage cloud computing service requests; at least one scheduling module configured to provide and deploy virtual machines for cloud computing services to fulfil the requests; at least one prediction module configured to predict service latency of unmeasured virtual machine resources; and at least one measurement module configured to trigger measurement of service latency of virtual machines which emulate the cloud computing service, characterised in that the scheduling module is configured to provide and deploy the virtual machines to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines. The scheduling module further comprises a scheduler configured to deploy virtual machines for service emulation, to reserve virtual machines according to a given policy defined by the policy making module, and to shut down virtual machines for resource optimisation.
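As a rough illustration of how a scheduler might use both latency sources, the Python sketch below merges measured and predicted service latencies and keeps the virtual machines that fall within a pre-defined response time. The function names, the VM identifiers and the rule that a measurement overrides a prediction for the same VM are assumptions made for this example; the specification does not state how the two sources are combined.

```python
def combined_latency(measured, predicted):
    """Merge latency maps (seconds per VM id).

    Assumption for this sketch: a measured value replaces a predicted one.
    """
    merged = dict(predicted)
    merged.update(measured)
    return merged


def vms_meeting_target(measured, predicted, target_s):
    """Return VM ids whose latency is within target_s, fastest first."""
    latency = combined_latency(measured, predicted)
    eligible = [vm for vm, value in latency.items() if value <= target_s]
    # Sort by latency, then by id so that ties are ordered deterministically.
    return sorted(eligible, key=lambda vm: (latency[vm], vm))


# Hypothetical data: vm2 has both a prediction and a (slower) measurement.
measured = {"vm1": 0.8, "vm2": 2.5}
predicted = {"vm2": 1.0, "vm3": 1.2, "vm4": 3.0}
print(vms_meeting_target(measured, predicted, target_s=2.0))
```

In this example `vm2` is excluded because its measured latency exceeds the target, even though its predicted latency would have qualified.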
In another aspect the invention provides a system wherein the servicing module further comprises a service request handler configured to input a service configuration of a service type and/or a range of tolerable service response times for the service request; a planning module configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation; and a policy making module configured to receive the predicted service latency and the measured service latency, estimate total service response time and define policy to optimise virtual machine resources to be reserved.
In yet another aspect of the invention there is provided a system wherein the planning module further comprises a task categorisation module configured to classify tasks required to satisfy the service request; and a task provisioning module configured to identify at least one virtual machine available and required to satisfy the service by forming a service performance zone based on the pre-defined service response time.
In still another aspect of the invention there is provided a system wherein the policy making module further comprises at least one appointed host; a plurality of virtual machines to be reserved at the appointed host; a plurality of CPU resources to be reserved at the appointed host; and a plurality of memory resources to be reserved at the appointed host.
In a further aspect of the invention there is provided a system wherein the prediction module further comprises an estimation module configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s); and a tree construction module configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources.
In another aspect of the invention there is provided a system wherein the measurement module comprises a controller module configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feedback to the prediction module; and a repository handler module configured to retrieve historical service latency data for selected virtual machines and feedback to the prediction module.
In another aspect the invention provides a method for virtual machine reservation for delay sensitive service applications comprising receiving a service request from a client network (410); providing cloud computing resources requested (420); instantiating virtual machines service (430); emulating the service request and collecting machine service latency (440); predicting service latency of unmeasured virtual machines (450); forming at least one servicing performance zone based on the service latency of virtual machines (460); and determining virtual machine resources to be reserved (470), characterised in that the virtual machines are reserved to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines, the method further comprising the steps of: identifying the number of virtual machines available in each servicing performance zone (471); determining service response times for each servicing performance zone (472); calculating the virtual resource required to fulfil the pre-defined service response time (473); deploying additional virtual machines if the pre-defined service response time is not fulfilled by the service response time within the servicing performance zone (474); and shutting down virtual machine(s) if the service response time within the servicing performance zone outperforms the pre-defined service response time (475).
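The deploy-or-shut-down decision of steps (474) and (475) can be sketched as a single comparison per servicing performance zone. This is a minimal Python sketch under stated assumptions: the function name `adjust_reservation`, the `step` and `min_reserved` parameters and the "hold" outcome for an exact match are hypothetical additions and do not appear in the specification.

```python
def adjust_reservation(zone_response_s, target_s, reserved, step=1, min_reserved=1):
    """Decide the next reservation for one servicing performance zone.

    Returns a tuple (action, new_reserved_count).
    """
    if zone_response_s > target_s:
        # Target not fulfilled: deploy additional virtual machines (474).
        return "deploy", reserved + step
    if zone_response_s < target_s and reserved > min_reserved:
        # Zone outperforms the target: shut down virtual machine(s) (475).
        return "shutdown", max(min_reserved, reserved - step)
    # Assumption: keep the current reservation when the target is met exactly
    # or when the floor of min_reserved has already been reached.
    return "hold", reserved


print(adjust_reservation(zone_response_s=2.5, target_s=2.0, reserved=3))
print(adjust_reservation(zone_response_s=1.0, target_s=2.0, reserved=3))
```

A real policy would likely add hysteresis so that a zone hovering near the target does not alternate between deploying and shutting down; that refinement is outside what the text describes.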
In a further aspect of the invention there is provided a method wherein providing cloud computing resources comprises identifying service type and/or required range of service response time (421); determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422); receiving predicted service latency and measured service latency information (423); estimating total service response time (424); and providing a virtual machine resource to be reserved (425).
In still a further aspect of the invention there is provided a method wherein emulating the service request comprises triggering a set of cloud computing services on at least one selected virtual machine (441); measuring service latency from the selected virtual machine(s) (442); and feedback of the service latency (443).
In yet another aspect of the invention there is provided a method wherein predicting service latency of unmeasured virtual machines comprises selecting at least two virtual machines (451); receiving at least one service latency measurement (452); constructing at least one prediction tree (453); and predicting service latency of unmeasured virtual machines (454).
In another aspect of the invention there is provided a method wherein forming at least one servicing performance zone comprises retrieving service latency information (461); identifying a range of service response time (462); identifying response intervals for servicing performance zone(s) (463); and forming the performance service zone(s) on the prediction tree based on the range of service response time (464).
In yet another aspect the invention provides a method wherein calculating the virtual resource required to fulfil the pre-defined service response time comprises determining at least one appointed host (476); determining the number of virtual machines to be reserved at the appointed host (477); determining the number of CPU resources to be reserved at the appointed host (478); and determining the number of memory resources to be reserved at the appointed host (479).
The present invention consists of features and a combination of parts hereinafter fully described and illustrated in the accompanying drawings, it being understood that various changes in the details may be made without departing from the scope of the invention or sacrificing any of the advantages of the present invention.
BRIEF DESCRIPTION OF ACCOMPANYING DRAWINGS
To further clarify various aspects of some embodiments of the present invention, a more particular description of the invention will be rendered by references to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail through the accompanying drawings in which:
FIG. 1.0 illustrates the system of an embodiment of the invention.
FIG. 2.0 illustrates the servicing module of the system of Figure 1.0 in more detail.
FIG. 3.0 illustrates the prediction module, measurement module and scheduling module of the system of Figure 1.0 in more detail.
FIG. 4.0 illustrates a flow diagram of the method of an embodiment of the invention.
FIG. 5.0 illustrates step 2 of the flow diagram of Figure 4 in more detail.
FIG. 6.0 illustrates step 4 of the flow diagram of Figure 4 in more detail.
FIG. 7.0 illustrates step 5 of the flow diagram of Figure 4 in more detail.
FIG. 8.0 illustrates step 6 of the flow diagram of Figure 4 in more detail.
FIG. 9.0 illustrates step 7 of the flow diagram of Figure 4 in more detail.
FIG. 10.0 illustrates step 7.3 of the flow diagram of Figure 9 in more detail.
FIG. 11.0 illustrates diagrammatically instantiation of the virtual machine service.
FIG. 12 illustrates diagrammatically emulation and execution of the service request and prediction of service latency.
Figure 13 illustrates diagrammatically the formation of servicing performance zones.
Figure 14 illustrates diagrammatically the determination and optimisation of resource reservation.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention provides a system and method for virtual machine reservation for delay sensitive service applications. In particular, the invention relates to systems and methods that leverage on service latency to predict requisite numbers of virtual machines requiring deployment to assure a specified service response time.
Hereinafter, this specification will describe the present invention according to the preferred embodiments. It is to be understood that limiting the description to the preferred embodiments of the invention is merely to facilitate discussion of the present invention, and it is envisioned that modifications may be made without departing from the scope of the appended claims.
Referring to Figure 1.0, the system (100) according to an embodiment of the invention is illustrated. The system (100) includes a servicing module (110) that is adapted to receive and manage a service request (120) from a client network. A scheduling module (130) is in communication with the servicing module (110) and facilitates the provision and deployment of virtual machines in order to fulfil the service requests (120). A prediction module (140) is provided and is configured to predict service latency of unmeasured virtual machine resources and communicate estimated service delay to the servicing module (110). A measurement module (150) is also provided. This is configured to trigger measurement of service latency of virtual machines which emulate the cloud computing service.
A more detailed description of the servicing module (110) may be seen with reference to Figure 2.0. More particularly, the servicing module (110) includes a service request handler (112) configured to input a service configuration for the service request. The service configuration may comprise a service type and/or a range of tolerable service response times. In addition, the servicing module includes a planning module (114) which is configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation. A policy making module (116) is also included and is configured to receive the predicted service latency and measured service latency. The policy making module (116) also estimates total service response time and defines policy to optimise virtual machine resources to be reserved. The planning module (114) includes a task categorisation module (118) that is configured to classify tasks required to satisfy the service request that has been made by the network client. It also includes a task provisioning module (119) that is configured to identify at least one virtual machine available and required to satisfy the service by forming a service performance zone based on a pre-defined service response time. This will be discussed in more detail below.
Referring to Figure 3, the scheduling module (130) comprises a scheduler (132) that is configured to deploy virtual machines for service emulation. The scheduler (132) is adapted to reserve virtual machines according to a given policy defined by the policy making module (116), and to shut down virtual machines for resource optimisation.
The prediction module (140) generally includes an estimation module (142) and a tree construction module (144). The estimation module (142) is configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s), while the tree construction module (144) is configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources. The measurement module (150) includes a controller module (152) and a repository handler module (154). The controller module (152) is configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feedback to the prediction module (140), while the repository handler module (154) is configured to retrieve historical service latency data for selected virtual machines and feedback to the prediction module (140).
Referring to Figures 4 through 8, an embodiment of the method (400) of the invention is illustrated. Generally, the invention includes the steps of receiving a service request from a client network (410), providing cloud computing resources requested (420), instantiating virtual machines service (430), emulating the service request and collecting machine service latency (440), predicting service latency of unmeasured virtual machines (450), forming at least one servicing performance zone based on the service latency of virtual machines (460) and determining virtual machine resources to be reserved (470).
The step of providing cloud computing resources (420) includes the steps of identifying service type and/or required range of service response time (421), determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422), receiving predicted service latency and measured service latency information (423), estimating total service response time (424) and providing a virtual machine resource to be reserved (425).
The step of emulating the service request (440) includes triggering a set of cloud computing services on at least one selected virtual machine (441), measuring service latency from the selected virtual machine(s) (442), and feedback of the service latency (443).
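By way of illustration only, steps 441 to 443 might be sketched as follows. The sketch assumes that each selected virtual machine is represented by a callable that runs the emulated service and returns on completion. The names `measure_service_latency` and `emulate_and_collect`, and the use of repeated wall-clock timing, are assumptions of this sketch and are not taken from the specification.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def measure_service_latency(emulate: Callable[[], None], repeats: int = 3) -> float:
    """Return the mean wall-clock latency, in seconds, of an emulated service call."""
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        emulate()  # hypothetical stand-in for the emulated cloud computing service
        samples.append(time.perf_counter() - start)
    return mean(samples)


def emulate_and_collect(selected_vms: Dict[str, Callable[[], None]]) -> Dict[str, float]:
    """Trigger the service on each selected VM (441), measure it (442), and
    return the measurements as feedback for the prediction step (443)."""
    return {vm_id: measure_service_latency(run) for vm_id, run in selected_vms.items()}
```

In such a sketch the returned mapping of virtual machine identifier to latency would play the role of the feedback passed to the prediction module (140).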
The step of predicting service latency of unmeasured virtual machines (450) includes selecting at least two virtual machines (451), receiving at least one service latency measurement (452), constructing at least one prediction tree (453) and predicting service latency of unmeasured virtual machines (454).
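The specification does not detail how the prediction tree is constructed, so the following is only a simplified stand-in and not the tree construction of the invention. It assumes, purely for illustration, that each virtual machine is located by a path in a topology tree (rack, host, virtual machine) and that the latency of an unmeasured virtual machine may be approximated by the mean latency of the measured virtual machines under its nearest shared ancestor. The path layout and the function name `predict_latency` are hypothetical.

```python
from statistics import mean
from typing import Dict, Optional, Tuple

# Assumed placement of a VM in a topology tree, e.g. ("rack1", "host1", "vm3").
Path = Tuple[str, ...]


def predict_latency(target: Path, measured: Dict[Path, float]) -> Optional[float]:
    """Predict latency for `target` from measured VMs sharing the longest path prefix.

    Walks from the target's own position up towards the root and returns the
    mean latency of the first non-empty group of measured VMs found.
    """
    for depth in range(len(target), -1, -1):
        prefix = target[:depth]
        samples = [lat for path, lat in measured.items() if path[:depth] == prefix]
        if samples:
            return mean(samples)
    return None  # no measurements are available at all
```

Under these assumptions, an unmeasured virtual machine on a host that already carries measured virtual machines inherits their mean latency, and the estimate degrades gracefully to rack level and then to the whole pool.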
The step of forming at least one servicing performance zone (460) includes retrieving service latency information (461), identifying a range of service response time (462), identifying response intervals for servicing performance zone(s) (463) and forming the performance service zone(s) on the prediction tree based on the range of service response time (464).
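Steps 461 to 464 might be sketched as follows, as an illustration only. The sketch assumes that zones are equal-width latency intervals spanning the identified range of service response time, and it groups virtual machines directly by latency rather than on a prediction tree. The function name `form_zones`, the millisecond unit and the equal-width intervals are assumptions of this sketch.

```python
from typing import Dict, List, Tuple

Zone = Tuple[float, float]  # (lower bound, upper bound) in milliseconds


def form_zones(latency_ms: Dict[str, float], low_ms: float, high_ms: float,
               zone_count: int) -> Dict[Zone, List[str]]:
    """Group VMs into equal-width latency intervals within [low_ms, high_ms]."""
    if zone_count < 1 or high_ms <= low_ms:
        raise ValueError("need zone_count >= 1 and high_ms > low_ms")
    width = (high_ms - low_ms) / zone_count
    bounds = [(low_ms + i * width, low_ms + (i + 1) * width) for i in range(zone_count)]
    zones: Dict[Zone, List[str]] = {b: [] for b in bounds}
    for vm_id in sorted(latency_ms):
        latency = latency_ms[vm_id]
        if latency < low_ms or latency > high_ms:
            continue  # outside the tolerable range, so not placed in any zone
        # A latency exactly at the upper limit falls into the last zone.
        index = min(int((latency - low_ms) / width), zone_count - 1)
        zones[bounds[index]].append(vm_id)
    return zones
```

The latencies supplied to such a function would be the measured and predicted service latencies retrieved in step 461.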
The step of determining virtual machine resources to be reserved (470) includes identifying the number of virtual machines available in each servicing performance zone (471), determining service response times for each servicing performance zone (472), calculating the virtual resource required to fulfil the pre-defined service response time (473) and, if the pre-defined service response time is not fulfilled by the service response time within the servicing performance zone, deploying additional virtual machines (474), or, if the service response time within the servicing performance zone outperforms the pre-defined service response time, shutting down virtual machine(s) (475). Calculating the virtual resource required to fulfil the pre-defined service response time (473) includes determining at least one appointed host (476), determining the number of virtual machines to be reserved at the appointed host (477), determining the number of CPU resources to be reserved at the appointed host (478) and determining the number of memory resources to be reserved at the appointed host (479).
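The decision in steps 471 to 479 might be sketched as follows. The specification gives no formula for the required resource, so this sketch assumes, as a deliberate simplification, that the response time of a zone scales inversely with the number of virtual machines serving it. All names, the per-virtual-machine CPU and memory figures, and the scaling rule are hypothetical.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class ZoneStatus:
    vm_count: int            # VMs available in the zone (471)
    response_time_ms: float  # service response time of the zone (472)


@dataclass
class HostReservation:
    host: str       # appointed host (476)
    vms: int        # VMs to reserve (477)
    cpus: int       # CPU resources to reserve (478)
    memory_mb: int  # memory resources to reserve (479)


def required_vms(zone: ZoneStatus, target_ms: float) -> int:
    """Step 473 under the assumed inverse-scaling rule."""
    if target_ms <= 0:
        raise ValueError("target_ms must be positive")
    return max(1, ceil(zone.vm_count * zone.response_time_ms / target_ms))


def reservation_delta(zone: ZoneStatus, target_ms: float) -> int:
    """Positive: deploy that many VMs (474). Negative: shut that many down (475)."""
    return required_vms(zone, target_ms) - zone.vm_count


def reserve_on_host(host: str, vms: int, cpus_per_vm: int,
                    memory_mb_per_vm: int) -> HostReservation:
    """Derive CPU and memory totals from an assumed fixed per-VM size."""
    return HostReservation(host, vms, vms * cpus_per_vm, vms * memory_mb_per_vm)
```

For example, under these assumptions a zone of four virtual machines responding in 150 ms against a 100 ms target would call for two additional virtual machines, while the same zone responding in 50 ms would allow two to be shut down.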
Turning to Figures 11 through 14, alternative representations of various steps of an embodiment of the method of the invention are provided. In particular, Figure 11 illustrates diagrammatically instantiation of the virtual machine service (430). As noted, this step involves identification of an application server, request for virtual machines and subsequent instantiation of the virtual machines. Emulation and execution of the service request (440) and prediction of service latency (450) are diagrammatically illustrated in Figure 12, while Figure 13 provides a diagrammatic representation of the formation of servicing performance zones (460). Finally, Figure 14 illustrates determination and optimisation of resource reservation (470).
Unless the context requires otherwise or specifically stated to the contrary, integers, steps or elements of the invention recited herein as singular integers, steps or elements clearly encompass both singular and plural forms of the recited integers, steps or elements.
Throughout this specification, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated step or element or integer or group of steps or elements or integers, but not the exclusion of any other step or element or integer or group of steps, elements or integers. Thus, in the context of this specification, the term "comprising" is used in an inclusive sense and thus should be understood as meaning "including principally, but not necessarily solely".
It will be appreciated that the foregoing description has been given by way of illustrative example of the invention and that all such modifications and variations thereto as would be apparent to persons of skill in the art are deemed to fall within the broad scope and ambit of the invention as herein set forth.

Claims

1. A system for virtual machine reservation for delay sensitive service applications, the system comprising:
at least one servicing module configured to manage cloud computing service requests;
at least one scheduling module configured to provide and deploy virtual machines for cloud computing services to fulfil said requests;
at least one prediction module configured to predict service latency of unmeasured virtual machine resources; and
at least one measurement module configured to measure service latency of virtual machines which emulate said cloud computing service, characterised in that said scheduling module is configured to provide and deploy said virtual machines to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines; said scheduling module further comprises a scheduler configured to deploy virtual machines for service emulation, to reserve virtual machines according to a given policy defined by a policy making module, and to shut down virtual machines for resource optimisation.
2. A system according to claim 1, wherein said servicing module further comprises:
a service request handler configured to input a service configuration of a service type and/or a range of tolerable service response times for said service request;
a planning module configured to identify a set of cloud computing services to be deployed on virtual machines for service latency computation; and
a policy making module configured to receive said predicted service latency and said measured service latency, estimate total service response time and define policy to optimise virtual machine resources to be reserved.
3. A system according to claim 2, wherein said planning module further comprises:
a task categorisation module configured to classify tasks required to satisfy said service request; and
a task provisioning module configured to identify at least one virtual machine available and required to satisfy said service by forming a service performance zone based on said pre-defined service response time.
4. A system according to claim 2, wherein said policy making module further comprises:
at least one appointed host;
a plurality of virtual machines to be reserved at said appointed host;
a plurality of CPU resources to be reserved at said appointed host; and
a plurality of memory resources to be reserved at said appointed host.
5. A system according to claim 1, wherein said prediction module further comprises:
an estimation module configured to select at least one virtual machine for service latency measurement and receive obtained service latency measurement(s); and
a tree construction module configured to construct at least one prediction tree and predict service latency of unmeasured virtual machine resources.
6. A system according to claim 1, wherein said measurement module comprises:
a controller module configured to request service emulation on virtual machines, trigger service latency measurement on selected virtual machines, receive measured service latency and feedback to said prediction module; and
a repository handler module configured to retrieve historical service latency data for selected virtual machines and feedback to said prediction module.
7. A method for virtual machine reservation for delay sensitive service applications, the method comprising steps of:
receiving service request from client network (410);
providing cloud computing resources requested (420) by identifying service type and/or required range of service response time (421);
determining a set of cloud computing services to be deployed on at least one virtual machine for service latency computation (422); receiving predicted service latency and measured service latency information (423); estimating total service response time (424); and providing a virtual machine resource to be reserved (425);
instantiating virtual machines service (430);
emulating said service request and collecting machine service latency (440) by triggering a set of cloud computing services on at least one selected virtual machine (441); measuring service latency from said selected virtual machine(s) (442); and
providing feedback of said service latency (443);
predicting service latency of unmeasured virtual machines (450);
forming at least one servicing performance zone based on said service latency of virtual machines (460); and
determining virtual machine resources to be reserved (470), characterised in that determining virtual machine resources to be reserved to satisfy a pre-defined service response time based on predicted service latency of unmeasured virtual machine resources and measured service latency of virtual machines further comprises steps of:
identifying number of virtual machines available in each servicing performance zone (471);
determining service response times for each servicing performance zone (472);
calculating virtual resource required to fulfil said pre-defined service response time (473);
deploying additional virtual machines if said pre-defined service response time is not fulfilled by said service response time within said servicing performance zone (474); and
shutting down virtual machine(s) if said service response time within said servicing performance zone outperforms said pre-defined service response time (475).
8. A method according to claim 7, wherein predicting service latency of unmeasured virtual machines further comprises steps of:
selecting at least two virtual machines (451);
receiving at least one service latency measurement (452);
constructing at least one prediction tree (453); and
predicting service latency of unmeasured virtual machines (454).
9. A method according to claim 7, wherein forming at least one servicing performance zone further comprises steps of:
retrieving service latency information (461);
identifying a range of service response time (462);
identifying response intervals for servicing performance zone(s) (463); and
forming said performance service zone(s) on said prediction tree based on said range of service response time (464).
10. A method according to claim 7, wherein calculating the virtual resource required to fulfil said pre-defined service response time further comprises steps of:
determining at least one appointed host (476);
determining the number of virtual machines to be reserved at said appointed host (477);
determining the number of CPU resources to be reserved at said appointed host (478); and
determining the number of memory resources to be reserved at said appointed host (479).
PCT/MY2013/000191 2012-11-12 2013-11-11 A system and method for virtual machine reservation for delay sensitive service applications WO2014073949A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2012004922 2012-11-12

Publications (1)

Publication Number Publication Date
WO2014073949A1 (en) 2014-05-15

Family

ID=49765632

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2013/000191 WO2014073949A1 (en) 2012-11-12 2013-11-11 A system and method for virtual machine reservation for delay sensitive service applications

Country Status (1)

Country Link
WO (1) WO2014073949A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045667A (en) * 2015-07-13 2015-11-11 中国科学院计算技术研究所 Resource pool management method for vCPU scheduling of virtual machines
CN109002342A (en) * 2017-06-07 2018-12-14 中国科学院信息工程研究所 A kind of computing resource orientation dispatching method and system based on OpenStack
US10270711B2 (en) 2017-03-16 2019-04-23 Red Hat, Inc. Efficient cloud service capacity scaling
CN111782355A (en) * 2020-06-03 2020-10-16 上海交通大学 A cloud computing task scheduling method and system based on mixed load
US11720425B1 (en) 2021-05-20 2023-08-08 Amazon Technologies, Inc. Multi-tenant radio-based application pipeline processing system
WO2023192776A1 (en) * 2022-03-31 2023-10-05 Amazon Technologies, Inc. Cloud-based orchestration of network functions
US11800404B1 (en) 2021-05-20 2023-10-24 Amazon Technologies, Inc. Multi-tenant radio-based application pipeline processing server
US11916999B1 (en) 2021-06-30 2024-02-27 Amazon Technologies, Inc. Network traffic management at radio-based application pipeline processing servers
US11985065B2 (en) 2022-06-16 2024-05-14 Amazon Technologies, Inc. Enabling isolated virtual network configuration options for network function accelerators
US12236248B1 (en) 2021-06-30 2025-02-25 Amazon Technologies, Inc. Transparent migration of radio-based applications

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1508855A2 (en) * 2003-08-20 2005-02-23 Katana Technology, Inc. Method and apparatus for providing virtual computing services
US20070226449A1 (en) * 2006-03-22 2007-09-27 Nec Corporation Virtual computer system, and physical resource reconfiguration method and program thereof
US20080304421A1 (en) 2007-06-07 2008-12-11 Microsoft Corporation Internet Latencies Through Prediction Trees
US20110231899A1 (en) 2009-06-19 2011-09-22 ServiceMesh Corporation System and method for a cloud computing abstraction layer
US20110307889A1 (en) * 2010-06-11 2011-12-15 Hitachi, Ltd. Virtual machine system, networking device and monitoring method of virtual machine system
WO2012125144A1 (en) * 2011-03-11 2012-09-20 Joyent, Inc. Systems and methods for sizing resources in a cloud-based environment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
APOSTOL, BALUTA, GORGOI, CRISTEA: "Efficient manager for virtualized resource provisioning in cloud systems", INTELLIGENT COMPUTER COMMUNICATION AND PROCESSING (ICCP), 25 August 2011 (2011-08-25), Bucharest, pages 511 - 517, XP032063552 *
GARG, SRINIVASA, GOPALAIYENGAR, BUYYA: "SLA-based resource provisioning for heterogeneous workloads in a virtualized cloud datacenter", ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, 1 January 2011 (2011-01-01), Melbourne, XP019168277 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045667B (en) * 2015-07-13 2018-11-30 中国科学院计算技术研究所 A kind of resource pool management method for virtual machine vCPU scheduling
CN105045667A (en) * 2015-07-13 2015-11-11 中国科学院计算技术研究所 Resource pool management method for vCPU scheduling of virtual machines
US10270711B2 (en) 2017-03-16 2019-04-23 Red Hat, Inc. Efficient cloud service capacity scaling
CN109002342A (en) * 2017-06-07 2018-12-14 中国科学院信息工程研究所 A kind of computing resource orientation dispatching method and system based on OpenStack
CN109002342B (en) * 2017-06-07 2022-09-23 中国科学院信息工程研究所 A method and system for oriented scheduling of computing resources based on OpenStack
CN111782355B (en) * 2020-06-03 2024-05-28 上海交通大学 Cloud computing task scheduling method and system based on mixed load
CN111782355A (en) * 2020-06-03 2020-10-16 上海交通大学 A cloud computing task scheduling method and system based on mixed load
US11720425B1 (en) 2021-05-20 2023-08-08 Amazon Technologies, Inc. Multi-tenant radio-based application pipeline processing system
US11800404B1 (en) 2021-05-20 2023-10-24 Amazon Technologies, Inc. Multi-tenant radio-based application pipeline processing server
US12260271B2 (en) 2021-05-20 2025-03-25 Amazon Technologies, Inc. Multi-tenant radio-based application pipeline processing system
US11916999B1 (en) 2021-06-30 2024-02-27 Amazon Technologies, Inc. Network traffic management at radio-based application pipeline processing servers
US12236248B1 (en) 2021-06-30 2025-02-25 Amazon Technologies, Inc. Transparent migration of radio-based applications
WO2023192776A1 (en) * 2022-03-31 2023-10-05 Amazon Technologies, Inc. Cloud-based orchestration of network functions
US11985065B2 (en) 2022-06-16 2024-05-14 Amazon Technologies, Inc. Enabling isolated virtual network configuration options for network function accelerators

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13805619

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13805619

Country of ref document: EP

Kind code of ref document: A1