A load-balance-preserving dynamic scaling method and system for a distributed cache
Technical field
The present invention relates to a dynamic scaling method for distributed caches and a corresponding system, and in particular to a distributed-cache dynamic scaling method and system that preserve load balance. It belongs to the field of software technology.
Background technology
In cloud computing environments, the number of users and the volume of network traffic have grown explosively. Providing solid support for high-capacity, mission-critical transactional applications on inexpensive, standardized hardware and software platforms has become a problem faced by many enterprises. The server-side bottleneck usually appears at the database, and distributed caching technology was introduced to further address this problem. A distributed cache shortens the distance between clustered object data and the applications that use them; as a key technology for accelerating data access and providing distributed data sharing, it plays an important role in improving a system's ability to scale and in safeguarding system reliability.
Forrester pointed out in a 2010 technical report that scalability is one of the key characteristics of a distributed caching system. The present invention divides cache-system scaling into two classes: static scaling and dynamic scaling. Static scaling means that, when cache nodes are added or removed, the running system must be stopped, the configuration information updated according to the node change, and the whole system restarted. Dynamic scaling, by contrast, is online: the system automatically completes the addition and deletion of cache nodes, the migration of cached data, and the update of configuration information in response to load changes, i.e. self-management and self-healing without human intervention. Traditional caching systems aim mainly at improving data access speed and are mostly applied to page caching, processing caching and data-object caching. Beyond accelerating application access, current caching systems are also used to store state data and to provide high-availability support, so data consistency and continuous service availability must be guaranteed during scaling. Most existing distributed caching systems, such as OSCache, Memcached and Terracotta EX, do not support dynamic scaling: the system must be restarted when its scale is changed, causing service interruption and data loss, which cannot meet application demands.
Realizing a dynamically scalable distributed caching system, however, faces two challenges. 1) During dynamic scaling, part of the cached data must be migrated between nodes. Interruption of the cache service during migration causes request failures or the loss of important business and state data, and the data inconsistency produced when multiple users concurrently operate on several replicas of the same data likewise brings losses to the user. In addition, data migration must take the migration overhead and the current system load into account: the migration progress must be controlled reasonably, lest excessive overhead overload the system and collapse the service. How to guarantee continuous availability of the cache service and data consistency during scaling is therefore a major challenge. 2) A caching system may contain hot-spot data regions, i.e. regions with relatively high access load, whose location usually depends on the characteristics of the application and the users' access patterns. Hot-spot regions easily become system bottlenecks and in turn affect the availability of the cache service, so rebalancing of hot-spot data must be considered during scaling.
In a distributed caching system, the data-balancing strategy distributes data across the cache server nodes and, when data are fetched, retrieves them from the node determined by computation; this avoids flooding-style searches and improves lookup efficiency. An early popular strategy is the remainder (modulo) algorithm, which is simple to implement: its core idea is to distribute data according to the remainder of the key modulo the number of server nodes. Its deficiency is that when the set of server nodes changes (e.g. on node failure or scaling), the server node to which a key is mapped changes drastically: only a small portion of the data can still be found on its former node, the rest having been reassigned to new nodes, so the hit rate drops sharply.
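The hit-rate collapse described above can be seen in a few lines. The sketch below (illustrative, not part of the patent) places keys with the remainder algorithm and counts how many are remapped when a fifth server joins a four-server cluster:

```python
def mod_place(key: int, n_servers: int) -> int:
    """Remainder algorithm: the key is stored on server key mod N."""
    return key % n_servers

keys = range(10_000)
before = {k: mod_place(k, 4) for k in keys}   # four cache servers
after = {k: mod_place(k, 5) for k in keys}    # a fifth server is added

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys changed servers")  # prints "80% ..."
```

Only keys whose value mod 20 lies in {0, 1, 2, 3} keep their server, so four fifths of the cache goes cold after a single node change.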
To address this problem, scholars at the Massachusetts Institute of Technology (D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levine, and D. Lewin. Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web. In Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, STOC '97, pp. 654-663, 1997) proposed in 1997 the now well-known consistent hashing algorithm, which is still widely used in fields such as application-layer multicast, P2P and caching platforms. Its basic idea is to define the output range of the hash function as a fixed ring-shaped space, called the hash ring; each server node is mapped to the value of a certain random position on the ring. The key of each cached data item is likewise mapped to a position on the hash ring; moving clockwise along the ring, the first position representing a server node identifies the cache server on which the item is stored. Consistent hashing solves the data-partitioning and routing problems well and minimizes the impact of node changes on the data distribution; it is illustrated in Fig. 1. The literature (D. Hastorun, M. Jampani, G. Kakulapati, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels, "Dynamo: Amazon's highly available key-value store," In Proceedings of the ACM Symposium on Operating Systems Principles (SOSP '07), pp. 205-220, 2007) analyzes consistent hashing in depth and identifies two deficiencies: 1) the random distribution of service nodes on the hash ring causes uneven distribution of data and load; 2) the algorithm ignores the performance differences between service nodes. In real scenarios, the influence of hot-spot partitions on the system cannot be completely eliminated by improving the distributed hash algorithm alone.
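A minimal consistent-hash ring can be sketched as follows; the choice of MD5 as the hash function and the node names are assumptions for illustration, since the algorithm only requires a fixed ring-shaped hash space:

```python
import bisect
import hashlib

def h(s: str) -> int:
    """Hash a string onto the ring (here, the full MD5 output space)."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        # Each node sits at the pseudo-random ring position given by its hash.
        self.ring = sorted((h(n), n) for n in nodes)

    def lookup(self, key: str) -> str:
        # The key is stored on the first node clockwise from its position.
        pos = h(key)
        idx = bisect.bisect_right([p for p, _ in self.ring], pos)
        return self.ring[idx % len(self.ring)][1]  # wrap around the ring

ring = HashRing(["cache-a", "cache-b", "cache-c"])
node = ring.lookup("user:42")
```

Removing any node other than a key's owner leaves that key's placement unchanged, which is exactly the "minimal disruption" property the text describes.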
Chiu et al. of Washington State University (D. Chiu, A. Shetty, G. Agrawal. Elastic Cloud Caches for Accelerating Service-Oriented Computations. In Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC '10), pp. 1-11, 2010) studied the dynamic scaling problem of caching systems and proposed a data migration strategy based on a greedy method. When a user request inserts data into cache node n and that node's free memory is insufficient, a data migration event is triggered and the migration algorithm moves part of node n's data to other nodes, preferring the node with the lightest current load as the destination. If migration would overflow the destination node's memory, a new node is allocated; when migration completes, the hash mapping table is updated synchronously. For node shrinking, the two nodes with the lightest current load are chosen and their data merged. This work has two main deficiencies: first, the scaling process considers neither heterogeneous nodes nor the location of hot spots, so hot-spot partitions cannot be balanced dynamically; second, no data consistency guarantee is provided: migration operations may fail or lose data under adverse network conditions, and inconsistency of the clients' hash mapping tables may cause some users' requests to fail, or concurrent operations on multiple replicas of an object to produce inconsistency.
The research work of Flavio at ETH Zurich (http://systems.ethz.pubzone.org/servlet/Attachment?attachmentId=109&versionId=1378371) mainly addresses the dynamic scaling problem of cloud storage systems. It monitors hot-spot partitions online by statistical methods and migrates part of a hot partition's data to a more lightly loaded neighboring node; to ensure that migration completes smoothly, the hash algorithm used preserves the ordering between keys. Its deficiency is that load-balancing decisions are made locally at each cache node without global information, so the system usually needs many iterations to reach a stable equilibrium; convergence is slow and the introduced overhead is large.
Summary of the invention
The objective of the present invention is to overcome the problems in existing schemes by providing a distributed-cache dynamic scaling method and system that preserve load balance. The method monitors the resource utilization of each cache node online and decides how the system scales based on the computed weighted load values and the system's average load value. For the data-balancing problem during dynamic scaling, the method takes the influence of hot-spot data on system availability into account and proposes a load-balancing method suited to heterogeneous environments. For the data-consistency problem during dynamic scaling, the invention realizes a data access protocol based on three-phase requests; at the same time, to eliminate as far as possible the influence of data migration on system availability, the invention adopts a controlled data migration method that effectively regulates the migration progress and reduces migration overhead.
When scaling the caching system dynamically, the method mainly involves three fundamental mechanisms: data routing, data balancing and information synchronization. These three parts are described in detail below.
(1) The routing mechanisms of a distributed cache fall into three kinds: client-driven, server-driven and load-balancer routing. The method adopts client-driven routing: the client maintains the routing table and forwards each request directly to the destination node according to the routing table information. Compared with the other two mechanisms, this has the following advantages: routing a datum to its destination node needs only a single hop, which effectively reduces the network overhead of multi-hop forwarding; the cache servers need not perform route forwarding, which benefits their performance; and the response time is short.
(2) The data-balancing mechanism adopted by the invention adds support for heterogeneous nodes: each cache node i is assigned an initial weight w_i according to its processing capacity. The whole hash space is divided into Q equal-sized data partitions (Q >> K and Q >> N, where K is the number of replicas and N is the number of cache service nodes), and cache node i is mapped onto T_i partitions in proportion to its weight:

T_i = Q * w_i / (w_1 + w_2 + ... + w_N)    (1)
The mapping relations between partitions and cache nodes are kept in a partition-server mapping table (row i of the table stores the cache node corresponding to partition i). The client locates a data item by a two-level (secondary) hash mapping, illustrated in Fig. 2. First, the key of the data item is mapped onto the hash ring by a hash function, and the resulting position is called the token; a secondary hash mapping function then maps the token to a partition identifier between 1 and Q. Having obtained the partition identifier, the client looks up the cache node holding that partition in the partition-server mapping table. The partition-server mapping may change when the caching system scales dynamically; the client must then update this mapping table (also called the routing table or partition routing table) synchronously.
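The two-level lookup described above can be sketched as follows; the hash functions, Q = 16, and the partition-server table contents are all chosen for illustration:

```python
import hashlib

Q = 16  # number of equal-sized partitions (the patent requires Q >> N, Q >> K)

def token_of(key: str) -> int:
    """First-level hash: map the key onto the ring (here, a 32-bit space)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2 ** 32)

def partition_of(token: int) -> int:
    """Second-level mapping: token -> partition identifier in 1..Q."""
    return token * Q // 2 ** 32 + 1

# Partition-server mapping table: row p holds the node caching partition p.
routing_table = {p: f"cache-{p % 3}" for p in range(1, Q + 1)}

def locate(key: str) -> str:
    """Client-side lookup: key -> token -> partition -> cache node."""
    return routing_table[partition_of(token_of(key))]

print(locate("order:1001"))
```

When the cluster scales, only the `routing_table` rows of moved partitions change; the two hash functions stay fixed, which is what makes the synchronous routing-table update cheap.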
(3) The partition routing table is the key to correct operation of the whole caching system. During dynamic scaling and data migration, the partition routing information may change, so a data synchronization mechanism is needed to guarantee that the routing information is updated in time and effectively confirmed. Existing data synchronization mechanisms mainly include client confirmation, server confirmation, TTL (Time To Live) and piggyback validation; the present invention synchronizes the partition routing information by piggyback validation. Because the amount of validation information is small, the network overhead introduced by this mechanism is relatively low, and neither the client nor the server needs to record much extra information, so no significant overhead is produced. Specifically, when the client sends a data access request to a cache server, the partition-routing version number and a synchronization signal are piggybacked on the request; when the server parses the request and finds the synchronization signal, it adds the synchronization result to the response; if the client receives an out-of-sync signal, it requests the latest routing information from the server and retries, otherwise the request completes.
Based on the above three mechanisms, the technical solution of the invention, illustrated in Fig. 3, comprises the following steps:
1) In each sampling window of a feature sampling period (or reference feature sampling period), monitor the utilization of the processor (CPU), memory and network resources of each cache node.
2) At the end of each feature sampling period, each cache node computes its weighted load value over that period according to equation (2):

L_i = (1/M) * SUM_{j=1..M} (alpha * CPU_ij + beta * Mem_ij + gamma * Net_ij)    (2)

where L_i denotes the weighted load value of cache node i in this feature sampling period; alpha, beta and gamma are the weights of CPU, memory and network respectively; CPU_ij, Mem_ij and Net_ij denote the CPU, memory and network resource utilization of node i in the j-th sampling window; and M is the number of sampling windows.
Each cache node sends the computed weighted load value to the cache cluster manager, which computes the average load value of the system in this feature sampling period according to equation (3) and decides how the system should scale:

L_avg = (1/N) * SUM_{i=1..N} L_i    (3)

where L_avg denotes the system average load value and N is the number of cache nodes. When the average load exceeds the threshold thre_max, a node expansion operation is executed; when it falls below the threshold thre_min, a node shrinking operation is executed. The cache cluster manager coordinates the cache nodes to complete the update of the partition routing information and the migration of the data partitions.
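Equations (2) and (3) and the scaling decision can be sketched together; the weights alpha, beta, gamma and the thresholds thre_max, thre_min below are illustrative values, not ones fixed by the patent:

```python
def weighted_load(cpu, mem, net, alpha=0.5, beta=0.3, gamma=0.2):
    """Equation (2): average of alpha*CPU_ij + beta*Mem_ij + gamma*Net_ij over M windows."""
    m = len(cpu)  # M sampling windows; cpu/mem/net hold per-window utilizations
    return sum(alpha * c + beta * w + gamma * n
               for c, w, n in zip(cpu, mem, net)) / m

def scale_decision(loads, thre_min=0.2, thre_max=0.8):
    """Equation (3): average the per-node loads, then decide on thresholds."""
    avg = sum(loads) / len(loads)
    if avg > thre_max:
        return "expand"
    if avg < thre_min:
        return "shrink"
    return "steady"

# Three nodes, each reporting two sampling windows of high utilization.
loads = [weighted_load([0.9, 0.95], [0.8, 0.85], [0.7, 0.9]) for _ in range(3)]
print(scale_decision(loads))  # -> "expand"
```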
3) When cache nodes are dynamically expanded, part of the cached data must be migrated between nodes. Considering the influence of hot-spot data on system availability, the method achieves load balance by moving data partitions off unbalanced service nodes, on top of the existing partition scheme. Suppose the system currently has N cache nodes; the goal is to solve for the data-partition migration scheme when an (N+1)-th node joins dynamically. The method uses transBytes_i to denote the average network traffic of cache node i during the feature sampling period and w_i to denote its weight. The variance D(T) of the weighted network traffic T measures how balanced the network traffic is; the weighted network traffic T_i and the average weighted network traffic T_avg are defined as:

T_i = transBytes_i / w_i,    T_avg = (1/N) * SUM_{i=1..N} T_i    (4)
Within a feature sampling period, if the weighted network traffic of a cache node satisfies |T_i - T_avg| <= epsilon (where epsilon is a threshold), the node is considered load-balanced; otherwise it is in an unbalanced state. The migration node set is defined as the set of nodes satisfying |T_i - T_avg| > epsilon. According to the relation between T_i and T_avg, the migration node set is further divided into a move-out set MigrationOutSet (node i belongs to it if and only if T_i - T_avg > epsilon) and a move-in set MigrationInSet (node i belongs to it if and only if T_avg - T_i > epsilon). The new node is also added to MigrationInSet. The basic unit of migration is the data partition; the per-partition network traffic of cache node i is denoted by the set {t_i1, t_i2, t_i3, ..., t_ik}, where k is the number of partitions on that node.
Ik, k represents the number of partitions of this node.Migratory direction is constrained to unidirectional, and namely data can only the node from MigrationOutSet be moved on the node among the MigrationInSet.Target function is shown in equation (5):
Target function:
The target of finding the solution data partition migration scheme is the variance sum minimum that makes migration node set weighted network flow T, and wherein S represents to move node number in the node set (S<N).The remaining space of supposing the node of moving into is space to be filled, and the redundant data of the node of moving out is article to be loaded, and then this problem is converted into the many knapsacks of 0-1 (0-1MKP) problem.The 0-1MKP problem is a np hard problem (S.Hanafi, A.Freville.An efficient tabu search approach for the 0-1 multidimensional knapsack problem.European of Operational Research, 1998,106:659~675), the present invention uses branch and bound method to find the solution the approximate optimal solution of this problem.
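The balance test and the split into move-out and move-in sets can be sketched as follows; the branch-and-bound solve of the resulting 0-1 MKP is elided, and the traffic figures and epsilon are invented for illustration:

```python
def migration_sets(trans_bytes, weights, eps):
    """Split nodes into move-out / move-in sets by weighted network traffic.

    T_i = transBytes_i / w_i; a node is unbalanced when |T_i - T_avg| > eps.
    """
    t = {n: b / weights[n] for n, b in trans_bytes.items()}
    t_avg = sum(t.values()) / len(t)
    out_set = {n for n, ti in t.items() if ti - t_avg > eps}   # MigrationOutSet
    in_set = {n for n, ti in t.items() if t_avg - ti > eps}    # MigrationInSet
    return out_set, in_set, t_avg

trans = {"a": 900, "b": 500, "c": 100}   # bytes moved in the sampling period
w = {"a": 1.0, "b": 1.0, "c": 1.0}       # homogeneous nodes for simplicity
out_set, in_set, t_avg = migration_sets(trans, w, eps=100)
print(out_set, in_set)  # {'a'} {'c'}, with T_avg = 500
```

Partitions would then be moved one-way from the `out_set` nodes into the `in_set` nodes (plus any newly joined node) until the objective of equation (5) is minimized.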
4) Based on the migration scheme obtained in step 3), the cache cluster manager coordinates the cache nodes to carry out the data migration. Data consistency and service availability during migration must be protected: the method adopts a three-phase-request data access protocol to realize consistent access to migrating data, and a controlled data migration method to prevent migration from imposing excessive overhead on the cache servers, thereby guaranteeing continuous availability of the cache service.
1. Data access protocol based on three-phase requests
To ensure continuity of user access during migration, a partition is deleted from the move-out node only after its migration has completed. During migration, the same data item may therefore exist simultaneously on the move-out node and the move-in node, and the two versions may be inconsistent. When a migration operation begins, the cache service nodes update their routing information; client and server synchronize the routing-table version number to judge whether the client's routing information is the latest version, and if it is stale the client must fetch the latest routing table. Consequently, when a client accesses data that are being migrated, the move-in node takes priority: if the values on the move-in and move-out nodes conflict (V_i != V_o), the cache client regards the move-in node's value V_i as valid and returns it (i.e. V_i is returned directly to the user), as shown in Fig. 4. When updating data, since it is hard to track the migration state of each individual item, the protocol updates both sides: the client first updates the value V_o on the move-out node (an update simply overwrites the data) and, after receiving the success acknowledgment, updates the move-in node, so that updates are not lost. (A lost update here refers to the case where the item was already migrated to the move-in node before the update request arrived; if only the move-out node were updated, a client reading from the move-in node would obtain stale, inconsistent data.) This is shown in Fig. 5.
Because the cache client performs partition routing actively, to achieve consistent access to migrating data the client must choose its data access mode according to the server state, keeping state synchronized. To this end, the method realizes a data access protocol based on client-side three-phase requests. In this protocol, the cache client keeps routing information of multiple versions, and request handling is divided into three phases. In the first phase, the client synchronizes routing information with the cache service nodes through piggyback validation, the result of which guarantees that the client holds the routing tables of the two latest versions (denoted RT_n and RT_{n-1}). In the second phase, after routing synchronization, the client routes with the latest partition routing table RT_n and requests the cached data, again using piggybacking. In the third phase, the client acts on the data value and the server state (migrating or stable) returned in the second phase. If the server is migrating, the client first checks whether the returned value is empty: if it is not empty, the requested data have already been migrated to the destination node, and the client returns the value directly to the requesting application service; if it is empty, the data have not yet been migrated, so the client looks up the move-out node in the previous routing table RT_{n-1} and sends the data request to that node. If the server is stable (expansion has finished and there is no migration in progress), the migration has completed successfully and the storage location of each cached item is unique, so the client returns the value directly to the requesting application service.
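The client side of the three-phase read can be sketched with in-memory stand-ins for the servers and routing tables; none of the names below come from the patent, and phase one (the piggybacked routing synchronization) is assumed to have already run:

```python
class Client:
    def __init__(self, rt_n, rt_n1, servers):
        self.rt_n, self.rt_n1 = rt_n, rt_n1     # latest and previous routing tables
        self.servers = servers                  # node name -> {key: value} store

    def get(self, key, server_state):
        node = self.rt_n[key]                   # phase 2: route with RT_n
        value = self.servers[node].get(key)
        if server_state == "stable" or value is not None:
            return value                        # phase 3: migrated (or no migration)
        old_node = self.rt_n1[key]              # phase 3: not migrated yet, so
        return self.servers[old_node].get(key)  # fall back to the move-out node

servers = {"new": {}, "old": {"k": "v1"}}       # "k" has not migrated yet
c = Client(rt_n={"k": "new"}, rt_n1={"k": "old"}, servers=servers)
print(c.get("k", server_state="migrating"))     # -> "v1", from the move-out node
```

Once the item lands on the move-in node, the same call returns the move-in node's value without consulting RT_{n-1}, which is the move-in-priority rule of Fig. 4.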
2. Controlled data migration
During dynamic expansion of cache nodes, resource utilization is near its saturation point, and data migration introduces extra network and computation overhead. If the move-out node is serving requests at that moment, this overhead may cause the service to terminate abnormally, while an excessive migration speed aggravates the resource pressure that already exists. To ensure that migration completes smoothly while eliminating as far as possible its influence on system availability, the invention adopts a controlled data migration method that regulates the migration progress according to the performance of the move-out node (cached data usually migrate from high-load to low-load nodes, so migration affects the move-out node most). The main factor affecting a cache server's request-handling efficiency is the I/O network bandwidth; in the method, the move-out node monitors its network overhead and, when the network flow rate reaches a configured bandwidth threshold, reduces the data migration speed by theta %.
At the same time, if the migration lasts too long or is interrupted frequently, the system stays in the migrating state for a long time, extra data synchronization overhead accrues, and overall system performance drops. The data migration must therefore not be halted completely while being slowed down; the method sets the minimum network flow rate available to migration at 10% of the node's network bandwidth.
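The throttling rule can be sketched as follows; the bandwidth threshold, theta = 30 and the bandwidth figure are illustrative values, while the 10% floor is the one stated above:

```python
def throttle(migration_rate, net_rate, bandwidth, threshold=0.8, theta=30):
    """Return the adjusted migration rate for the move-out node."""
    floor = 0.10 * bandwidth                  # stated minimum: 10% of bandwidth
    if net_rate >= threshold * bandwidth:     # node is nearing saturation
        migration_rate *= 1 - theta / 100     # slow down by theta percent
    return max(migration_rate, floor)         # but never halt migration entirely

bw = 1_000                                    # assumed node bandwidth, Mbit/s
print(throttle(400, net_rate=850, bandwidth=bw))  # saturated: 400 slows to 280
print(throttle(100, net_rate=950, bandwidth=bw))  # clipped at the 10% floor
print(throttle(400, net_rate=500, bandwidth=bw))  # below threshold: unchanged
```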
5) When cache nodes shrink dynamically, the data migration scheme is likewise determined first: the two nodes with the lightest load (denoted L) in the system, i' and j' with L_{j'} < L_{i'}, are selected for partition merging, and all partitions of node i' are migrated to node j'. Among the remaining cache nodes, according to the relation between the weighted network traffic T_i and the average weighted network traffic T_avg, the migration node set of nodes satisfying |T_i - T_avg| > epsilon is again divided into a move-out set and a move-in set. The branch-and-bound method is used to solve for the data-partition migration scheme, the objective being to minimize the sum of squared deviations of the weighted network traffic T over the migration node set. The data migration is carried out according to this scheme; consistency during migration is guaranteed by the three-phase-request data access protocol, and the controlled data migration method regulates the migration progress. When the migration finishes, node i' is removed from the cache cluster, as shown in Fig. 6.
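The first step of shrinking, merging the two lightest nodes, can be sketched as follows, with invented load values and partition assignments:

```python
def shrink_step(loads, partitions):
    """Pick the two lightest nodes (L_j' < L_i'), move i''s partitions to j'.

    Returns the name of node i', which leaves the cluster afterwards.
    """
    j_prime, i_prime = sorted(loads, key=loads.get)[:2]  # lightest, then next
    partitions[j_prime] += partitions.pop(i_prime)       # merge i' into j'
    del loads[i_prime]                                   # i' leaves the cluster
    return i_prime

loads = {"a": 0.7, "b": 0.1, "c": 0.2}
parts = {"a": [1, 2], "b": [3], "c": [4, 5]}
removed = shrink_step(loads, parts)
print(removed, parts)  # node "c" is removed; its partitions land on "b"
```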
Compared with the prior art, the present invention has the following technical advantages:
1) The data-balancing mechanism adopted by the invention adds support for heterogeneous nodes and at the same time considers the influence of hot-spot partitions on system availability at runtime. By moving data partitions off unbalanced service nodes, it keeps the network traffic evenly distributed among the cache nodes and optimizes resource utilization.
2) For the data-consistency problem during dynamic scaling, the invention realizes a data access protocol based on three-phase requests. The protocol synchronizes partition routing information by piggyback validation, introducing little synchronization overhead and being easy to implement. In the three-phase request handling, the client maintains routing information of multiple versions and, combined with routing and state synchronization with the server, solves the data-access inconsistency problem during migration; by regulating the migration progress with the controlled data migration method, the influence of data migration on system availability is effectively avoided.
3) The invention supports elastic resource provisioning and performance tuning for caching platforms. By monitoring all kinds of resource utilization on the cache nodes, it helps administrators discover performance bottlenecks in the system in time and completes the dynamic expansion of cache nodes automatically, guaranteeing continuity and consistency of service quality; when resource utilization is low, it completes the dynamic shrinking of nodes automatically, realizing on-demand supply of resources while reducing manual intervention.
Description of drawings
Fig. 1 illustrates consistent hashing;
Fig. 2 illustrates the secondary hash mapping;
Fig. 3 illustrates the dynamic expansion method of the distributed cache;
Fig. 4 illustrates a Get operation during data migration;
Fig. 5 illustrates an Update operation during data migration;
Fig. 6 illustrates the dynamic shrinking method of the distributed cache;
Fig. 7 illustrates the deployment topology;
Fig. 8 illustrates the three-phase request handling;
Fig. 9 illustrates the migration control flow;
Fig. 10 illustrates the dynamic expansion performance test results.
Embodiment
The invention is further described below in conjunction with specific embodiments and the accompanying drawings.
The whole distributed caching system consists of three parts: the cache servers (Cache Server), the cache clients (Cache Client) and the cache cluster manager (Cache Admin), connected to one another by a network. Each cache server runs independently and is monitored and managed uniformly by the cache cluster manager through a management agent. The management agent resides on the same physical node as its cache server and is responsible for generating JMX management MBeans; on receiving a control command from the cache cluster manager, the management agent automatically adapts the command and directs the cache service process to execute the corresponding operation.
Within the cache cluster manager, a topology monitor uses JGroups-based multicast to track changes in the server node topology and obtains the performance information of each service node by accessing the management agents remotely via JMX, finally presenting a unified monitoring view from which the whole cache cluster is controlled. The cache cluster manager mainly comprises a multicast communication component, the topology monitor, a controller, a cluster management module and a Web console. The cluster management module provides two major functions: data partition redistribution and cluster scale adjustment (including dynamic expansion and shrinking).
A cache server is divided into two parts: the cache service and the management agent. The cache service consists of a data management module, a command control engine, a state management module and a migration management module. The data management module manages the allocated memory space and is also responsible for the organization, storage and querying of cached data. The command control engine schedules cache requests according to the cache commands and the node state, performs the protocol-related processing of communication, and completes the handling of cache requests. The state management module manages the cache server's state, including the server migration state, the cluster scale-change state and the piggybacked routing information in client requests. The migration management module is responsible for migration work between nodes. The main duty of the management agent is to provide topology management and cache-service life-cycle management for the cache server cluster.
The cache client is mainly composed of modules such as a cache client API, a client kernel, a partition selector, a request forwarder, a multi-version configuration manager and an adaptive integration module, and is responsible for the communication and state synchronization between the application and the caching servers. Among these, the cache client API, the client kernel and the partition selector implement the basic functions of the client; the request forwarder and the multi-version configuration manager support the three-phase request data access protocol proposed by the method of the invention; and the adaptive integration module strengthens the maintainability and ease of use of the existing caching system. The cache client API provides a series of service interfaces including reading, writing and deleting data. The client kernel mainly comprises a node management module, a connection management module and a communication module, and provides the core functions of the cache client. The connection management module is responsible for maintaining all connections from the client to the caching servers and for the communication that uses these connections. The cache client is implemented on the basis of Java NIO; compared with the conventional blocking I/O model, Java NIO has the advantages of high efficiency and lower resource overhead. The conventional blocking I/O model often improves processing efficiency by pre-creating a number of connections to form a connection pool, whereas NIO needs to create only a single connection, which significantly reduces the overhead of thread creation and switching and better supports the processing of highly concurrent requests. After the node management module obtains a connection, it calls the communication module to carry out the network data exchange of the NIO mechanism. The partition selector implements a route selection algorithm based on consistent hashing: on data access, the key is first mapped to find the corresponding data partition, and the cache server corresponding to that partition is then found according to the partition routing table. The request forwarder is mainly responsible for handling the synchronization information returned from the server side and for forwarding subsequent requests according to this information. The multi-version configuration manager is responsible for maintaining the routing table information of a plurality of versions. When the client starts, the configuration manager can connect to any one caching server node, obtain and parse the routing information, and realize the self-configuration of the connection attributes; when the caching server side is in a migration state, the configuration manager provides support for historical routing information. Meanwhile, for the seamless integration of the caching system with existing application servers (for example Tomcat, JBoss, etc.) and to provide loosely coupled state backup and high-availability support, the adaptive integration module is responsible for adapting the state object module (Session) of each application server, so that the cache service can be used simply by setting a configuration file. The adaptive integration module also provides a general Java object serialization method, which converts Java objects into XML form to realize flexible serialization and deserialization.
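As an illustrative sketch of the partition selector's two-level lookup (key → hash ring position → data partition → server), the following Java fragment uses a plain `hashCode`-based placeholder rather than the actual distributed consistent hashing algorithm of the invention; the class and method names are assumptions, not part of the claimed implementation:

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of the partition selector: a key is hashed onto the ring, a
// secondary mapping finds its data partition, and the partition routing
// table maps the partition to a cache server. The hash function here is
// a placeholder, and the 12-partition layout follows the embodiment.
public class PartitionSelector {
    private final int partitionCount;
    private final Map<Integer, String> routingTable; // partition -> server

    public PartitionSelector(int partitionCount, Map<Integer, String> routingTable) {
        this.partitionCount = partitionCount;
        this.routingTable = routingTable;
    }

    // First mapping: key -> non-negative position on the hash ring.
    public int ringPosition(String key) {
        return key.hashCode() & 0x7fffffff;
    }

    // Secondary mapping: ring position -> data partition 1..partitionCount.
    public int partitionOf(String key) {
        return ringPosition(key) % partitionCount + 1;
    }

    // Route lookup via the partition routing table.
    public String serverFor(String key) {
        return routingTable.get(partitionOf(key));
    }
}
```

A usage example would build the routing table of the embodiment (partitions 1-4 on S1, 5-8 on S2, 9-12 on S3) and call `serverFor("Key1")`.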
In the experimental environment of the present embodiment, the front end uses LoadRunner to generate load, the middleware Web container adopts Tomcat to process user requests, and the back end consists of a DB2 database and the distributed caching system. The distributed caching system is mainly used to cache business data, accelerating application access while eliminating the database access bottleneck. The deployment topology is shown in Fig. 7, and the environment configuration is shown in Table 1.
The Web application adopted by the present embodiment is a simple online shopping system comprising functions such as goods browsing, goods ordering and order confirmation. The business data of this application is kept in the DB2 database; the initial book data amount is 3,000,000 book records, and each record comprises bibliographic information and book content information. The application transparently accesses the data in the caching system through the Hibernate framework. The test scripts are recorded with LoadRunner and comprise operations such as online bookstore page browsing and shopping list submission, with a think time of 0.3 seconds added between requests.
Table 1. Experimental environment configuration
The specific flow of the method of the present embodiment is as follows:
1) In the method of the present embodiment, the management agent is responsible for providing various statistics of the cache service side, such as cache read counts, update counts, hit counts and the network flow of each partition, and also provides status monitoring of the cache service node, such as CPU, memory and network utilization. In each sampling window of a feature sampling period, the management agent monitors the utilization of resources such as CPU, memory and network of its node (expressed in percentage form for ease of calculation). In the present embodiment each feature sampling period is defined as 300 seconds, made up of 10 sampling windows each 30 seconds wide.
2) When each feature sampling period ends, the management agent calculates the weighted load value over this period, where α, β and γ denote the weights of CPU, memory and network respectively. Considering that the critical system resources of a distributed caching system are memory and network I/O, with CPU of slightly lower priority, the present embodiment sets α, β and γ to 0.1, 0.4 and 0.5 respectively. The cache cluster manager calculates the system's average load value over this feature sampling period from the node monitoring information obtained remotely via JMX, and decides on the basis of this load value whether and how the system needs to scale. The present embodiment defines the node expansion threshold thre_max as 70% and the node shrinkage threshold thre_min as 30%; that is, when the system average load value is above 70% a node expansion operation is executed, and when it falls below 30% a node shrinkage operation is executed.
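The load evaluation of steps 1) and 2) can be sketched as follows; the weighted sum over CPU, memory and network utilization and the 70%/30% thresholds follow the embodiment, while the class and method names are illustrative assumptions:

```java
// Sketch of the weighted-load scaling decision of the embodiment:
// L = 0.1*CPU + 0.4*Mem + 0.5*Net (utilizations in percent), expand
// above 70% cluster-average load, shrink below 30%.
public class ScalingDecision {
    static final double ALPHA = 0.1, BETA = 0.4, GAMMA = 0.5;
    static final double THRE_MAX = 70.0, THRE_MIN = 30.0;

    // Weighted load of one node over a feature sampling period.
    public static double weightedLoad(double cpu, double mem, double net) {
        return ALPHA * cpu + BETA * mem + GAMMA * net;
    }

    // Decision based on the cluster-average weighted load.
    public static String decide(double[] nodeLoads) {
        double sum = 0;
        for (double l : nodeLoads) sum += l;
        double avg = sum / nodeLoads.length;
        if (avg > THRE_MAX) return "EXPAND";
        if (avg < THRE_MIN) return "SHRINK";
        return "STABLE";
    }
}
```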
3) During cache node dynamic expansion, a data partition migration scheme for the joining node must first be worked out. In the present embodiment, Server1, Server2 and Server3 are the existing server nodes in the caching system, and the fourth node, Server4, is the node to be added. The cache cluster manager obtains the network flow information (transBytes) of the three service nodes remotely via JMX. Since each cache node has an identical configuration, the weights w_i are all set to 1. On the basis of this information, the cache cluster manager calculates the weighted network flow T_i of each node and the system's average weighted network flow value over a feature sampling period. According to the relation between each node's deviation from the average weighted network flow and the threshold ε, the cache node states are divided into a balanced state and a non-balanced state; the present embodiment sets the threshold ε to 15%. Among the nodes in the non-balanced state, nodes whose weighted network flow exceeds the mean are put into the migration-out node set (MigrationOutSet), while the new node Server4 and the nodes below the mean are put into the migration-in node set (MigrationInSet). The basic unit of migration is the data partition, and the migration direction is constrained so that partition data on the cache nodes in MigrationOutSet moves onto the nodes in MigrationInSet. On the basis of the per-partition network flow information of the cache nodes and the migration node set information, the present embodiment adopts a branch and bound method to solve for an approximate optimal migration scheme, the objective being to minimize the sum of the variances of the weighted network flow T over the migration node sets.
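The classification into migration-out and migration-in sets described above can be sketched as follows, assuming equal node weights (w_i = 1) and ε = 15% as in the embodiment; the API is hypothetical:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch: a node is in the non-balanced state when its relative deviation
// |T_i - avg| / avg exceeds epsilon. Overloaded nodes join the migration-out
// set; the joining node and under-loaded nodes join the migration-in set.
public class MigrationSets {
    static final double EPSILON = 0.15;

    public static Map<String, Set<String>> classify(Map<String, Double> flowByNode,
                                                    String newNode) {
        double avg = flowByNode.values().stream()
                .mapToDouble(Double::doubleValue).average().orElse(0);
        Set<String> out = new TreeSet<>(), in = new TreeSet<>();
        in.add(newNode); // the joining node always receives partitions
        for (Map.Entry<String, Double> e : flowByNode.entrySet()) {
            double dev = Math.abs(e.getValue() - avg) / avg;
            if (dev > EPSILON) {
                if (e.getValue() > avg) out.add(e.getKey());
                else in.add(e.getKey());
            }
        }
        Map<String, Set<String>> result = new HashMap<>();
        result.put("MigrationOutSet", out);
        result.put("MigrationInSet", in);
        return result;
    }
}
```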
4) On the basis of the data migration scheme solved in step 3), the cache cluster manager first uses the controller to send the new partition routing information to all service nodes to update the routing configuration, and then uses the controller to direct each node to complete the data migration according to the migration plan. To guarantee data consistency during the migration process, the method of the invention adopts a data access protocol based on three-phase requests: on the service side, the state management module of a service node handles the migration state and the piggybacking of routing information onto client requests, while the client request forwarder handles the synchronization information from the service side and performs subsequent request forwarding.
1. Data access protocol based on three-phase requests
To explain the processing flow, suppose that initially there are three cache nodes Server1, Server2 and Server3 in the system, that the whole hash ring is divided into 12 equal-sized data partitions with each node holding 4 of them, and that the partition routing table is denoted RT_{n-1} (Server1, Server2 and Server3 are abbreviated S1, S2 and S3 in the routing table). When the new node Server4 joins during dynamic expansion, the data migration scheme solved in step 3) requires partitions 4, 8 and 12 to be moved from their former service nodes to Server4, and the latest partition routing table is denoted RT_n, as shown in Fig. 8. Suppose that the data migration has not yet finished, and the client receives a data access request whose key is Key1.
(1) The client performs a hash calculation on this key (adopting a distributed consistent hashing algorithm), maps the result to a position a on the hash ring, and then maps position a into data partition 4 through a secondary hash mapping. Suppose the client's current routing table version is t (t ≤ n−1; if the client has not requested data for a long time, its routing table version may lag behind). According to this routing information, data partition 4 is stored on server node ServerD (where D may be 1, 2 or 3), so the client sends a data access request to ServerD, piggybacking the version synchronization information in the request;
(2) After receiving the piggybacked information, caching server node ServerD performs a version check: if t < n−1 it returns the routing tables of the latest two versions, RT_n and RT_{n-1}; if t = n−1 it returns only the latest routing table RT_n. The first-phase request is now complete;
(3) The client reads the latest-version partition routing table RT_n; according to its routing information, Key1 is stored on server node Server4, so the client sends a data access request to Server4.
(4) Since the data migration operation has not yet finished, Server4 piggybacks a flag in the request response indicating that the server is in the migration state. If the requested data has already been migrated to Server4, the data is returned directly to the application service that issued the request; if it has not yet been migrated to Server4, a null value is returned. The second-phase request is now complete;
(5) When the client receives the message that the server is in the migration state, it judges whether the data value returned from the server is null. If it is not null, the data value is returned directly to the application service that issued the request; otherwise, if the returned value is null, an extra request must be made: the client locally reads the routing information of the previous version, RT_{n-1}; according to that version's routing information, Key1 is located on server node Server1, so the client sends a data access request to Server1 again;
(6) Server1 returns the requested result. The third-phase request is now complete;
(7) The cache client returns the result to the application service that issued the request.
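The three-phase flow above can be simulated in miniature as follows; the in-memory routing tables and stores are stand-ins for the real networked components, and the version synchronization of phase one is collapsed into a comment:

```java
import java.util.Map;
import java.util.Set;

// Minimal in-memory simulation of the three-phase access protocol during
// migration. All names and data structures are illustrative stand-ins.
public class ThreePhaseClient {
    final Map<Integer, String> rtOld;              // previous routing table RT_{n-1}
    final Map<Integer, String> rtNew;              // latest routing table RT_n
    final Map<String, Map<String, String>> stores; // server name -> its cached data
    final Set<String> migrating;                   // servers still in migration state

    public ThreePhaseClient(Map<Integer, String> rtOld, Map<Integer, String> rtNew,
                            Map<String, Map<String, String>> stores,
                            Set<String> migrating) {
        this.rtOld = rtOld; this.rtNew = rtNew;
        this.stores = stores; this.migrating = migrating;
    }

    public String get(String key, int partition) {
        // Phase 1 (collapsed): the first request piggybacks the client's routing
        // table version, and the server replies with RT_n (and RT_{n-1}).
        String newOwner = rtNew.get(partition);
        // Phase 2: ask the new owner; a null value while it is migrating means
        // the data has not yet been moved.
        String value = stores.get(newOwner).get(key);
        if (value != null || !migrating.contains(newOwner)) return value;
        // Phase 3: fall back to the owner under the previous routing table.
        return stores.get(rtOld.get(partition)).get(key);
    }
}
```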
2. Controlled data migration
The migration management module in a cache service node is responsible for the migration of data partitions between nodes; when a migration command arrives and has been parsed, it creates a background migration thread to carry out the data migration work. The migration thread obtains the block linked list of the corresponding partition through the migration partition view and starts the migration from the tail of the list, so that the data that joined the caching system earliest is migrated first and the data that joined latest is migrated last. A monitor watches the system's network traffic overhead; after the migration thread completes a transfer of fixed size, it obtains the system state from the monitor and adjusts the migration speed for the next migration cycle. In the method of the invention, when the network flow rate reaches the configured network bandwidth threshold, the migration speed is reduced by 20%; the minimum reserved network flow rate is 10% of the node's network bandwidth. After all data partition migrations have finished, the server state is reset to stable. This is illustrated in Fig. 9.
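The speed-adjustment rule of the controlled migration (cut by 20% when the bandwidth threshold is reached, never below the reserved 10% floor) may be sketched as follows; the units and field names are illustrative assumptions:

```java
// Sketch of the migration speed throttle: after each fixed-size transfer,
// the thread checks the monitored network rate; at or above the bandwidth
// threshold, the migration speed is cut by 20%, floored at 10% of the
// node's bandwidth. Units (e.g. MB/s) are illustrative.
public class MigrationThrottle {
    final double bandwidth;  // node network bandwidth
    final double threshold;  // configured bandwidth threshold
    double speed;            // current migration speed

    public MigrationThrottle(double bandwidth, double threshold, double initialSpeed) {
        this.bandwidth = bandwidth;
        this.threshold = threshold;
        this.speed = initialSpeed;
    }

    // Speed for the next migration cycle, given the observed network rate.
    public double nextSpeed(double observedRate) {
        if (observedRate >= threshold) {
            speed = Math.max(speed * 0.8, bandwidth * 0.10);
        }
        return speed;
    }
}
```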
The present embodiment tests this dynamic expansion method. The whole experiment uses LoadRunner to simulate concurrent user access; each transaction comprises 10 servlet requests, and the average transaction response time data collected during the experiment serves as the evaluation criterion, as shown in Figure 10. Initially there are three service nodes, Server1, Server2 and Server3, in the caching system, and the number of concurrent users is set to 600. In the warm-up phase, as the web application continually puts business data into the caching system, the system's average transaction response time gradually decreases; after the caching system reaches a certain load, application performance stabilizes with an average transaction response time of 12.9 seconds. The access load is then increased further, with the number of concurrent users raised to 1000; the caching server cluster reaches its resource saturation point, and dynamic expansion adds one new cache node, server4, to ensure the continuity of service quality. Since the data migration process introduces overhead into the system, the average transaction response time rises slightly at this point (by 0.8 seconds). After the expansion process completes, the system's processing capacity increases, the response time falls again and finally stabilizes at an average transaction response time of 8.1 seconds, a reduction of 4.8 seconds compared with the stable period before expansion.
To further prove the validity of the load balancing algorithm, the present embodiment defines the load imbalance degree of the cache cluster as the ratio of the variance D(T) of the weighted network flow T to the square of the average weighted network flow value. The load imbalance data collected after the new node joins are shown in Table 2. Compared with a dynamic expansion method without load balancing, the method of the invention has a better load balancing effect: the system's load imbalance degree is reduced by 82.4%.
Table 2. Load imbalance degree of the two methods

| Dynamic expansion with load balancing | Dynamic expansion without load balancing | Ratio of the two |
| 6.28×10⁻³ | 3.56×10⁻² | 0.176 |
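The load imbalance degree defined above, the variance of the weighted network flows divided by the square of their mean (i.e. the squared coefficient of variation), can be computed as in this sketch; the class and method names are assumptions:

```java
// Sketch of the load imbalance degree of the embodiment:
// imbalance = Var(T) / mean(T)^2 over the nodes' weighted network flows.
public class ImbalanceDegree {
    public static double imbalance(double[] t) {
        double mean = 0;
        for (double v : t) mean += v;
        mean /= t.length;
        double var = 0;
        for (double v : t) var += (v - mean) * (v - mean);
        var /= t.length;
        return var / (mean * mean);
    }
}
```

A perfectly balanced cluster (all flows equal) yields 0; larger values indicate a more skewed flow distribution.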
5) When the system average load value falls below the 30% threshold, a node shrinkage operation is executed. In the present embodiment, suppose Server1, Server2, Server3 and Server4 are the existing server nodes in the caching system; the first step is the formulation of the migration scheme. The two cache nodes with the lowest weighted load values L in the system are selected (assumed to be Server2 and Server3, with L_Server3 < L_Server2), and all partitions of Server2 are migrated to Server3. In addition, to balance the load among the remaining cache nodes Server1 and Server4, nodes whose weighted network flow exceeds the average value by more than the threshold ε′ are added to the migration-out node set, and nodes below the average value by more than ε′ are added to the migration-in node set, with the threshold ε′ set to 10%. The present embodiment adopts the branch and bound method to solve for an approximate optimal migration scheme. The data migration is completed on the basis of this migration scheme, the three-phase request data access protocol is used to ensure data consistency during the migration, and the controlled data migration method is adopted to effectively control the migration progress. When the migration operation finishes, node Server2 is removed from the cache cluster.
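The selection of the merge pair at the start of shrinkage can be sketched as follows; picking the two lightest nodes and draining the heavier of the pair matches the embodiment's Server2 → Server3 example, though the helper name is an assumption:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of shrinkage planning: select the two nodes with the lowest
// weighted load L; the heavier of the pair hands all of its partitions
// to the lighter one and is then removed from the cluster.
public class ShrinkPlanner {
    // Returns { nodeToRemove, mergeTarget }.
    public static String[] pickMergePair(Map<String, Double> loadByNode) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(loadByNode.entrySet());
        entries.sort(Map.Entry.comparingByValue());
        // entries.get(0) is the lightest node (the merge target);
        // entries.get(1) is the second-lightest (the node to drain and remove).
        return new String[] { entries.get(1).getKey(), entries.get(0).getKey() };
    }
}
```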