Tiered storage energy-saving method
Technical field
The present invention relates to storage technology in the computer field, and in particular to a tiered storage energy-saving method.
Background technology
With the explosive growth of data volume, server clusters that store and process massive amounts of data are becoming increasingly common. The energy consumption of these clusters is attracting more and more attention.
According to statistics, the power consumption of the servers and cooling systems alone accounts for 20% of the cost of building a server cluster, and most servers run under low load most of the time, generally no higher than 30%, which wastes a great deal of power. Cluster energy-saving technology emerged to reduce the unnecessary losses caused by this waste.
The key point of current cluster energy-saving technology is to concentrate the tasks in the cluster on a few individual servers and to switch the other servers to a power-saving state or turn them off, thereby saving energy across the cluster.
These technologies rest on the premise that data access in a cluster is dispersed and not fixed, which is related to how the data is distributed across the cluster. Many server clusters today implement load balancing, which distributes data evenly across the servers and prevents individual servers from being overloaded while others sit idle, so that requests can be processed concurrently.
However, industry research shows that only 20% of data is active while the remaining 80% is inactive, and the activity of the data also changes over time. Therefore, even if the cluster is load balanced, the inconsistent access characteristics of the data inevitably leave some servers heavily loaded and the rest lightly loaded.
In effect, current cluster energy-saving technology concentrates the load, putting the whole cluster back into an unbalanced state, and then switches the idle nodes to a power-saving state. This is essentially the inverse of load balancing. Although it temporarily solves part of the problem, it comes at a cost: for example, monitoring the load of every node in the cluster requires sensor equipment, which adds further expense.
Thus, low server utilization and large-scale waste of electricity in a cluster are in fact an inevitable result of applying load balancing across the whole cluster. Without load balancing, however, individual servers in the cluster may become access bottlenecks. Solving the cluster power consumption problem while ensuring that no individual server becomes an access bottleneck therefore requires a completely new data placement scheme.
Summary of the invention
To solve the above technical problem, the present invention provides a tiered storage energy-saving method with low cost and a high degree of automation. The method comprises the following steps:
Automatic storage tiering: when the cluster starts, the storage tier of each host is identified from its host name, and a proportion of the nodes in the lower storage tiers is switched to energy-saving mode;
Directed access: files are stored and read on a storage layer that is close, belongs to a high storage tier, and is in normal working mode;
Finding hot data: the access information of each data block of the files is recorded and the migration timing is judged; when the migration time arrives, the value of each accessed data block is derived from the recorded information, and a queue is formed in descending order of value;
Data block migration: high-value data blocks are migrated to a storage layer in a higher storage tier, and low-value data blocks are migrated to a storage layer in a lower storage tier.
Preferably, the method further comprises adaptive adjustment: after data migration is completed, the data block access information is updated and monitoring is restarted.
Preferably, the recorded information is processed by an information valuation model, and the data block access information comprises the accessing user, the access time, and data block information.
Preferably, after processing by the information valuation model, a queue filtering model and a path matching model are applied to the resulting data block value queue to form concrete data migration tasks, and a migration control model is used to complete the data migration.
Preferably, the queue filtering model filters out, according to thresholds, the data segments that do not need to be migrated; every data segment remaining in the filtered queue has a determined migration direction, and the thresholds reflect the results of the previous migration on the storage tier.
Preferably, the path matching model works as follows: once every block in the queue has a determined migration direction, a migration source and a migration target that are close to each other are determined; the migration source is preferentially a node with little remaining space, light load, and normal working mode, and the migration target is preferentially a lightly loaded node.
Preferably, the migration control model controls the migration rate and uses multiple threads to execute the data migration tasks in batches, reducing the impact of the migration process on the access performance of the nodes in the cluster.
Preferably, the step of updating the data block information and restarting monitoring specifically comprises:
storing the valuation results of the data blocks for use in the next valuation;
deleting the access records that the system keeps for data blocks that have been deleted;
updating the threshold of each storage tier according to the actual migration results;
waking the monitoring process to wait for the next data migration.
Preferably, in automatic storage tiering, there are at least 2 storage tiers, classified as follows: the higher the storage tier, the better the access performance and the shorter the response time for processing user requests.
Preferably, 40% of the second-tier storage layer and 60% of the third-tier storage layer are switched to energy-saving mode.
The tiered storage energy-saving method of the present invention implements tiered storage in a cluster: tiered storage media are used in the cluster, access hotspots are fixed on the higher-tier storage, and a proportion of the nodes in the lower storage tiers is switched to energy-saving mode, which saves both energy and cost for the cluster.
Description of drawings
Fig. 1 is a schematic flow chart of the tiered storage energy-saving method according to one embodiment of the present invention.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawing and specific embodiments.
Fig. 1 is a schematic flow chart of the tiered storage energy-saving method according to one embodiment of the present invention. As shown in Fig. 1, the tiered storage method of the present invention comprises the following steps:
Step S1: automatic storage tiering.
When the cluster starts, the storage tier of the storage layer on each host is identified from its host name, and a proportion of the nodes in the lower storage tiers is switched to energy-saving mode. In this embodiment, when the Hadoop cluster starts, the system automatically identifies the access performance of each node through the "host name identification method". In this embodiment, 40% of the second-tier storage layer and 60% of the third-tier storage layer are switched to energy-saving mode. In other embodiments, the number of storage layers and the proportion switched to energy-saving mode can of course be adjusted freely, and all such variations fall within the scope of protection of this patent.
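Step S1 can be illustrated with a minimal Python sketch. The text does not specify how a tier is encoded in a host name, so the "tier<N>-" prefix convention, the Node class, and the rule for choosing which nodes within a tier are put into energy-saving mode (the last ones in host name order) are all assumptions made for illustration. Only the 40% and 60% proportions come from the embodiment.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List

# Assumed naming convention (not given in the text): "tier<N>-<suffix>".
TIER_PATTERN = re.compile(r"^tier(\d+)-")

# Percentage of each tier switched to energy-saving mode (from the embodiment).
SAVING_PERCENT = {2: 40, 3: 60}


@dataclass
class Node:
    hostname: str
    tier: int                 # 1 is the highest tier (assumed numbering)
    energy_saving: bool = False


def tier_from_hostname(hostname: str) -> int:
    """Derive the storage tier from the host name under the assumed convention."""
    match = TIER_PATTERN.match(hostname)
    if match is None:
        raise ValueError(f"cannot derive a storage tier from {hostname!r}")
    return int(match.group(1))


def classify_cluster(hostnames: Iterable[str]) -> List[Node]:
    """Tier every host, then put a share of the lower tiers into energy-saving mode."""
    nodes = [Node(name, tier_from_hostname(name)) for name in hostnames]
    for tier, percent in SAVING_PERCENT.items():
        members = sorted((n for n in nodes if n.tier == tier),
                         key=lambda n: n.hostname)
        count = len(members) * percent // 100   # integer arithmetic, rounds down
        # Assumption: the last nodes in host name order are the ones put to sleep.
        for node in members[len(members) - count:]:
            node.energy_saving = True
    return nodes
```

With five hosts in each of tiers 2 and 3, this sketch puts two tier-2 nodes and three tier-3 nodes into energy-saving mode and leaves tier 1 untouched.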
Step S2: directed access.
Files are stored and read on a storage layer that is close, belongs to a high storage tier, and is in normal working mode.
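The selection rule of step S2 might look like the following sketch. The text names three criteria (distance, storage tier, working mode) but not how they are weighed against each other, so the ordering used here (exclude energy-saving nodes, then prefer the higher tier, then the shorter distance) is an assumption, as are the StorageNode fields and the integer distance metric.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class StorageNode:
    name: str
    tier: int            # 1 is the highest tier (assumed numbering)
    distance: int        # hypothetical network distance from the client
    energy_saving: bool


def pick_node(nodes: Iterable[StorageNode]) -> Optional[StorageNode]:
    """Pick a node in normal working mode, preferring high tier, then short distance."""
    candidates = [n for n in nodes if not n.energy_saving]
    if not candidates:
        return None
    # Assumed priority: tier first, distance as the tie-breaker.
    return min(candidates, key=lambda n: (n.tier, n.distance))
```

Swapping the two elements of the sort key gives the alternative reading in which distance takes priority over tier.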
Step S3: finding hot data.
The access information of each data block of the files is recorded and the migration timing is judged. When the migration time arrives, the value of each accessed data block is derived from the recorded information, and a queue is formed in descending order of value.
In this embodiment, the nodes in the cluster are divided into 3 storage tiers. The higher the storage tier, the better the access performance of the configured hard disks, the smaller the capacity, and the higher the price. Only a small amount of data can therefore be placed on the nodes of the highest storage tier. In general, only a small part of all the data in a cluster is accessed frequently. By recording file access information and processing it with the information valuation model, a value is obtained for each piece of data: the larger the value, the more frequently the data is accessed and the higher the storage tier it should occupy.
The client reads files in units of blocks, and the system records every read operation on a block. Each record includes the accessing user, the access time, data block information, and so on, and every read generates one record. At specific moments, the information valuation model processes these records. The model operates on blocks, and its parameters include the access time, the number of accesses, the number of users, the block size, the degree of association between the block and other blocks, and the historical value of the block. A formula yields a specific value that measures how "hot" the block is, and a queue is formed in descending order of value.
Starting from the block value queue produced by this preliminary processing, the data migration algorithm applies the queue filtering model and the path matching model to form concrete migration tasks, and finally uses the migration control model to complete the data migration. The queue filtering model uses the thresholds of each storage tier to filter out the data blocks that do not need to be migrated. These thresholds record the maximum value among all blocks migrated down and the minimum value among all blocks migrated up. Every block remaining in the filtered queue has a determined migration direction.
In other embodiments, in automatic storage tiering there are at least 2 storage tiers, classified as follows: the higher the storage tier, the better the access performance and the shorter the response time for processing user requests.
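The valuation and queue filtering of step S3 can be sketched as follows. The text does not give the valuation formula, so block_value below is a purely hypothetical weighted sum over a subset of the listed parameters (number of accesses, number of users, historical value), and its weights are invented for illustration. The reading of the thresholds (a block moves up only if its value exceeds the tier's "up" threshold, and down only if it falls below the "down" threshold) is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class BlockStats:
    block_id: str
    tier: int                    # current storage tier, 1 is the highest (assumed)
    access_count: int
    user_count: int
    history_value: float = 0.0   # value from the previous valuation round


def block_value(stats: BlockStats, w_access: float = 1.0,
                w_users: float = 2.0, decay: float = 0.5) -> float:
    """Hypothetical valuation formula; the actual formula is not given in the text."""
    return (w_access * stats.access_count
            + w_users * stats.user_count
            + decay * stats.history_value)


@dataclass
class TierThreshold:
    up: float     # assumed: minimum value of blocks previously migrated up
    down: float   # assumed: maximum value of blocks previously migrated down


def build_migration_queue(stats: Iterable[BlockStats],
                          thresholds: Dict[int, TierThreshold],
                          top_tier: int = 1,
                          bottom_tier: int = 3) -> List[Tuple[str, float, str]]:
    """Sort blocks by value (descending) and keep only those that must move."""
    queue = []
    for s in sorted(stats, key=block_value, reverse=True):
        value = block_value(s)
        limits = thresholds[s.tier]
        if value > limits.up and s.tier > top_tier:
            queue.append((s.block_id, value, "up"))
        elif value < limits.down and s.tier < bottom_tier:
            queue.append((s.block_id, value, "down"))
        # Otherwise the block stays where it is and is filtered out of the queue.
    return queue
```

Every entry left in the returned queue carries a migration direction, which matches the property the text states for the filtered queue.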
Step S4: data block migration.
High-value data blocks are migrated to a storage layer in a higher storage tier, and low-value data blocks to a storage layer in a lower storage tier. Once every block in the queue has a determined migration direction, the source and target of each migration must be determined. The migration source is preferentially a node with little remaining space, light load, and normal working mode; if the nodes in normal working mode do not have enough space, nodes in energy-saving mode are automatically switched to normal working mode. The migration target must have enough space to hold the migrated block, and lightly loaded nodes are preferred. The migration source and the migration target must also be sufficiently close to each other. When every block in the queue has a concrete migration source and migration target, the concrete migration tasks are formed. The migration control model uses multiple threads to execute these migration tasks in batches: for example, each batch uses only 50 threads for migration, and each node has at most 5 threads executing migration tasks, so that the migration process affects the access performance of the nodes in the cluster as little as possible.
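The migration control model can be approximated with Python's standard thread pool. This is a sketch under stated assumptions, not the patented implementation: the task tuple layout and the move_block callback are hypothetical, and the per-node limit is applied to both the source and the target of a task, which the text does not spell out. The defaults of 50 threads per batch and 5 threads per node come from the embodiment.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from contextlib import ExitStack
from typing import Callable, Iterable, List, Tuple

Task = Tuple[str, str, str]   # (block_id, source_node, target_node), assumed layout


def run_migrations(tasks: Iterable[Task],
                   move_block: Callable[[str, str, str], None],
                   batch_threads: int = 50,
                   per_node_threads: int = 5) -> List[str]:
    """Run migration tasks batch by batch with a per-node concurrency cap."""
    tasks = list(tasks)
    limits = defaultdict(lambda: threading.Semaphore(per_node_threads))
    limits_lock = threading.Lock()

    def node_limit(node: str) -> threading.Semaphore:
        with limits_lock:             # defaultdict creation is guarded explicitly
            return limits[node]

    def run(task: Task) -> str:
        block_id, source, target = task
        with ExitStack() as stack:
            # Acquire node slots in sorted order so two tasks cannot deadlock.
            for node in sorted({source, target}):
                stack.enter_context(node_limit(node))
            move_block(block_id, source, target)
        return block_id

    done: List[str] = []
    with ThreadPoolExecutor(max_workers=batch_threads) as pool:
        for start in range(0, len(tasks), batch_threads):
            batch = tasks[start:start + batch_threads]
            done.extend(pool.map(run, batch))   # blocks until the batch finishes
    return done
```

The semaphores keep any single node from serving more than per_node_threads migrations at once, which is the mechanism that limits the effect on that node's regular access performance.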
Step S5: adaptive adjustment.
After data migration is completed, the data block access information is updated and monitoring is restarted. In this embodiment, the migration cycle is adjusted in a timely manner according to the trigger conditions of the migration. The step of updating the data block information and restarting monitoring specifically comprises:
storing the valuation results of the data blocks for use in the next valuation;
deleting the access records that the system keeps for data blocks that have been deleted;
updating the threshold of each storage tier according to the actual migration results;
waking the monitoring process to wait for the next data migration.
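The threshold update in the list above can be sketched as follows, using the definition given in step S3: a tier's thresholds record the maximum value among blocks migrated down and the minimum value among blocks migrated up. The dictionary layout, the tuple format of a completed move, and the choice to keep the previous threshold when no block moved in a given direction are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Move = Tuple[int, float, str]   # (source tier, block value, "up" or "down"), assumed


def update_thresholds(thresholds: Dict[int, Dict[str, float]],
                      moves: Iterable[Move]) -> Dict[int, Dict[str, float]]:
    """Return new per-tier thresholds derived from the migrations just completed."""
    moved_up = defaultdict(list)
    moved_down = defaultdict(list)
    for tier, value, direction in moves:
        (moved_up if direction == "up" else moved_down)[tier].append(value)

    # Copy first so the previous round's thresholds are left unmodified.
    updated = {tier: dict(limits) for tier, limits in thresholds.items()}
    for tier, values in moved_up.items():
        updated.setdefault(tier, {})["up"] = min(values)
    for tier, values in moved_down.items():
        updated.setdefault(tier, {})["down"] = max(values)
    return updated
```

A tier with no migrations in the current round keeps its earlier thresholds, so one quiet cycle does not erase the migration history.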
During migration, some nodes in energy-saving mode (located in second-tier and third-tier storage) may be switched to normal working mode, which indicates that the nodes in normal working mode in that storage tier do not have enough remaining space. According to the principle of locality of data access, nodes that are not heavily loaded and have had no access records for 2 consecutive cycles are set to energy-saving mode, and some of the nodes in energy-saving mode are set to normal working mode, so that the free space of that storage tier stays at or above 10% of its total capacity.
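The power-mode adjustment for a single storage tier could be sketched as below. Several details are assumptions rather than statements from the text: the TierNode fields, counting only the free space of nodes in normal working mode toward the 10% requirement, measuring that requirement against the capacity of all nodes in the tier, and waking the sleeping nodes with the most free space first.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TierNode:
    name: str
    capacity: int          # total capacity, in any consistent unit
    free: int              # remaining free space, same unit
    energy_saving: bool
    idle_cycles: int = 0   # consecutive cycles without access records
    heavy_load: bool = False


def adjust_power_modes(nodes: List[TierNode], idle_limit: int = 2,
                       min_free_percent: int = 10) -> List[TierNode]:
    """Put idle nodes to sleep, then wake nodes until the tier has enough free space."""
    for node in nodes:
        if (not node.energy_saving and not node.heavy_load
                and node.idle_cycles >= idle_limit):
            node.energy_saving = True

    total_capacity = sum(n.capacity for n in nodes)

    def usable_free() -> int:
        # Assumption: only nodes in normal working mode offer usable free space.
        return sum(n.free for n in nodes if not n.energy_saving)

    sleeping = sorted((n for n in nodes if n.energy_saving),
                      key=lambda n: n.free, reverse=True)
    for node in sleeping:
        # Integer comparison avoids floating-point rounding at the 10% boundary.
        if usable_free() * 100 >= total_capacity * min_free_percent:
            break
        node.energy_saving = False
    return nodes
```

Running this once per migration cycle for each lower tier keeps the number of active nodes close to the minimum that still satisfies the free-space requirement.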
After step S5, execution returns to step S2, and the data scheduling process repeats in a loop.
The tiered storage energy-saving method of the present invention uses tiered storage media in the Hadoop cluster and fixes the access hotspots on the higher-tier storage. Tasks therefore do not need to be moved; only the storage nodes of the lower tiers need to be placed in a power-saving state. This saves energy in the cluster and also ensures that no individual server in the cluster becomes an access bottleneck, achieving both goals at once.
It will be understood that a person of ordinary skill in the art can make various other corresponding changes and modifications according to the technical concept of the present invention, and all such changes and modifications shall fall within the scope of protection of the claims of the present invention.