Embodiment
The present invention is further described below with reference to the accompanying drawings.
The present invention uses the disk controller to control the state of the disk and to read and write data on the disk. The disk controller is the logic interface circuit between the CPU and the drive: it receives request commands from the CPU and sends seek, read/write, and control signals to the drive, as shown in Figure 1.
The kernel interacts with the disk controller by setting the contents of the controller's registers through I/O ports, and it obtains the result of an operation by reading those registers.
The host controls an IDE disk through two groups of registers on the disk controller. One group is the command register block (Task File Registers), at I/O port addresses 1F0h to 1F7h, which carries commands and command parameters. The other group is the control/diagnostic register block (Control/Diagnostic Registers), at I/O port addresses 3F6h to 3F7h, which controls the disk drive.
The following table lists the Task File Register command register block. To save I/O address space, different registers share the same address and are distinguished by whether the access is a read or a write. For example, I/O port address 1F7h in Table 1 acts as the command register when written and as the status register when read.
To control the IDE disk, the corresponding hexadecimal command code is written to the command register through I/O port 1F7h; the state and result of the command's execution are obtained by reading the status register through the same I/O port 1F7h, as shown in the following table.
The state of the disk can be represented by its power consumption mode. This embodiment takes the Hitachi Travelstar 4K40 notebook IDE disk as an example, with the states defined as shown in the following table:
Switching between states is done by writing the corresponding hexadecimal command code to the command register through I/O port 1F7h. For example, writing E6h to the command register puts the disk into the Sleep state. To query the current power state, E5h is written to the command register: if the disk controller returns FFh in the sector count register, the disk is in the Active or Idle state; if it returns 0, the disk is in the Standby or Sleep state.
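The interpretation of the value returned for command E5h can be sketched as a small helper. This is a minimal illustration that assumes only the two values named in the text (FFh and 0); the enum and function names are hypothetical and are not part of the embodiment.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
enum disk_power_class {
    DISK_ACTIVE_OR_IDLE,    /* sector count register returned FFh */
    DISK_STANDBY_OR_SLEEP,  /* sector count register returned 0 */
    DISK_POWER_UNKNOWN      /* any value the text does not cover */
};

/* Classify the sector count register value read back after command E5h.
 * The value is expected to be a single byte, as returned by inb(). */
enum disk_power_class decode_power_mode(unsigned value)
{
    assert(value <= 0xFF);
    if (value == 0xFF)
        return DISK_ACTIVE_OR_IDLE;
    if (value == 0x00)
        return DISK_STANDBY_OR_SLEEP;
    return DISK_POWER_UNKNOWN;
}
```

Values other than FFh and 0 are reported as unknown here rather than guessed, since the text defines only those two cases.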
Disk state transitions can be implemented under Linux in two ways: by communicating with the disk directly through I/O ports, or by communicating with the disk through the disk driver interface.
1. Direct I/O port communication
This approach directly calls kernel functions such as inb, outb, inw, and outw. When I/O operations must be performed in a specific order, a memory barrier can be placed between the operations. Linux provides four macros for this purpose:
#include <linux/kernel.h>
void barrier(void);  // prevents compiler optimization across the barrier
#include <asm/system.h>
void rmb(void);  // rmb guarantees that read operations are performed in order
void wmb(void);  // wmb guarantees that write operations are performed in order
void mb(void);   // mb guarantees that read and write operations are performed in order
I/O port access must distinguish between 8-bit, 16-bit, and 32-bit ports; the IDE disk registers used here are 8-bit registers. The Linux kernel provides an interface for accessing I/O ports, and this embodiment uses the 8-bit port read and write functions:
#include <asm/io.h>
unsigned inb(unsigned port);  // read one byte from an I/O port
void outb(unsigned char byte, unsigned port);  // write one byte to an I/O port
Disk state transitions can thus be implemented by accessing the disk's I/O ports.
The following example sets a disk to the Standby state.
/* Switch the disk to the Standby state */
int set_standby(void) {
    outb(0xE0, 0x1F7);  // write the Standby command to the command register
    mb();  // read/write barrier
    // Each mask is parenthesized because == binds more tightly than &
    if ((inb(0x1F7) & 0xA0) == 0  // read the status register to check whether the transition succeeded
        && (inb(0x1F1) & 0xFB) == 0)  // read the error register to check for errors
        return 1;  // state transition succeeded
    return 0;  // state transition failed
}
2. Disk driver communication
This approach uses the existing ioctl interface. Underneath, ioctl wraps functions such as inb, outb, inw, and outw, and adds checks, for example for security.
The Linux kernel uses the gendisk structure (declared in <linux/genhd.h>) to represent an individual disk device. The gendisk structure contains a pointer to a request_queue structure, which manages the request queue. The request queue is a queue of request structures (request), and it also stores parameters describing the requests the device can handle: the maximum size, the number of separate segments a single request may contain, the hardware sector size, alignment requirements, and so on.
In the disk driver, each request structure represents one I/O request, which may be a read/write I/O request or another I/O operation. The kernel processes the requests in the request queue through the request handling function, and thereby controls the disk.
The present invention packages a disk state transition instruction into a request structure by writing the instruction into the structure's command member (cmd). The request is then placed on the request queue, and the instruction is executed when the request queue is processed. The implementation under Linux is shown below.
/* Pseudo-code: package an I/O instruction into an I/O request
 * drive: drive device pointer, args: I/O instruction parameters
 */
int dpm_ide_cmd(ide_drive_t *drive, u8 *args) {
    struct request rq;
    int err = 0;
    ide_init_drive_cmd(&rq, args);  // initialize the request from args
    _elv_add_request(drive->queue, &rq);  // add the request to the queue
    ide_do_request();  // execute the request
    if (rq.errors)
        err = -EIO;  // I/O operation error
    return err;
}
With the interface above, disk state transitions can be implemented simply by passing different instruction parameters according to the IDE protocol. The DPM framework implements the four interfaces listed in the following table.
All of these interfaces ultimately call the function ide_cmd_ioctl to read and write the registers, and thereby perform the disk state transition. Taking kernel version 2.6.15 as an example, when the ioctl system call is used to operate on a disk, ioctl() is mapped to the function generic_ide_ioctl() in /drivers/ide/ide.c. This function calls the corresponding low-level function according to the command type (determined by the function parameter cmd). The function ide_cmd_ioctl, which corresponds to the command type HDIO_DRIVE_CMD, is the one the present invention needs; its prototype is as follows:
int ide_cmd_ioctl(ide_drive_t *drive, unsigned int cmd, unsigned long arg);
Meaning of the function parameters:
drive: pointer to the structure representing the specific device, obtainable from the kernel;
cmd: the command type, always HDIO_DRIVE_CMD in this description;
arg: the address of the array that stores the data required by the command.
When different interfaces call ide_cmd_ioctl, the parameter arg passed in differs, because different commands need different data. The parameter arrays for the interfaces are as follows:
CheckPowerMode interface: unsigned char args[4] = {0xE5, 0, 0, 0}; when the function returns, args[2] holds the returned value of the sector count register.
Idle interface: unsigned char args[4] = {0xE3, 0, 0, 0};
Standby interface: unsigned char args[4] = {0xE0, 0, 0, 0};
Sleep interface: unsigned char args[4] = {0xE6, 0, 0, 0};
In a concrete implementation, one can write one's own version of ide_cmd_ioctl modeled on the kernel's ide_cmd_ioctl, trimming the original function where possible to improve efficiency.
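The four parameter arrays differ only in their first byte, so they can be produced by one helper. The following is a minimal sketch; the enum and the function dpm_fill_args are hypothetical names introduced for illustration, and only the command codes E5h, E3h, E0h, and E6h come from the text.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical identifiers for the four DPM interfaces named in the text. */
enum dpm_cmd { DPM_CHECK_POWER_MODE, DPM_IDLE, DPM_STANDBY, DPM_SLEEP };

/* Fill args[4] with the command code in args[0] and zeros elsewhere. */
void dpm_fill_args(enum dpm_cmd cmd, unsigned char args[4])
{
    /* Command codes in enum order: CheckPowerMode, Idle, Standby, Sleep. */
    static const unsigned char opcode[] = { 0xE5, 0xE3, 0xE0, 0xE6 };

    assert((int)cmd >= 0 && (int)cmd <= (int)DPM_SLEEP);
    memset(args, 0, 4);
    args[0] = opcode[cmd];
}
```

The filled array would then be passed as the arg parameter of the HDIO_DRIVE_CMD call described above.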
The present invention obtains the disk load under Linux as follows. The disk load is determined by the arrival of disk read/write I/O requests, so collecting the read/write I/O request trace is critical for analyzing user behavior. The processing flow of a read/write I/O request is shown in Figure 2.
The kernel reads and writes disk data by calling the low-level read/write function ll_rw_block(). The main job of this function is to create a read/write I/O request for the device and insert it into the device's request queue. The actual read/write operation is carried out by the disk's request handling function, ide_do_rw_disk(). If the device is idle when ll_rw_block() creates a read/write I/O request for the disk, the new request is made the current request and ide_do_rw_disk() is called directly to handle it; otherwise the request is inserted into the request queue to wait. When ide_do_rw_disk() finishes handling a request, the request is deleted from the request queue. Each time a request completes, ide_do_rw_disk() is called again through the interrupt callback functions (mainly read_intr() and write_intr()) to handle the remaining requests in the queue. When the request queue is empty, ide_do_rw_disk() sends no further instructions to the disk controller and simply returns.
Because every read/write I/O operation is completed by the request handling function ide_do_rw_disk(), the read/write I/O request trace can be collected inside ide_do_rw_disk(). When the DPM module starts, it initializes a buffer set of 16 buffers of 128 kB each; each time ide_do_rw_disk() handles a read/write I/O request, the request information is collected and written to a buffer; when the DPM module exits, the kernel function sys_write() is used to write the buffered information to disk.
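The buffer set can be pictured with a small user-space sketch. It assumes fixed 16-byte records and assumes that records are dropped once all 16 buffers are full; the text specifies neither, and all identifiers below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DPM_NUM_BUFFERS 16             /* 16 buffers, as in the text */
#define DPM_BUFFER_SIZE (128 * 1024)   /* 128 kB per buffer, as in the text */
#define DPM_RECORD_SIZE 16             /* assumed size of one trace record */

struct dpm_trace_buffers {
    unsigned char data[DPM_NUM_BUFFERS][DPM_BUFFER_SIZE];
    size_t used[DPM_NUM_BUFFERS];  /* bytes filled in each buffer */
    int current;                   /* index of the buffer being filled */
};

/* Reset the fill counters and start again at the first buffer. */
void dpm_trace_init(struct dpm_trace_buffers *b)
{
    assert(b != NULL);
    memset(b->used, 0, sizeof b->used);
    b->current = 0;
}

/* Append one record; returns 0 on success, -1 when every buffer is full. */
int dpm_trace_append(struct dpm_trace_buffers *b, const unsigned char *rec)
{
    assert(b != NULL && rec != NULL);
    if (b->used[b->current] + DPM_RECORD_SIZE > DPM_BUFFER_SIZE) {
        if (b->current + 1 >= DPM_NUM_BUFFERS)
            return -1;             /* assumption: drop the record */
        b->current++;              /* move on to the next buffer */
    }
    memcpy(&b->data[b->current][b->used[b->current]], rec, DPM_RECORD_SIZE);
    b->used[b->current] += DPM_RECORD_SIZE;
    return 0;
}
```

With these sizes each buffer holds 8192 records, so the set holds 131072 records before anything is dropped. The structure is about 2 MB and is best allocated statically or on the heap rather than on the stack.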
The read/write I/O request information collected in ide_do_rw_disk() is as follows:
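collect_disk_trace()
{
    /* byte 0: request direction */
    if (rq->cmd == READ) datum[0] = 'r';
    else datum[0] = 'w';
    /* bytes 1-2: device major and minor numbers */
    datum[1] = MAJOR(rq->rq_dev);
    datum[2] = MINOR(rq->rq_dev);
    /* byte 3: addressing mode, LBA or CHS */
    if (drive->select.b.lba) datum[3] = 'L';
    else datum[3] = 'C';
    /* bytes 4-7: number of sectors, most significant byte first */
    datum[4] = rq->nr_sectors >> 24;
    datum[5] = rq->nr_sectors >> 16;
    datum[6] = rq->nr_sectors >> 8;
    datum[7] = rq->nr_sectors;
    /* bytes 8-11: starting block, most significant byte first */
    datum[8] = block >> 24;
    datum[9] = block >> 16;
    datum[10] = block >> 8;
    datum[11] = block;
    /* bytes 12-15: timestamp in jiffies, most significant byte first */
    datum[12] = jiffies >> 24;
    datum[13] = jiffies >> 16;
    datum[14] = jiffies >> 8;
    datum[15] = jiffies;
    write_to_buffer(datum);
}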
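When the trace is analyzed offline, each 16-byte record has to be unpacked again. The following is a minimal sketch of such a decoder. It assumes that datum is an array of unsigned bytes, so that each shifted value is truncated to its low 8 bits when stored; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of one 16-byte trace record. */
struct dpm_trace_record {
    char op;                 /* 'r' or 'w' */
    uint8_t dev_major;       /* device major number */
    uint8_t dev_minor;       /* device minor number */
    char addressing;         /* 'L' (LBA) or 'C' (CHS) */
    uint32_t nr_sectors;     /* number of sectors in the request */
    uint32_t block;          /* starting block */
    uint32_t jiffies;        /* low 32 bits of the timestamp */
};

/* Reassemble a 32-bit value stored most significant byte first. */
uint32_t dpm_get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Unpack the byte layout written by collect_disk_trace(). */
void dpm_decode_record(const unsigned char *datum,
                       struct dpm_trace_record *out)
{
    assert(datum != NULL && out != NULL);
    out->op = (char)datum[0];
    out->dev_major = datum[1];
    out->dev_minor = datum[2];
    out->addressing = (char)datum[3];
    out->nr_sectors = dpm_get_be32(datum + 4);
    out->block = dpm_get_be32(datum + 8);
    out->jiffies = dpm_get_be32(datum + 12);
}
```

Because the bytes are stored in a fixed order, the decoder gives the same result regardless of the byte order of the machine that analyzes the trace.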
With the collected disk read/write I/O request trace, user behavior can be analyzed conveniently. The disk policy is then optimized on the basis of the trace analysis, so that the disk's energy-saving policy matches user behavior more closely.
Policy optimization is the core of dynamic power management. The disk power management system consists of three parts: the user, the buffer queue, and the disk, as shown in Figure 3.
The general approach of a DPM algorithm is to make a decision at fixed time intervals: a policy (that is, an algorithm) uses the device's current state and load to move the device into a suitable power state. This approach is timer-driven, and its drawback is that even when the device is already in a low-power state and will remain there for a long time, a decision is still made at every interval. This adds to the system's load and consumes energy, and for policies whose decisions consume considerable system resources the cost may outweigh the benefit. Ideally, the next decision would be deferred until a new request arrives.
The present invention introduces event driving, combining event-driven and timer-driven operation to overcome this drawback. The basic idea is to use a global variable α to indicate whether a decision should be made (α = 1: make a decision; α = 0: do not; the initial value of α is 1). When the device has entered a low-power state and no request has arrived, α is set to 0; when a new request arrives (an event occurs), α is set to 1. The value of α is checked before every decision, and a decision is made only when α = 1. The implementation, in which α is written as a, is as follows:
a = 1;
time_driver()
{
    if (a == 1)
    {
        do_policy();
        if (the device has entered a low-power state and no request has arrived)
            a = 0;
    }
}
event_happen()
{
    a = 1;
}
time_driver() and event_happen() are scheduled independently: time_driver() is called once per fixed time interval, whereas event_happen() is called when a request arrives. In Linux, when a request arrives, the system calls a function to add the new request to the request queue (in the Linux 2.6.15 kernel, for example, the function add_request() in /block/ll_rw_blk.c adds disk access requests to the request queue). In a concrete implementation it is therefore sufficient to call event_happen() in that function, or to modify the value of the variable a there directly.
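The combined scheme can be exercised in user space with a small state structure. This sketch is illustrative only: the structure, the dpm_ prefixed function names, and the decision counter that stands in for do_policy() are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dpm_driver {
    bool decide;     /* the flag described in the text (1 = make a decision) */
    int decisions;   /* counts how many times the policy actually ran */
};

void dpm_driver_init(struct dpm_driver *d)
{
    assert(d != NULL);
    d->decide = true;   /* initial value 1, as in the text */
    d->decisions = 0;
}

/* Called on every timer tick. low_power_and_idle is true when the device
 * is in a low-power state and no request has arrived. */
void dpm_time_driver(struct dpm_driver *d, bool low_power_and_idle)
{
    assert(d != NULL);
    if (!d->decide)
        return;              /* skip the decision entirely */
    d->decisions++;          /* stands in for do_policy() */
    if (low_power_and_idle)
        d->decide = false;   /* no more decisions until an event occurs */
}

/* Called when a new request arrives. */
void dpm_event_happen(struct dpm_driver *d)
{
    assert(d != NULL);
    d->decide = true;
}
```

Once the device is in a low-power state with no pending request, further timer ticks cost only a flag check until dpm_event_happen() is called.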
The behavior of each part of the disk power management system can be described by a probability distribution. User behavior can be described by the distribution of request interarrival times. Likewise, the behavior of the device can be described by the distribution of service times. The distribution of state transition times describes the device's behavior when switching between states. The relationship between the interarrival time distribution and the service time distribution describes the behavior of the buffer queue. Together, these probability distributions make up the stochastic optimization problem to be solved.
Because both the interarrival times of user requests and the service times of the disk in the active state follow exponential distributions, the user and the disk form an M/M/1 queuing system.
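For orientation, the standard steady-state results for an M/M/1 queue can be computed as in the sketch below. These are textbook relations and are not taken from this description; the arrival rate lambda, the service rate mu, and all identifiers are illustrative assumptions.

```c
#include <assert.h>

/* Steady-state quantities of an M/M/1 queue (textbook results). */
struct mm1_metrics {
    double utilization;     /* rho = lambda / mu */
    double time_in_system;  /* W   = 1 / (mu - lambda) */
    double time_in_queue;   /* Wq  = rho / (mu - lambda) */
};

/* lambda: mean request arrival rate; mu: mean service rate.
 * Both are in requests per unit time, and mu must exceed lambda. */
struct mm1_metrics mm1_compute(double lambda, double mu)
{
    struct mm1_metrics m;

    assert(lambda > 0.0 && mu > lambda);  /* the queue must be stable */
    m.utilization = lambda / mu;
    m.time_in_system = 1.0 / (mu - lambda);
    m.time_in_queue = m.utilization / (mu - lambda);
    return m;
}
```

For example, with 2 requests per second arriving and 4 requests per second served, the disk is busy half the time and a request spends 0.5 s in the system on average, 0.25 s of it waiting.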
Renewal theory studies stochastic processes whose interarrival times form an independent, identically distributed sequence; such a process can be regarded as restarting at each interval. The Poisson process is a special case in renewal theory, characterized by restarting at every instant: from any instant onward the process is independent of everything that happened before (independent increments) and has the same distribution as the original process (stationary increments). In other words, the process is memoryless (exponential distribution).
Let $X_1$ denote the time of the first event and, for $n > 1$, let $X_n$ denote the time between the $(n-1)$th and the $n$th events. The sequence $\{X_n, n \ge 1\}$ is called the sequence of interarrival times.
If the interarrival times of a counting process $\{N(t), t \ge 0\}$ are independent and identically distributed with an arbitrary distribution function, the process is called a renewal process. Because the intervals are independent and identically distributed, the process restarts, in the probabilistic sense, at each renewal instant.
When the user is in the active state or the idle state, the request interarrival time distribution and the disk service time distribution are both exponential, so the process can be studied as a Poisson process.
The cycle in which the disk switches from the idle state to another state and then back to the idle state can be regarded as a renewal process. The system state transitions are shown in Figure 4. When the system is in the active state, the request interarrival times are exponentially distributed, and so are the disk's service times. When the queue is empty, the system enters the idle state; while the idle time is less than 2 s, the request interarrival time distribution is the same as in the active state. A decision is made in the idle state to switch the disk to the sleep state; the request interarrival times remain exponentially distributed, and when a request arrives the system switches to the active state. Using renewal theory, the search for the optimal policy is converted into a stochastic optimization problem.
The equivalent working time is divided into N equal intervals, and each division point is a decision instant. A policy consists of the set of probabilities of entering the low-power state at each decision instant.
Define $S = \{jh \mid j = 1, 2, \ldots, N\}$ as the set of decision instants of the system in the idle state, where $Nh$ equals the equivalent working time;
$P = \{p(j) \mid j = 1, 2, \ldots, N\}$ is the system's policy optimization decision set, where $p(j)$ is the probability that the system switches the disk from the idle state to the sleep state at instant $jh$;
$E(t_j)$ is the expected time interval from instant $jh$ until the next renewal of the process;
$Q(j)$ is the performance loss during $E(t_j)$;
$C(j)$ is the energy loss during $E(t_j)$;
$P_{St}$ is the energy loss constraint. The optimization problem of minimizing the performance loss subject to the energy loss constraint (or vice versa) can then be constructed as formula (1), in which the expected energy loss must be less than or equal to the energy loss constraint.
Calculating the expected renewal interval
Define $\beta$ as the instant at which the first request arrives in the renewal process, and $P_{Sr}$ as the probability density function of request arrivals. The expected renewal interval can then be expressed as formula (2).
$$E(t_j) = E(t_j \mid \beta \le jh,\ s = jh) + E(t_j \mid \beta > jh,\ s = jh) \qquad (2)$$
$E(t_j \mid \beta \le jh,\ s = jh)$ consists of the time spent in the idle state plus the request service time. $E(t_j \mid \beta > jh,\ s = jh)$ consists of the time spent in the idle state, the time to switch to the sleep state, the time spent in the sleep state, the time to switch to the active state, and the request service time. The user and the disk form an M/M/1 queuing system, so by queuing theory the expected times can be calculated with formula 3 and formula 4.
(3)
Calculating the energy loss
The energy loss is calculated with the formula W = P × t. The value of P in each state is available from the disk's white paper, and formula 3 and formula 4 give the time the disk spends in each state. The energy loss calculation is shown in the following table, where c is the disk's power in each state.
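Applied over several states, W = P × t becomes a sum of per-state products. The sketch below shows that accumulation; the function name and the sample figures in the usage note are hypothetical and are not the Travelstar 4K40 values.

```c
#include <assert.h>
#include <stddef.h>

/* Total energy in joules: the sum over all states of power (watts)
 * multiplied by the time spent in that state (seconds). */
double dpm_energy(const double *power_w, const double *duration_s, size_t n)
{
    double total = 0.0;
    size_t i;

    assert(power_w != NULL && duration_s != NULL);
    for (i = 0; i < n; i++)
        total += power_w[i] * duration_s[i];
    return total;
}
```

For instance, 10 s at 2.0 W followed by 20 s at 0.5 W gives 20 J + 10 J = 30 J.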
Calculating the performance loss
Because the performance loss is proportional to the waiting time, the waiting time is used to measure the performance loss.
Solving formula (1) by nonlinear programming yields the policy optimization decision set $P = \{p(j) \mid j = 1, 2, \ldots, N\}$, from which the distribution function is defined.
The disk policy optimization is implemented as follows. When the disk enters the idle state, the system generates a random number RND (0 < RND ≤ 1). If P(j-1) < RND ≤ P(j), the disk enters the sleep state once it has been idle for jh; if a request arrives within the period jh, the disk enters the active state instead. The following table shows a policy computed with Matlab.
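The selection of j from the random number can be sketched as a lookup in a table of distribution function values. This sketch assumes that P(0) = 0, that the values P(1)..P(N) are non-decreasing, and that the last decision instant is used if RND exceeds P(N) through rounding; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* cdf[j-1] holds P(j) for j = 1..n. Returns the index j (1-based)
 * with P(j-1) < rnd <= P(j), taking P(0) = 0. */
size_t dpm_pick_interval(const double *cdf, size_t n, double rnd)
{
    size_t j;

    assert(cdf != NULL && n > 0 && rnd > 0.0 && rnd <= 1.0);
    for (j = 1; j <= n; j++)
        if (rnd <= cdf[j - 1])
            return j;
    return n;  /* assumption: fall back to the last decision instant */
}

/* Idle time after which the disk should enter the sleep state: j * h. */
double dpm_sleep_delay(const double *cdf, size_t n, double rnd, double h)
{
    return (double)dpm_pick_interval(cdf, n, rnd) * h;
}
```

The linear scan returns the first j whose P(j) reaches RND, which is the j that satisfies the condition in the text when the table is non-decreasing. If a request arrives before the returned delay expires, the caller discards the delay and the disk goes to the active state.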
This embodiment was tested on a Hitachi Travelstar 4K40 notebook IDE disk. Experimental data were collected for the Timeout algorithm, the prediction algorithm, the probabilistic model algorithm, and the renewal theory model algorithm; the data were analyzed with Matlab, and the results of each algorithm were compared.
Each test lasts 8 minutes. After the DPM module is started, X-window is started and a video 5 minutes 20 seconds long is played; a short Hello World program is then written, compiled with gcc, and run; for roughly the last minute and a half nothing is done. This exercises the disk policy optimization algorithm both when the disk is busy and when it is idle.
Fig. 5 shows the disk's state changes when power management is not used. Without power management the disk only switches between the Idle and Active states as the service queue changes, and the disk's performance loss is lowest in this case.
Fig. 6 shows the disk's state changes under the timeout policy (with a Timeout value of 5 s). As the figure shows, the disk state changes as the service request queue changes; the only difference is that the disk enters the Standby state after a delay of 5 s.
Comparing the two figures shows that the disk state transitions are achieved by reading and writing the contents of the relevant registers.