CN101043445B - IO dispatch method in network storage system - Google Patents
IO dispatch method in network storage system
- Publication number
- CN101043445B
- Authority
- CN
- China
- Prior art keywords
- command
- read
- commands
- iscsi
- queue
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Computer And Data Communications (AREA)
Abstract
The invention provides an IO scheduling method for a network storage system. It comprises the following computer-implementable steps: 1. when a new IO command is about to enter the send queue, determine whether it is an IO read/write response command or a short command belonging to data flow control; 2. if it is, go to step 3; otherwise, go to step 4; 3. add the command to the head of the queue and return to step 1; 4. add the command to the tail of the queue and return to step 1. Among the IO commands transmitted along the entire IO path, or only in the switch, the invention gives transmission priority to read/write response commands and to short commands belonging to data flow control, thereby improving overall system performance and resource utilization.
Description
(1) Technical field
The invention relates to network storage, and specifically to storage area networks (Storage Area Network, SAN) within network storage technology.
(2) Background
Network storage technology uses mature network interconnects in place of an IO bus to connect hosts and storage devices. Such networks usually include switches. Fiber-optic networks use fiber-optic switches, while Ethernet-based systems, i.e. iSCSI storage systems, generally use ordinary Ethernet switches. The switches currently used in network storage systems are generally not specifically optimized for network storage applications. In an iSCSI storage system, for example, an ordinary Ethernet switch has no knowledge of the upper-layer application, so it cannot optimize for network storage. The paper "Preemptive Short-Packet-First Scheduling Algorithm in Input Queuing" (《输入排队中抢占式的短包优先调度算法》, 电子学报 (Acta Electronica Sinica), 2005, issue 4) proposed a short-packet-first scheduling optimization for ordinary routers, but that method does not target network storage applications.
(3) Summary of the invention:
The purpose of the invention is to provide an IO scheduling method for a network storage system that improves the system's performance and resource utilization.
This purpose is achieved by the following computer-implementable steps:
1) When a new IO command is about to enter the send queue, determine whether the command is an IO read/write response command or a short command belonging to data flow control;
2) If the command is an IO read/write response command or a short command belonging to data flow control, go to step 3); otherwise, go to step 4);
3) Add the command to the head of the queue and return to step 1);
4) Add the command to the tail of the queue and return to step 1).
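Steps 1) to 4) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Command` type, its `kind` labels, and the `PRIORITY_KINDS` set are hypothetical names introduced here, and only the head/tail insertion rule comes from the text.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical labels for the two command classes the text prioritizes.
PRIORITY_KINDS = {"rw_response", "flow_control"}


@dataclass
class Command:
    name: str
    kind: str  # assumed labels: "rw_response", "flow_control", or "data"


def enqueue(send_queue: deque, cmd: Command) -> None:
    """Steps 1) to 4): priority commands go to the head, all others to the tail."""
    if cmd.kind in PRIORITY_KINDS:
        send_queue.appendleft(cmd)  # step 3)
    else:
        send_queue.append(cmd)      # step 4)


def queue_order(commands) -> list:
    """Enqueue the commands in arrival order and return the resulting queue order."""
    q = deque()
    for c in commands:
        enqueue(q, c)
    return [c.name for c in q]
```

One consequence of strict head insertion is that a priority command arriving later is placed ahead of an earlier priority command that is still queued. The text does not say whether that ordering matters.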
The invention is characterized in that it is an optimization aimed at network storage applications: among the IO commands transmitted along the entire IO path, or only in the switch, it gives transmission priority to read/write response commands and to short commands belonging to data flow control.
In network storage technology, the entire IO path comprises the host, the storage network, and the storage device. For a read or write request, the storage device waits for the request command to arrive from the host; if request commands are transmitted first, the storage device waits less and its utilization rises. After the system has completed the requested operation, it usually has to wait for a response command before it can release the resources allocated for the request. If response commands are transmitted first, the system can release those resources sooner and make them available to subsequent read and write requests, which improves resource utilization. In addition, because these commands are small packets that carry no storage data, transmitting them first can improve system throughput.
Simulation experiments show that raising the scheduling priority of read/write completion response commands along the entire IO path can significantly improve the resource utilization and performance of the storage system. For network storage applications, the invention therefore proposes giving transmission priority to read/write response commands, or short commands belonging to data flow control, among the IO commands transmitted along the entire IO path or in the switch, in order to improve storage system performance and resource utilization.
(4) Description of the drawings
Figure 1 is the flowchart of the iSCSI read command in the embodiment of the invention;
Figure 2 is the flowchart of the iSCSI write command in the embodiment of the invention;
Figure 3 is a schematic diagram of one IO path in a network storage system.
(5) Specific implementation:
The invention is further described below with reference to the drawings and a specific embodiment.
Taking an iSCSI storage system as an example of a network storage system, its read and write processes are as follows:
READ command:
When a user program on the initiator issues a read request to a target device:
① The request is converted into a SCSI command and passed to the iSCSI Low Level Driver (LLD). On receiving the command, the iSCSI LLD sends an iSCSI “SCSI Command” PDU to the target.
② When the target's Front End Target Driver (FETD) receives this command, it decapsulates it back into a SCSI command.
③ The SCSI command is passed to the SCSI Target Middle Level (STML).
④ The STML sends the read data in its buffer back to the initiator as iSCSI “SCSI Data In” PDUs. The initiator's iSCSI LLD stores the received read data in a pre-allocated empty buffer. Once all data has been transferred, the STML returns a response to the initiator, i.e. an iSCSI “SCSI Response” PDU.
⑤ On receiving the response, the iSCSI LLD hands it to the SCSI Middle Level (SML) for processing. The iSCSI LLD and the STML then release all resources allocated for the command.
WRITE command (Figure 2 shows this flow; Figure 1 shows the READ flow above):
When a user program on the initiator issues a write request to a target device:
① The request is converted into a SCSI command and passed to the iSCSI LLD. The iSCSI LLD sends an iSCSI “SCSI Command” PDU to the target.
② When the target's FETD receives this command, it decapsulates it back into a SCSI command.
③ The SCSI command is passed to the STML.
④ The STML allocates the required buffer and sends an iSCSI “Ready to Transfer” PDU to the initiator, notifying it that it may start sending data.
⑤ As soon as the iSCSI LLD receives this PDU, it sends the write data already held in its buffer to the target as iSCSI “SCSI Data Out” PDUs.
⑥ The FETD receives the write data from the initiator and stores it in the pre-allocated empty buffer. It then notifies the STML that the write data has been received.
⑦ Once all data has been transferred, the STML returns the processed response to the initiator, i.e. an iSCSI “SCSI Response” PDU.
⑧ On receiving the response, the iSCSI LLD hands it to the SCSI Middle Level (SML) for processing. The iSCSI LLD and the STML then release all resources allocated for the command.
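The two exchanges described above can be written down as ordered PDU lists, which makes it easy to see how many short PDUs frame each transfer. This is a sketch that follows the simplified flows in the text (for example, a single “Ready to Transfer” PDU per write); the function names and the `(sender, pdu_name)` representation are assumptions made for illustration.

```python
def read_exchange(n_data_pdus: int) -> list:
    """PDUs of one READ, in order, as (sender, pdu_name) pairs."""
    return (
        [("initiator", "SCSI Command")]
        + [("target", "SCSI Data In")] * n_data_pdus
        + [("target", "SCSI Response")]
    )


def write_exchange(n_data_pdus: int) -> list:
    """PDUs of one WRITE, in order, as (sender, pdu_name) pairs."""
    return (
        [("initiator", "SCSI Command"), ("target", "Ready to Transfer")]
        + [("initiator", "SCSI Data Out")] * n_data_pdus
        + [("target", "SCSI Response")]
    )


def short_pdu_count(exchange: list) -> int:
    """Count the PDUs that carry no read or write data."""
    return sum(1 for _, name in exchange if "Data" not in name)
```

In this model a read is framed by two short PDUs and a write by three, regardless of how many data PDUs sit between them.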
As the read and write flows of the iSCSI protocol show (Figures 1 and 2), two kinds of iSCSI PDU travel between the initiator and the target. The first kind carries read or write data: iSCSI “SCSI Data In” PDUs and iSCSI “SCSI Data Out” PDUs. The second kind is tied to resource waiting: the request command iSCSI “SCSI Command” PDU and the response commands iSCSI “Ready to Transfer” PDU and iSCSI “SCSI Response” PDU. Transmitting these short commands first reduces the time resources spend waiting and raises resource utilization. For example, after every read or write the target must send an iSCSI “SCSI Response” PDU to the initiator, and only once the initiator has received it can the initiator and the target release all the resources allocated for that operation.
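One way a queueing device could tell the two kinds of PDU apart is by the opcode in the first byte of the iSCSI Basic Header Segment. The sketch below assumes that approach; the patent does not say how classification is done. The opcode values come from the iSCSI specification rather than from the patent, and the function and constant names are hypothetical.

```python
# Opcode values as defined by the iSCSI specification (low 6 bits of BHS byte 0).
OPCODE_MASK = 0x3F
SCSI_COMMAND = 0x01
SCSI_DATA_OUT = 0x05
SCSI_RESPONSE = 0x21
SCSI_DATA_IN = 0x25
READY_TO_TRANSFER = 0x31

# The short, resource-related PDUs named in the text.
SHORT_COMMAND_OPCODES = {SCSI_COMMAND, SCSI_RESPONSE, READY_TO_TRANSFER}


def opcode_of(bhs: bytes) -> int:
    """Extract the opcode, ignoring the two high bits of the first header byte."""
    if not bhs:
        raise ValueError("empty PDU header")
    return bhs[0] & OPCODE_MASK


def is_short_command(bhs: bytes) -> bool:
    """True for the PDUs the text proposes to transmit first."""
    return opcode_of(bhs) in SHORT_COMMAND_OPCODES
```

Looking only at the opcode is a simplification: it ignores, for instance, any data a command PDU might itself carry.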
Every read or write operation involves many interactions between the initiator and the target, and most of what travels along the IO path during these exchanges consists of read/write request commands, read/write completion commands, or short commands belonging to data flow control. The literature has shown that raising the scheduling priority of short packets in a router can improve performance. The invention proposes giving transmission priority to the short commands among the IO commands transmitted along the entire IO path, or only in the switch, in order to improve performance and resource utilization.
Figure 3 is a schematic diagram of one IO path in a network storage system. The IO path comprises the network sending device 1 for IO commands on the initiator side, the device 2 that processes IO commands in the switch, and the network sending device 3 for IO commands on the target side. Every IO command waits in a queue before one of these devices processes it. Items 4, 5, and 6 are the wait queues in the initiator, the switch, and the target, respectively.
With reference to Figure 3, the processing performed in the initiator-side network sending device 1, the switch device 2, and the target-side network sending device 3 is identical: each repeatedly takes the IO command at the head of its send queue (4, 5, or 6, respectively) and sends it onto the network. The flow is as follows:
(1) Take a command from the send queue.
(2) Send the command onto the correct network.
(3) Go to step (1).
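The sending loop above can be sketched as follows. The `transmit` callback is a hypothetical stand-in for the actual network send, and unlike the endless loop in the text, this version returns once the queue is empty so that it can be run as an example.

```python
from collections import deque


def drain(send_queue: deque, transmit) -> int:
    """Send queued commands from the head until the queue is empty.

    Returns the number of commands sent.
    """
    sent = 0
    while send_queue:
        cmd = send_queue.popleft()  # step (1): take the command at the head
        transmit(cmd)               # step (2): send it onto the network
        sent += 1                   # step (3): loop back to step (1)
    return sent
```

For example, `drain(deque(["a", "b"]), print)` prints `a` and then `b`, because the sender always serves the head of the queue.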
The initiator, the switch, and the target handle insertion into send queues 4, 5, and 6, respectively, as follows:
(1) A new IO command is about to enter the send queue.
(2) Determine whether the command is an IO read/write response command, for example the request command iSCSI “SCSI Command” PDU or the response commands iSCSI “Ready to Transfer” PDU and iSCSI “SCSI Response” PDU. If it is, go to step (3); otherwise, go to step (4).
(3) Add the command to the head of the queue, then go to step (1).
(4) Add the command to the tail of the queue, then go to step (1).
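A small worked calculation illustrates why head insertion shortens the wait for a response PDU at a single queue. The model is an assumption made for illustration only: it ignores the PDU already being transmitted, framing overhead, and the other two queues on the path, and all sizes and link rates are supplied by the caller rather than taken from the patent.

```python
def wait_before_send(queued_sizes, link_bytes_per_sec: float) -> float:
    """Seconds a newly queued PDU waits while everything ahead of it is sent."""
    return sum(queued_sizes) / link_bytes_per_sec


def response_wait(n_data_pdus: int, data_pdu_bytes: int,
                  link_bytes_per_sec: float, prioritized: bool) -> float:
    """Wait of a response PDU that arrives when n data PDUs are already queued.

    With head insertion (prioritized=True) nothing is ahead of the response;
    with tail insertion every queued data PDU is sent first.
    """
    ahead = [] if prioritized else [data_pdu_bytes] * n_data_pdus
    return wait_before_send(ahead, link_bytes_per_sec)
```

Under this model the tail-insertion wait grows linearly with the amount of queued data, while the head-insertion wait stays at zero.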
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2007100718360A CN101043445B (en) | 2007-03-06 | 2007-03-06 | IO dispatch method in network storage system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2007100718360A CN101043445B (en) | 2007-03-06 | 2007-03-06 | IO dispatch method in network storage system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101043445A CN101043445A (en) | 2007-09-26 |
CN101043445B true CN101043445B (en) | 2011-02-23 |
Family
ID=38808657
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2007100718360A Expired - Fee Related CN101043445B (en) | 2007-03-06 | 2007-03-06 | IO dispatch method in network storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101043445B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101694610B (en) * | 2009-10-16 | 2011-11-09 | 成都市华为赛门铁克科技有限公司 | Command processing method, device and memory device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1550993A (en) * | 2003-04-16 | 2004-12-01 | ƽ | Read-first caching system and method |
CN1773475A (en) * | 2004-11-12 | 2006-05-17 | 国际商业机器公司 | An arbitration structure and a method for handling a plurality of memory commands |
CN101000589A (en) * | 2006-12-22 | 2007-07-18 | 清华大学 | Adaptive external storage IO performance optimization method |
Also Published As
Publication number | Publication date |
---|---|
CN101043445A (en) | 2007-09-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20240160584A1 (en) | System and method for facilitating dynamic command management in a network interface controller (nic) | |
CN103428226B (en) | Method and system for communication of user state and inner core | |
CN108536543B (en) | Receive queue with stride-based data dispersal | |
US10152441B2 (en) | Host bus access by add-on devices via a network interface controller | |
US10735513B2 (en) | Remote NVMe activation | |
US8850090B2 (en) | USB redirection for read transactions | |
US8856407B2 (en) | USB redirection for write streams | |
CN103888293A (en) | Data channel scheduling method of multichannel FC network data simulation system | |
US10721302B2 (en) | Network storage protocol and adaptive batching apparatuses, methods, and systems | |
US20130185472A1 (en) | Techniques for improving throughput and performance of a distributed interconnect peripheral bus | |
US8275925B2 (en) | Methods and apparatus for improved serial advanced technology attachment performance | |
US9747233B2 (en) | Facilitating routing by selectively aggregating contiguous data units | |
CN114363269B (en) | Message transmission method, system, equipment and medium | |
US7761529B2 (en) | Method, system, and program for managing memory requests by devices | |
US20060004904A1 (en) | Method, system, and program for managing transmit throughput for a network controller | |
CN115904259B (en) | Processing method and related device of nonvolatile memory standard NVMe instruction | |
CN115643318A (en) | Command Execution Method, Device, Equipment, and Computer-Readable Storage Medium | |
US10581748B2 (en) | Information processing apparatus, information processing method, and non-transitory computer-readable storage medium | |
CN101043445B (en) | IO dispatch method in network storage system | |
CN116760510B (en) | A message sending method, message receiving method, device and equipment | |
CN102868684A (en) | Fiber channel target and realizing method thereof | |
CN114328317B (en) | A method, device and medium for improving communication performance of a storage system | |
WO2021035798A1 (en) | Uart main control system for automatically switching outgoing data in multi-core scene | |
JP4349636B2 (en) | Packet processing apparatus and program | |
US20040111537A1 (en) | Method, system, and program for processing operations |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20110223 Termination date: 20170306 |