US20060288130A1 - Address window support for direct memory access translation - Google Patents
- Publication number: US20060288130A1
- Application number: US11/157,675
- Authority: US (United States)
- Prior art keywords: dma, address, memory, translation, entry
- Legal status: Abandoned (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/10—Address translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/10—Address translation
- G06F12/1081—Address translation for peripheral access to main memory, e.g. direct memory access [DMA]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
- G06F13/28—Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
Definitions
- the present invention relates generally to microprocessors and, more specifically, to input/output (I/O) virtualization.
- I/O management presents a challenge.
- Existing techniques to address the problem of I/O management have a number of disadvantages.
- One technique uses software-only I/O virtualization to support virtual machine (VM) I/O. This technique has limited functionality, performance, and robustness.
- the functionality seen by the guest operating system (OS) and applications is limited by the functionality supported by the virtual devices emulated in the VM monitor (VMM) software.
- the guest I/O operations are trapped by the VMM and proxied or emulated before being submitted to the underlying physical-device hardware, resulting in poor performance.
- FIG. 1 illustrates one embodiment of a computer system
- FIG. 2 illustrates one embodiment of an input/output (I/O) device assignment
- FIG. 3 illustrates one embodiment of virtualization using direct memory access (DMA) remapping
- FIG. 4 illustrates one embodiment of an I/O address translation
- FIG. 5 illustrates one embodiment of a DMA remapping structure
- FIG. 6 illustrates one embodiment of an address window page table entry format
- FIG. 7 illustrates one embodiment of a process for address-window-based DMA address translation
- FIG. 8 illustrates one embodiment of an address window table format
- FIG. 9 illustrates one embodiment of address window flush registers
- FIG. 10 illustrates one embodiment of an address window flush register format
- FIG. 11 illustrates a flow diagram for one embodiment of DMA translation
- FIG. 12 illustrates another embodiment of a computer system.
- FIG. 1 illustrates one embodiment of a computer system 100 .
- Computer system 100 includes a processor 110 , a processor bus 120 , a memory control hub (MCH) 130 , a system memory 140 , an input/output control hub (ICH) 150 , a peripheral bus 155 , a mass storage device/interface 170 , and input/output devices 180 1 to 180 K , and 185 .
- the processor 110 represents a central processing unit of any type of architecture, such as embedded processors, mobile processors, micro-controllers, digital signal processors, superscalar processors, multi-threaded processors, multi-core processors, vector processors, single instruction multiple data (SIMD) computers, complex instruction set computers (CISC), reduced instruction set computers (RISC), very long instruction word (VLIW), or hybrid architecture.
- the processor bus 120 provides interface signals to allow the processor 110 to communicate with other processors or devices, e.g., MCH 130 .
- the processor bus 120 may support a uni-processor or multiprocessor configuration.
- the processor bus 120 may be parallel, sequential, pipelined, asynchronous, synchronous, or any combination thereof.
- MCH 130 provides control and configuration of memory and input/output devices such as the system memory 140 and the ICH 150 .
- MCH 130 may be integrated into a chipset that integrates multiple functionalities such as the isolated execution mode, host-to-peripheral bus interface, and memory control.
- MCH 130 interfaces to the peripheral bus 155 directly or via the ICH 150 .
- the system may also include peripheral buses such as Peripheral Component Interconnect (PCI), PCI Express, accelerated graphics port (AGP), Industry Standard Architecture (ISA) bus, and Universal Serial Bus (USB).
- MCH 130 includes a direct memory access (DMA) remapping circuit 135 .
- DMA remapping circuit 135 maps an I/O device (e.g., one of the I/O devices 180 1 to 180 K and 185 ) into a domain in the system memory 140 in an I/O transaction.
- the I/O transaction is typically a DMA request.
- DMA remapping circuit 135 provides hardware support to facilitate or enhance I/O device assignment and/or management.
- DMA remapping circuit 135 may also be included in any chipset other than MCH 130 , such as ICH 150 . It may also be implemented, partly or wholly, in the processor 110 , or as a separate processor or co-processor to other processors or devices.
- the system memory 140 stores system code and data.
- the system memory 140 is typically implemented with dynamic random access memory (DRAM) or static random access memory (SRAM).
- the system memory may include program code or code segments implementing one embodiment of the invention.
- the system memory includes an operating system (OS) 142 , or a portion of the OS, or a kernel, and an I/O driver 145 . Any one of the elements of the OS 142 or the I/O driver 145 may be implemented by hardware, software, firmware, microcode, or any combination thereof.
- the system memory 140 may also include other programs or data which are not shown.
- ICH 150 has a number of functionalities that are designed to support I/O functions. ICH 150 may also be integrated into a chipset together or separate from the MCH 130 to perform I/O functions. ICH 150 may include a number of interface and I/O functions such as PCI bus interface to interface to the peripheral bus 155 , processor interface, interrupt controller, direct memory access (DMA) controller, power management logic, timer, system management bus (SMBus), universal serial bus (USB) interface, mass storage interface, low pin count (LPC) interface, etc.
- the mass storage device/interface 170 provides storage of archive information such as code, programs, files, data, applications, and operating systems.
- the mass storage device/interface 170 may interface to a compact disk (CD) ROM 172 , a digital video/versatile disc (DVD) 173 , a floppy drive 174 , a hard drive 176 , and any other magnetic or optical storage devices.
- the mass storage device/interface 170 provides a mechanism to read machine-accessible media.
- the machine-accessible media may contain computer readable program code to perform tasks as described in the following.
- the I/O devices 180 1 to 180 K may include any I/O devices to perform I/O functions including DMA requests. They are interfaced to the peripheral bus 155 . Examples of I/O devices 180 1 to 180 K include controller for input devices (e.g., keyboard, mouse, trackball, pointing device), media card (e.g., audio, video, graphics), network card, and any other peripheral controllers.
- the I/O device 185 is interfaced directly to the ICH 150 .
- the peripheral bus 155 is any bus that supports I/O transactions. Examples of the peripheral bus 155 include the PCI bus, PCI Express, etc.
- Elements of one embodiment of the invention may be implemented by hardware, firmware, software or any combination thereof.
- hardware generally refers to an element having a physical structure such as electronic, electromagnetic, optical, electro-optical, mechanical, electro-mechanical parts, etc.
- software generally refers to a logical structure, a method, a procedure, a program, a routine, a process, an algorithm, a formula, a function, an expression, etc.
- firmware generally refers to a logical structure, a method, a procedure, a program, a routine, a process, an algorithm, a formula, a function, an expression, etc., that is implemented or embodied in a hardware structure (e.g., flash memory, read only memory, erasable read only memory).
- firmware may include microcode, a writable control store, or a micro-programmed structure.
- the elements of an embodiment of the present invention are essentially the code segments to perform the necessary tasks.
- the software/firmware may include the actual code to carry out the operations described in one embodiment of the invention, or code that emulates or simulates the operations.
- the program or code segments can be stored in a processor or machine accessible medium or transmitted by a computer data signal embodied in a carrier wave, or a signal modulated by a carrier, over a transmission medium.
- the “processor readable or accessible medium” or “machine readable or accessible medium” may include any medium that can store, transmit, or transfer information.
- Examples of the processor readable or machine accessible medium include an electronic circuit, a semiconductor memory device, a read only memory (ROM), a flash memory, an erasable ROM (EROM), a floppy diskette, a compact disk (CD) ROM, an optical disk, a hard disk, a fiber optic medium, a radio frequency (RF) link, etc.
- the computer data signal may include any signal that can propagate over a transmission medium such as electronic network channels, optical fibers, air, electromagnetic, RF links, etc.
- the code segments may be downloaded via computer networks such as the Internet, intranet, etc.
- the machine accessible medium may be embodied in an article of manufacture.
- the machine accessible medium may include data that, when accessed by a machine, cause the machine to perform the operations described in the following.
- the machine accessible medium may also include program code embedded therein.
- the program code may include machine readable code to perform the operations described in the following.
- the term “data” here refers to any type of information that is encoded for machine-readable purposes. Therefore, it may include program, code, data, file, etc.
- All or part of an embodiment of the invention may be implemented by hardware, software, or firmware, or any combination thereof.
- the hardware, software, or firmware element may have several modules coupled to one another.
- a hardware module is coupled to another module by mechanical, electrical, optical, electromagnetic or any physical connections.
- a software module is coupled to another module by a function, procedure, method, subprogram, or subroutine call, a jump, a link, a parameter, variable, and argument passing, a function return, etc.
- a software module is coupled to another module to receive variables, parameters, arguments, pointers, etc. and/or to generate or pass results, updated variables, pointers, etc.
- a firmware module is coupled to another module by any combination of hardware and software coupling methods above.
- a hardware, software, or firmware module may be coupled to any one of another hardware, software, or firmware module.
- a module may also be a software driver or interface to interact with the operating system running on the platform.
- a module may also be a hardware driver to configure, set up, initialize, send and receive data to and from a hardware device.
- An apparatus may include any combination of hardware, software, and firmware modules.
- One embodiment of the invention may be described as a process which is usually depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed. A process may correspond to a method, a program, a procedure, a method of manufacturing or fabrication, etc.
- in a conventional computing platform, the I/O subsystem components function as part of a single domain and are managed by the operating-system software.
- One embodiment of the invention provides the hardware support required to assign I/O devices in a computing platform to multiple domains.
- a domain is abstractly defined as an isolated environment in the platform, to which a sub-set of the host-physical memory is allocated.
- the host-physical memory is included in the system memory 140 .
- I/O devices that are allowed to directly access the physical memory that is allocated to a domain are referred to as the domain's assigned devices.
- the isolation property of a domain is achieved by blocking access to its physical memory from resources not assigned to it. Multiple isolated domains are supported by ensuring all I/O devices are assigned to some domain (possibly a default domain), and by restricting access from each assigned device only to the physical memory allocated to its domain. Domains may share resources (e.g., memory, I/O devices) or be completely isolated from each other at the discretion of the software or other entity performing the partitioning.
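- The isolation rule above can be illustrated with a minimal sketch. The class names, device names, and address ranges are hypothetical and chosen only for illustration; the sketch models the policy (every device belongs to some domain, and a device may reach only its own domain's memory), not the hardware that enforces it.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """An isolated environment that owns a subset of host-physical memory."""
    name: str
    # Host-physical ranges allocated to the domain, as (start, end), end exclusive.
    ranges: list = field(default_factory=list)

    def owns(self, hpa: int) -> bool:
        return any(start <= hpa < end for start, end in self.ranges)


class Platform:
    """Tracks which domain each device is assigned to (illustrative model)."""

    def __init__(self, default_domain: Domain):
        self.default_domain = default_domain
        self.assignment = {}

    def assign(self, device: str, domain: Domain) -> None:
        self.assignment[device] = domain

    def domain_of(self, device: str) -> Domain:
        # Unassigned devices fall into the default domain, so every device has one.
        return self.assignment.get(device, self.default_domain)

    def access_allowed(self, device: str, hpa: int) -> bool:
        # A device may touch only memory allocated to its own domain.
        return self.domain_of(device).owns(hpa)


default = Domain("default", [(0x0000, 0x1000)])
domain1 = Domain("domain1", [(0x1000, 0x3000)])
platform = Platform(default)
platform.assign("device_a", domain1)

print(platform.access_allowed("device_a", 0x1800))  # inside domain1
print(platform.access_allowed("device_a", 0x0800))  # belongs to the default domain
```

- Shared resources could be modeled by listing the same range in more than one domain; the sketch leaves that choice to whoever sets up the partitioning.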
- Each domain has a view of physical memory, or a physical address space, that may be different than the system view of physical memory.
- An address used by a domain's resources to access its physical address space is referred to as a guest-physical address (GPA).
- the host-physical address (HPA) refers to the system physical address used to access memory.
- a domain is considered relocated if one or more of its GPAs must be translated to a new HPA which differs from the GPA to access its allocated system physical memory.
- a domain is referred to as non-relocated if all of its guest-physical addresses are the same as the host-physical addresses used to access its allocated system physical memory.
- Both relocated and non-relocated domains may be allocated a subset of the available system physical memory and may be prevented from accessing certain portions of the memory.
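- As a small illustration of the two definitions, the following sketch classifies a domain from its GPA-to-HPA mapping. The dictionary representation and the sample addresses are assumptions made for the example.

```python
def is_relocated(gpa_to_hpa: dict) -> bool:
    """Return True if at least one GPA translates to a different HPA."""
    return any(gpa != hpa for gpa, hpa in gpa_to_hpa.items())


# Every guest-physical address equals its host-physical address.
identity_domain = {0x1000: 0x1000, 0x2000: 0x2000}
# One guest-physical address is backed by memory at a different host address.
moved_domain = {0x0000: 0x8000, 0x1000: 0x1000}

print(is_relocated(identity_domain))
print(is_relocated(moved_domain))
```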
- Physical memory protection and partitioning require a physical-address translation mechanism and a protection mechanism that can validate guest-physical addresses generated by a domain's assigned devices, including processors and I/O devices, and translate them to valid host-physical addresses.
- the DMA remapping circuit 135 provides this support.
- DMA remapping: for assigning I/O devices to domains, physical-address translation and protection are applied for DMA requests from all I/O devices in the platform.
- the physical address translation functionality for I/O device DMA requests is referred to as DMA remapping.
- DMA remapping also includes protection mechanisms in addition to the mapping of addresses from one address space to another (e.g., guest-physical addresses to host-physical addresses).
- FIG. 2 is a diagram illustrating one embodiment of I/O device assignment.
- the I/O device assignment is a mapping of an I/O device to a domain in the system memory 140 .
- the mapping is supported by DMA remapping circuit 135 .
- device A 210 is mapped into domain 1 240 in the system memory 140 .
- the domain 1 may have two drivers 242 and 244 for the device A 210 .
- the DMA remapping circuit 135 includes a register set 220 , a DMA remapping structure 222 , and a logic circuit 224 .
- the register set 220 includes a number of registers that provide control or status information used by the DMA remapping structure 222 , the logic circuit 224 , and the programs or drivers for the I/O devices.
- the DMA remapping structure 222 provides the basic structure, storage, or tables used in the remapping or address translation of the guest-physical address to the host-physical address in an appropriate domain.
- the logic circuit 224 includes circuitry that performs the remapping or address translation operations and other interfacing functions.
- the DMA remapping circuit 135 may have different implementations to support different configurations and to provide different capabilities for the remapping or address translation operations.
- I/O device assignment and/or management using the DMA remapping circuit 135 provides a number of usages or applications. Two useful applications are OS robustness applications and virtualization applications.
- OS robustness applications: domain isolation has multiple uses for operating-system software. For example, an OS may define a domain containing its critical code and data structures in memory, and restrict access to this domain from all I/O devices in the system. This allows the OS to limit erroneous or unintended corruption of data and code through incorrect programming of devices by device drivers, or through certain classes of device failures, thereby improving its robustness. Alternatively, an OS may allow a subset of trusted devices to access critical code and data structures in memory but disallow access from other devices.
- the OS may use domains to better manage DMA from legacy 32-bit PCI devices to high memory (above 4 GB). This is achieved by allocating 32-bit devices to one or more domains and programming the I/O-physical-address-translation mechanism to remap the DMA from these devices to high memory. Without such support, the software has to resort to data copying through OS bounce buffers.
- an OS may manage I/O by creating multiple domains and assigning one or more I/O devices to the individual domains.
- the device drivers explicitly register their I/O buffers with the OS, and the OS assigns these I/O buffers to specific domains, using hardware to enforce the DMA domain protections.
- the OS uses the I/O address translation and protection mechanism as an I/O memory management unit (I/O MMU).
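- The buffer-registration usage can be sketched as follows. The class and method names are hypothetical, and the rule that a transfer must fit entirely inside one registered buffer is an assumption of this sketch rather than something the text specifies.

```python
class IoMmuDomain:
    """Illustrative per-domain record of buffers that devices may DMA into."""

    def __init__(self):
        self._buffers = []  # list of (start, length) pairs

    def register_buffer(self, start: int, length: int) -> None:
        # Called when a driver registers an I/O buffer with the OS.
        self._buffers.append((start, length))

    def unregister_buffer(self, start: int, length: int) -> None:
        # Called when the driver releases the buffer.
        self._buffers.remove((start, length))

    def dma_permitted(self, address: int, length: int) -> bool:
        # Assumption: the whole transfer must lie inside one registered buffer.
        return any(
            start <= address and address + length <= start + size
            for start, size in self._buffers
        )


domain = IoMmuDomain()
domain.register_buffer(0x1000, 0x200)
print(domain.dma_permitted(0x1000, 0x200))  # exactly the registered buffer
print(domain.dma_permitted(0x11F0, 0x20))   # runs past the end of the buffer
```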
- Virtualization applications: virtualization technology allows for the creation of one or more virtual machines (VMs) on a single system. Each VM may run simultaneously utilizing the underlying physical hardware resources. Virtual machines allow multiple operating system instances to run on the same processor, offering benefits such as system consolidation, legacy migration, activity partitioning, and security.
- Virtualization architectures typically involve two principal classes of software components: (a) Virtual machine monitors (VMMs) and (b) Virtual Machines (VMs).
- the VMM software layer runs at the highest privilege level and has complete ownership of the underlying system hardware.
- the VMM allows the VMs to share the underlying hardware and yet provides isolation between VMs.
- the limitations of software-only methods for I/O virtualization can be removed by direct assignment of I/O devices to VMs using DMA remapping circuit 135 .
- with direct assignment of devices, the driver for an assigned I/O device runs only in the VM to which it is assigned and is allowed to interact directly with the device hardware without trapping to the VMM.
- the hardware support enables DMA remapping without device specific knowledge in the VMM.
- the VMM restricts itself to a controlling function where it explicitly does the set-up and tear-down of device assignment to VMs. Rather than trapping to the VMM for guest I/O accesses as in the case of software-only methods for I/O virtualization, the VMM requires the guest I/O access trapping only to protect specific resources such as device configuration space accesses, interrupt management etc., that impact system functionality.
- a VMM manages DMA from I/O devices.
- the VMM may map itself to a domain, and map each VM to an independent domain.
- the I/O devices can be assigned to domains, and the physical address translation hardware provided by the DMA remapping circuit 135 may be used to allow the DMA from I/O devices only to the physical memory assigned to the assigned VM's domain.
- the DMA remapping circuit 135 can be programmed to do the necessary GPA-to-HPA translation.
- VMM implementations can choose a combination of software-only I/O virtualization methods and direct device assignment for presenting I/O device resources to a VM.
- FIG. 3 is a diagram illustrating one embodiment of virtualization using DMA remapping.
- the virtualization includes two devices A and B 310 and 312 , the DMA remapping circuit 135 , a VMM or hosting OS 320 , VM 0 340 and VM n 360 .
- the two devices A and B 310 and 312 are two I/O devices that are supported by the two VMs 340 and 360 , respectively.
- DMA remapping circuit 135 directly maps these two devices to the respective VMs 340 and 360 without specific knowledge of the VMM or hosting OS 320 . More or fewer I/O devices and VMs may be supported.
- the VMM or the hosting OS 320 provides support for the underlying hardware of the platform or the system on which it is executing.
- VMs 340 and 360 have similar architectural components but are completely isolated from each other. They are interfaced to the VMM or hosting OS 320 to access the system hardware.
- VM 340 includes applications 342 and 344 . More or fewer applications may be supported. It has a guest OS 346 and a device A driver 350 .
- the device A driver 350 is a driver that drives, controls, interfaces, or supports the device A 310 .
- VM 360 includes applications 362 and 364 . More or fewer applications may be supported. It has a guest OS 366 and a device B driver 370 .
- the guest OS 366 may be the same or different than the guest OS 346 in the VM 340 .
- the device B driver 370 is a driver that drives, controls, interfaces, or supports the device B 312 .
- the DMA remapping architecture provided by the DMA remapping circuit 135 facilitates the assigning of I/O devices to an arbitrary number of domains. Each domain has a physical address space that may be different than the system physical address space.
- the DMA remapping provides the transformation of guest-physical address (GPA) in DMA requests from an I/O device to the corresponding host-physical address (HPA) allocated to its domain.
- the platform may support one or more I/O physical address translation hardware units.
- Each translation hardware unit supports remapping of the I/O transactions originating from within its hardware scope.
- a desktop chipset implementation may expose a single DMA remapping hardware unit that translates all I/O transactions at the memory controller hub (MCH) component.
- a server platform with one or more core chipset components may support independent translation hardware units in each component, each translating DMA requests originating within its I/O hierarchy.
- the architecture supports configurations where these hardware units may share the same translation data structures in system memory or use independent structures depending on software programming.
- the chipset DMA remapping circuit 135 treats the addresses in DMA requests as guest-physical addresses (GPA). DMA remapping circuit 135 may apply the address translation function to the incoming address to convert it to a host-physical address (HPA) before further hardware processing, such as snooping of processor caches or forwarding to the memory controller.
- the address translation function implemented by DMA remapping circuit 135 depends on the physical-memory management supported by the VMM. For example, in usages where the software does host-physical memory allocations as contiguous regions, the DMA translation for converting GPA to HPA may be a simple offset addition. In usages where the VMM manages physical memory at page granularity, DMA remapping circuit 135 may use a memory-resident address translation data structure.
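- The two translation styles mentioned above can be contrasted in a short sketch. The 4 KB page size, the function names, and the dictionary used as a page table are assumptions for illustration; the text does not fix a page size or a table layout at this point.

```python
PAGE_SIZE = 4096  # assumed page size; not specified by the text


def translate_by_offset(gpa: int, base_hpa: int, size: int) -> int:
    """Contiguous allocation: the domain occupies [base_hpa, base_hpa + size)."""
    if not 0 <= gpa < size:
        raise ValueError("GPA outside the domain's address space")
    return base_hpa + gpa


def translate_by_page_table(gpa: int, page_table: dict) -> int:
    """Page-granular allocation: look up the guest page, keep the page offset."""
    guest_page, offset = divmod(gpa, PAGE_SIZE)
    if guest_page not in page_table:
        raise ValueError("GPA is not mapped for this domain")
    return page_table[guest_page] * PAGE_SIZE + offset


# Contiguous region starting at host-physical 0x4000_0000.
print(hex(translate_by_offset(0x1234, 0x4000_0000, 0x10000)))
# Guest page 1 is backed by host page 0x9A; guest page 0 by host page 0x500.
print(hex(translate_by_page_table(0x1234, {0: 0x500, 1: 0x9A})))
```

- The offset form needs only a base and a limit per domain, while the page-table form lets the host pages of a domain be scattered, at the cost of a lookup structure held in memory.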
- FIG. 4 is a diagram illustrating one embodiment of an I/O address translation.
- the I/O address translation includes two I/O devices 1 and 2 410 and 412 , the DMA remapping circuit 135 , a physical memory 420 , and a guest view 430 .
- the I/O devices 1 and 2 410 and 412 are assigned to two separate domains. They perform I/O requests or DMA requests to addresses DMA_ADR.
- DMA remapping circuit 135 maps these two devices to corresponding domains allocated in the physical memory 420 .
- the physical memory 420 is partitioned into memory segments 422 and 424 and memory segments 426 and 428 . More or fewer allocated memory segments may be assigned to one or more of the domains.
- memory segments 422 and 424 are assigned to domain 1 442 and correspond to device 1 410 , and memory segments 426 and 428 are assigned to domain 2 444 and correspond to device 2 412 .
- device 1 410 is mapped to the domain 1 442 and the device 2 412 is mapped or assigned to the domain 2 444 .
- the guest view 430 is a logical view from the guest I/O devices. It includes domain 1 442 and domain 2 444 .
- the domain 1 442 corresponds to the two memory segments 422 and 424 in the physical memory 420 .
- the domain 2 444 corresponds to the two memory segments 426 and 428 .
- domains may be allocated portions of the guest view 430 of physical memory. Each of the domains may be assigned to one or more I/O devices.
- the DMA_ADR address from the device 1 410 is mapped to the DMA_ADR 1 located within the address space from 0 to L of the domain 1 442 .
- the DMA_ADR address from the device 2 412 is mapped to the DMA_ADR 2 located within the address space from 0 to K of the domain 2 444 .
- the software responsible for the creation and management of the domains allocates the physical memory 420 for both domains and sets up the GPA-to-HPA address translation function in the DMA remapping circuit 135 .
- the DMA remapping circuit 135 translates the GPAs generated by the devices 410 and 412 to the appropriate HPAs.
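- The FIG. 4 arrangement, where two devices issue the same DMA_ADR and land in different physical memory, can be sketched as below. The segment sizes and host addresses are invented for the example, and the names are hypothetical; only the structure (two segments per domain, one domain per device) follows the description.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    guest_start: int  # first guest-physical address covered by the segment
    size: int         # segment length in bytes
    host_start: int   # host-physical address backing guest_start


# Each domain is backed by two host-physical segments (made-up addresses).
DOMAIN_SEGMENTS = {
    "domain1": [Segment(0x0000, 0x1000, 0x10000), Segment(0x1000, 0x1000, 0x40000)],
    "domain2": [Segment(0x0000, 0x1000, 0x20000), Segment(0x1000, 0x1000, 0x70000)],
}
DEVICE_DOMAIN = {"device1": "domain1", "device2": "domain2"}


def remap(device: str, dma_adr: int) -> int:
    """Translate a device's DMA address within the domain the device is assigned to."""
    for seg in DOMAIN_SEGMENTS[DEVICE_DOMAIN[device]]:
        if seg.guest_start <= dma_adr < seg.guest_start + seg.size:
            return seg.host_start + (dma_adr - seg.guest_start)
    raise ValueError("address outside the device's domain")


# The same DMA_ADR resolves to different host-physical addresses per device.
print(hex(remap("device1", 0x0800)))
print(hex(remap("device2", 0x0800)))
```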
- FIG. 5 is a diagram illustrating one embodiment of a DMA remapping structure 222 .
- DMA remapping structure 222 receives a source identifier 510 and includes a root table 520 , a number of context tables, of which two are shown 530 0 and 530 m , and a number of address translation structures, of which two are shown 540 0 and 540 m .
- the remapping structure 222 receives the source identifier 510 and a guest-physical address from the I/O device, and translates the guest-physical address in an assigned domain to a host-physical address.
- the translation may be performed using translation tables arranged in a hierarchical manner.
- the translation mechanism starts from the root table 520 and traverses, or walks, through the context tables (e.g., 530 0 and 530 m ) and the address translation structures (e.g., 540 0 and 540 m ).
- the requester identity of the I/O transactions appearing at DMA remapping circuit 135 determines the originating device and the domain that the originating I/O device is assigned to.
- the source identifier 510 is the attribute identifying the originator of an I/O transaction.
- DMA remapping circuit 135 may determine the source identifier 510 of a transaction in implementation specific ways. For example, some I/O bus protocols may provide the originating device identity as part of each I/O transaction. In other cases, such as for chipset integrated devices, the source identifier 510 may be implied based on the chipset's architecture or implementation.
- source identifier 510 is mapped to the requestor identifier provided as part of the I/O transaction header.
- the requestor identifier of a device includes its PCI Bus/Device/Function numbers assigned by the configuration software and uniquely identifies the hardware function that initiates the I/O request.
- the source identifier 510 includes a function number 512 , a device number 514 , and a bus number 516 .
- the function number 512 is K bits wide.
- the device number 514 is L bits wide.
- the bus number 516 is M bits wide.
- the bus number 516 identifies the bus on which the I/O transaction is generated.
- the device number 514 identifies the specific device on the identified bus.
- the function number 512 identifies the specific function of the I/O device.
- the source identifier 510 is used to index or look up the root table 520 and the context tables (e.g., 530 0 and 530 m ). In the example illustrated in FIG. 5 , their paths through the DMA remapping structure 222 are illustrated for two I/O transactions using bus 0 and bus m, respectively.
- the root table 520 stores root entries 525 0 to 525 (2^M − 1) indexed by the source identifier 510 , or the bus number 516 of the source identifier 510 .
- the root entries function as the top-level structure to map the devices on a specific bus to their respective parent domains.
- the root entry 0 525 0 corresponds to the I/O transaction using bus 0 .
- the root entry m 525 m corresponds to the I/O transaction using bus m.
- the root entries 525 0 and 525 m point to the context tables 530 0 and 530 m , respectively. In one embodiment, these entries provide the base address for the corresponding context table.
- the context tables 530 (e.g., 530 0 and 530 m ) store context entries 535 (e.g., 535 0 and 535 m ) referenced by the root entries.
- the context entries 535 map the I/O devices to their corresponding domain(s).
- the device number 514 and the function number 512 are used to obtain the context entry corresponding to the I/O transaction. In one embodiment, they form an index to point to, or reference, the context table referenced by the corresponding root entry.
- the two context entries for the two I/O transactions are the context entry 535 0 in the context table 530 0 and the context entry 535 m in the context table 530 m .
- the context entries 535 0 and 535 m point to the address translation structures 540 0 and 540 m , respectively.
- the address translation structures 540 (e.g., 540 0 and 540 m ) provide the address translation to the host-physical address using the guest-physical address corresponding to the I/O transaction.
- Each of the address translation structures may be a multi-table 550 , a single table 560 , or a base/bound 570 corresponding to the three translation mechanisms using multi tables, single table, and base/bound translations, respectively.
- a regular page size of 4 KB is used. As is known by one skilled in the art, any other sizes may also be used.
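The hierarchical walk described above (root table indexed by bus number, context table indexed by device and function number, then an address translation structure) can be sketched as follows. This is an illustration under stated assumptions, not the circuit's implementation: the field widths K = 3, L = 5 and M = 8 follow the conventional PCI requester ID layout, whereas the text leaves K, L and M open, and the names `SourceId`, `decode_source_id` and `remap` are hypothetical. Tables are modelled as dictionaries and an address translation structure as a plain function.

```python
from typing import Callable, Dict, NamedTuple, Optional

# Assumed widths: function K = 3, device L = 5, bus M = 8 (PCI-style).
FUNC_BITS, DEV_BITS, BUS_BITS = 3, 5, 8


class SourceId(NamedTuple):
    bus: int
    device: int
    function: int


def decode_source_id(raw: int) -> SourceId:
    """Split a raw source identifier into bus, device and function numbers."""
    function = raw & ((1 << FUNC_BITS) - 1)
    device = (raw >> FUNC_BITS) & ((1 << DEV_BITS) - 1)
    bus = (raw >> (FUNC_BITS + DEV_BITS)) & ((1 << BUS_BITS) - 1)
    return SourceId(bus, device, function)


Translator = Callable[[int], int]     # stands in for an address translation structure
ContextTable = Dict[int, Translator]  # indexed by (device, function)
RootTable = Dict[int, ContextTable]   # indexed by bus number


def remap(root: RootTable, raw_source_id: int, gpa: int) -> Optional[int]:
    """Walk root table -> context table -> translation structure."""
    sid = decode_source_id(raw_source_id)
    context_table = root.get(sid.bus)          # root entry selects a context table
    if context_table is None:
        return None                            # no root entry: request not mapped
    context_index = (sid.device << FUNC_BITS) | sid.function
    translator = context_table.get(context_index)
    if translator is None:
        return None                            # no context entry for this function
    return translator(gpa)


# Example: bus 0, device 3, function 1 mapped with a hypothetical fixed offset.
root_table: RootTable = {0: {(3 << FUNC_BITS) | 1: lambda gpa: 0x8000_0000 + gpa}}
raw_id = (0 << (FUNC_BITS + DEV_BITS)) | (3 << FUNC_BITS) | 1
```

Returning `None` for a missing entry is a modelling choice here; it stands in for whatever fault or blocking behavior a given embodiment defines.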
- DMA remapping circuit 135 has a number of registers included in register set 220 shown in FIG. 2 .
- Register set 220 is located in the host-physical address space through a Base Address Register (BAR).
- the translation hardware BAR is exposed to software in an implementation dependent manner. This may be exposed as a PCI configuration space register in one of the chipset integrated devices, such as the memory controller device.
- the BAR provides a minimum of 4K address window.
- a register in the register set 220 may have a number of fields. A field may be asserted or negated.
- assertion implies that the bit is set to a defined logical state (e.g., TRUE, logical one) and negation implies that the bit is reset to a defined logic state that is complementary to the state of the assertion (e.g., FALSE, logical zero).
- the use of an asserted or negated state is arbitrary.
- a field may be asserted to indicate a first state and negated to indicate a second state, or vice versa.
- a field in a register may be programmed, initialized, or configured by DMA remapping circuit 135 and/or by the software. It may also correspond to a specialized hardware circuit or a functionality implemented by a data structure, a function, a routine, or a method.
- fields are grouped into registers. The grouping, formatting, or organization of these fields or bits in the following registers is for illustrative purposes. Other ways of grouping, formatting, or organizing these fields may be used.
- a field may also be duplicated in more than one register.
- a register may have more or fewer than the fields as described.
- registers may be implemented in a number of ways, including as storage elements or memory elements.
- the DMA remapping architecture described above includes DMA that is translated using single or multiple level page tables (TLBs), as shown in FIG. 5 .
- Such an architecture is suitable for legacy software usages (e.g., where the OS or VMM doesn't know about driver DMA usages).
- single or multiple level page table translations may offer good-to-average DMA performance for most I/O devices (as measured by DMA throughput).
- such a system has limitations.
- Another limitation is that for non-legacy software usages (e.g., newer OSs and VMMs) that may know more about driver DMA usages, the current architecture does not provide any means for software to provide DMA usage hints to improve DMA-remapping performance.
- DMA remapping circuit 135 is configured to support address window-based address translation in addition to the single and multi-level page-table based address translation.
- each DMA remapping circuit 135 may support a number of address windows, with the exact number of address windows supported being a function of hardware implementation.
- the system firmware assigns an address window (AW) range (start and end AW numbers) for each DMA remapping circuit 135 .
- the chipset supports an additional caching structure in addition to existing remapping circuit 135 caching structures.
- additional structures are referred to as AWPTR tables implemented for address window translations. AWPTR tables will be discussed below in greater detail.
- a device-physical address (DPA) refers to a target address specified by an I/O device in its DMA requests.
- the DPA address space spans across all I/O devices in the computer system and is sub-divided into multiple AWs.
- each AW covers a contiguous 2 MB region of DPA space.
- an AW 0 may cover DPA 0 to 2 MB.
- an AW 1 may cover DPA 2 MB to 4 MB, etc.
- each AW is described by a DPA-to-HPA translation structure in memory called an Address Window page-table (AWPT).
- the entries in an AWPT are called AW page-table-entries (AWPTE).
- Each AWPTE provides the translation for a 4 KB region (referred to as a slot) within the AW.
- there are 512 slots in an AW.
- the AWPT associated with each AW is 4 KB in size (with 512 AWPTEs).
- AWPTEs are 64-bits in size and have the format illustrated in FIG. 6 .
- an AWPTE includes access control bits (“read” and “write”) specifying whether read accesses and/or write accesses are allowed to the DPA used to access the AWPTE.
- the address field (“ADDR”) specifies the mapping of a subset of the bits in the DPA to HPA.
- the remaining bits (e.g., bits [11:0]) may be passed unmodified from the DPA to the HPA.
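The address-window arithmetic above can be worked through in a short sketch: a 2 MB window divided into 4 KB slots gives 512 slots, and 512 entries of 8 bytes each make a 4 KB AWPT. The exact AWPTE layout is given in FIG. 6, which is not reproduced in this text, so the bit positions chosen here for the read and write bits and the mask used for the ADDR field are assumptions, as are the names `split_dpa` and `translate`.

```python
AW_SIZE = 2 * 1024 * 1024            # each AW covers 2 MB of DPA space
SLOT_SIZE = 4 * 1024                 # each AWPTE translates one 4 KB slot
AWPTE_BYTES = 8                      # AWPTEs are 64 bits
SLOTS_PER_AW = AW_SIZE // SLOT_SIZE  # 512 slots, so the AWPT is 512 * 8 = 4 KB

# Assumed bit positions; the actual layout is defined by FIG. 6.
READ_BIT = 1 << 0
WRITE_BIT = 1 << 1
ADDR_MASK = 0xFFFF_FFFF_FFFF_F000    # assumed: ADDR supplies HPA bits above [11:0]


def split_dpa(dpa):
    """Return (AW number, slot within the AW, byte offset within the slot)."""
    aw = dpa // AW_SIZE
    slot = (dpa % AW_SIZE) // SLOT_SIZE
    offset = dpa % SLOT_SIZE
    return aw, slot, offset


def translate(awpt, dpa, is_write):
    """Translate a DPA through one AWPT (a list of 512 integer AWPTEs).

    Returns the HPA, or None when the access is not permitted (blocked).
    """
    _, slot, offset = split_dpa(dpa)
    awpte = awpt[slot]
    needed = WRITE_BIT if is_write else READ_BIT
    if not awpte & needed:
        return None
    # Bits [11:0] pass through unmodified from the DPA to the HPA.
    return (awpte & ADDR_MASK) | offset


# Example: slot 3 of some AW maps to a hypothetical host page, read-only.
example_awpt = [0] * SLOTS_PER_AW
example_awpt[3] = 0x7654_3000 | READ_BIT
```

With these assumptions, DPA 0x203ABC falls in AW 1, slot 3, offset 0xABC, and a read through `example_awpt` yields HPA 0x76543ABC while a write is blocked.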
- the system software can bind one or more AWs to specific I/O devices.
- the driver registers its DMA buffers (e.g., in the host-physical address space) with the AW bound to the device to generate a DPA mapping.
- the driver for a device identifies its target buffers to the device hardware using its DPA. Addresses in DMA requests from the device are DPAs that are translated by remapping circuit 135 based on the AW bound to the device and the DPA-to-HPA translations for the address window.
- FIG. 7 illustrates one embodiment of a process for translating DMA addresses in the presence of address windows.
- FIG. 7 illustrates the use of AWPTs and AWPTR tables to provide the translation.
- the AWPTR table in remapping circuit 135 has as many entries as the number of address windows supported by its remapping circuit 135 . Entries in an AWPTR table are associated with a specific AW configured on its remapping circuit 135 .
- each AWPTR table entry includes the HPA to the base of the AWPT for the particular AW.
- Each AWPTR table entry is tagged with the device-id of the I/O device to which the associated AW is allocated.
- the AWPTR table structure is memory-mapped to allow software to modify entries in it.
- the base address of the AWPTR table is referred to as AWPTR_TABLE_BASE.
- AWPTR table entries are called AWPTRs, and a specific entry at a particular index in the cache is notated as AWPTR[index].
- FIG. 8 illustrates one embodiment of an AWPTR table structure for a chipset implementing two remapping circuits 135 , with each configured to support two AWs (4 to 5, and 6 to 7, respectively).
- although the AWPTR table is stored in registers, the table appears to software to reside at an address specified by AWPTR_TABLE_BASE.
- each entry in the AWPTR table includes Valid, Tag and Data fields.
- the Valid field indicates whether an entry is valid. In another embodiment, there is no valid bit and the remapping circuit 135 treats all AWPTR table entries as being valid.
- the Tag field indicates the particular device ID to which the entry is associated. For example, in FIG. 8 software has bound AW 4 to an I/O device with device ID 11 and AW 6 to an I/O device with device ID 18 .
- identification of a device originating an access may include information on the bus, device and function within the device.
- the originator of a DMA request is referred to herein as a “device” or “requesting I/O device” and is identified by a “device ID”. However, it should be understood that in other embodiments a single physical device may be identified by one or more device-IDs.
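The AWPTR table entry described above (Valid, Tag and Data fields) and its use during a lookup can be sketched as follows. The class names, the choice to index a circuit's table by the AW number minus the first AW number assigned to that circuit, and the AWPT base addresses are all assumptions made for illustration; only the field names and the FIG. 8 bindings (AW 4 to device ID 11, AW 6 to device ID 18) come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AwptrEntry:
    valid: bool  # Valid field
    tag: int     # Tag field: device ID the AW is bound to
    data: int    # Data field: HPA of the base of the AWPT for this AW


class AwptrTable:
    """Hypothetical model of one remapping circuit's AWPTR table."""

    def __init__(self, first_aw: int, entries: List[AwptrEntry]):
        self.first_aw = first_aw  # lowest AW number assigned to this circuit
        self.entries = entries

    def lookup(self, aw: int, device_id: int) -> Optional[int]:
        """Return the AWPT base for (aw, device_id), or None if not allowed."""
        index = aw - self.first_aw  # assumed indexing scheme
        if not 0 <= index < len(self.entries):
            return None             # AW not handled by this circuit
        entry = self.entries[index]
        if not entry.valid or entry.tag != device_id:
            return None             # unbound AW, or bound to another device
        return entry.data


# FIG. 8 style example; AWPT base addresses are made up.
circuit0 = AwptrTable(4, [AwptrEntry(True, 11, 0x1000_0000),
                          AwptrEntry(False, 0, 0)])
circuit1 = AwptrTable(6, [AwptrEntry(True, 18, 0x2000_0000),
                          AwptrEntry(False, 0, 0)])
```

The tag comparison is what keeps a device from using an address window that software bound to a different device.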
- the remapping circuit 135 supports a set of 16-bit memory mapped registers, called AW_FLUSH registers.
- FIG. 9 illustrates one embodiment of AW_FLUSH registers.
- the AW_FLUSH registers are implemented as a mechanism for software to invalidate translations which may be cached by one or more elements of the remapping circuit 135 .
- one AW_FLUSH register is implemented for each AW supported by a remapping circuit 135 .
- the base address of this memory-mapped register range (AW_FLUSH_BASE) is initialized by platform firmware. For example, as illustrated in FIG. 9 , if a chipset component supports two remapping circuits 135 and the remapping circuits 135 support AWs 4 to 5 and 6 to 7 , respectively, the chipset supports a total of 4 AW_FLUSH registers.
- FIG. 10 illustrates one embodiment of a format for each AW_FLUSH register.
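The register count in the FIG. 9 example can be checked with a small calculation. The text states only that the registers are 16 bits wide, that there is one per AW, and that the range starts at AW_FLUSH_BASE; the assumption that the registers are packed contiguously in AW order starting from the lowest AW number, the function names, and the base address value used below are hypothetical.

```python
AW_FLUSH_REG_BYTES = 2  # each AW_FLUSH register is 16 bits wide


def aw_flush_register_count(aw_ranges):
    """Total AW_FLUSH registers for a list of inclusive (start, end) AW ranges."""
    return sum(end - start + 1 for start, end in aw_ranges)


def aw_flush_address(aw_flush_base, lowest_aw, aw):
    """Address of the AW_FLUSH register for `aw`, assuming contiguous packing."""
    if aw < lowest_aw:
        raise ValueError("AW number is below the first AW in the range")
    return aw_flush_base + (aw - lowest_aw) * AW_FLUSH_REG_BYTES


# Two remapping circuits supporting AWs 4 to 5 and 6 to 7, as in FIG. 9.
fig9_ranges = [(4, 5), (6, 7)]
fig9_count = aw_flush_register_count(fig9_ranges)
```

For the FIG. 9 configuration this gives four registers, matching the total stated in the text.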
- address window based translation occurs if specified in the context entry for an I/O device.
- the context entries may be cached by the remapping circuit 135 , eliminating the need to access memory to make this determination.
- the context entry caching structure is directly accessible by software, allowing software to pre-populate the cache to reduce latencies for the first access to a context entry. In this way, software can guarantee that the worst-case memory access behavior for particular devices is limited to a single memory access, as described below.
- the DMA request is either completed to the HPA specified in the AWPTE, or it is blocked.
- the AWPTE processing is similar to how the leaf PTEs are processed in the remapping circuit 135 for the multi-level I/O page-tables described. If the translation succeeds, in one embodiment, it is cached by the remapping circuit 135 in an I/O translation-lookaside buffer (I/O TLB).
- FIG. 11 is a flow diagram illustrating one embodiment of the operation of a remapping circuit 135 performing address window based translation, and single and multi-level page-table based address translation.
- an I/O device generates a DMA request.
- the DMA-request is processed conventionally via a remapping circuit 135 .
- the context-cache is looked up to determine the translation behavior for the device, processing block 1150 .
- at decision block 1160 , it is determined whether the translation is to be blocked or processed through single-level or multi-level page-tables. If so, the request is processed as described in the conventional remapping circuit 135 architecture described above with respect to FIGS. 1-5 , processing block 1170 . Otherwise, the context-entry for the device specifies address window based translation, and address window based translation is performed as discussed above with respect to FIG. 7 , processing block 1180 .
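The decision at blocks 1150 to 1180 of FIG. 11 amounts to a dispatch on the translation behavior recorded in the device's context entry. The sketch below models that dispatch only; the enum, the dictionary used as a context cache, the callback parameters, and the choice to block devices with no context entry are assumptions for illustration.

```python
from enum import Enum, auto


class TranslationType(Enum):
    BLOCKED = auto()
    PAGE_TABLE = auto()      # single-level or multi-level page tables
    ADDRESS_WINDOW = auto()  # address window based translation


def handle_dma_request(context_cache, device_id, address,
                       page_table_translate, address_window_translate):
    """Return the translated HPA, or None if the request is blocked."""
    # Block 1150: look up the context cache for the device's behavior.
    # Treating an unknown device as blocked is an assumption of this sketch.
    mode = context_cache.get(device_id, TranslationType.BLOCKED)
    # Blocks 1160/1170: blocked or page-table based requests take the
    # conventional path.
    if mode is TranslationType.BLOCKED:
        return None
    if mode is TranslationType.PAGE_TABLE:
        return page_table_translate(device_id, address)
    # Block 1180: the context entry specifies address window translation.
    return address_window_translate(device_id, address)
```

A caller would pass in whichever translation routines it has, for example functions that walk the page tables or the AWPT for the given device.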
- FIG. 12 illustrates another embodiment of computer system 100 .
- the chipset includes a single control hub 1230 as opposed to a separate MCH and ICH.
- memory control is located in processor 110 . Consequently, system memory 140 is coupled to processor 110 .
- the remapping circuit 135 is included in the controller hub 1230 . In another embodiment, remapping circuit 135 is included in processor 110 or in the system memory 140 .
- the above described remapping architecture enables 4K granular DMA address translations similar to multi-level page-tables, and yet offers a worst case performance guarantee which is limited to the overheads associated with a single memory lookup.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
- Bus Control (AREA)
Abstract
An apparatus is disclosed. The apparatus includes a remapping circuit to facilitate access of one or more I/O devices to a memory device for direct memory access (DMA) transactions. The remapping circuit includes a translation mechanism to perform memory address translations for I/O DMA transactions via address window-based translations.
Description
- The present invention relates generally to microprocessors; more specifically, the present invention relates to input/output (I/O) virtualization.
- As microprocessor architecture becomes more and more complex to support high performance applications, I/O management presents a challenge. Existing techniques to address the problem of I/O management have a number of disadvantages. One technique uses software-only I/O virtualization to support virtual machine (VM) I/O. This technique has limited functionality, performance, and robustness.
- The functionality seen by the guest operating system (OS) and applications is limited by the functionality supported by the virtual devices emulated in the VM monitor (VMM) software. The guest I/O operations are trapped by the VMM and proxied or emulated before being submitted to the underlying physical-device hardware, resulting in poor performance.
- In addition, all or parts of the device driver for the hardware device are run as part of the privileged VMM software, which may adversely affect overall robustness of the platform. Techniques using specialized translation structures can only support a specific device or a limited usage model. General I/O memory management units provide only support for I/O virtual address spaces of limited size or complexity.
- The present invention will be understood more fully from the detailed description given below and from the accompanying drawings of various embodiments of the invention. The drawings, however, should not be taken to limit the invention to the specific embodiments, but are for explanation and understanding only.
- FIG. 1 illustrates one embodiment of a computer system;
- FIG. 2 illustrates one embodiment of an input/output (I/O) device assignment;
- FIG. 3 illustrates one embodiment of virtualization using direct memory access (DMA) remapping;
- FIG. 4 illustrates one embodiment of an I/O address translation;
- FIG. 5 illustrates one embodiment of a DMA remapping structure;
- FIG. 6 illustrates one embodiment of an address window page table entry format;
- FIG. 7 illustrates one embodiment of a process for address-window-based DMA address translation;
- FIG. 8 illustrates one embodiment of an address window table format;
- FIG. 9 illustrates one embodiment of address window flush registers;
- FIG. 10 illustrates one embodiment of an address window flush register format;
- FIG. 11 illustrates a flow diagram for one embodiment of DMA translation; and
- FIG. 12 illustrates another embodiment of a computer system.
- A Direct Memory Access (DMA) translation architecture implementing address window based translation is described. Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
- In the following description, numerous details are set forth. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the present invention.
- FIG. 1 illustrates one embodiment of a computer system 100. Computer system 100 includes a processor 110, a processor bus 120, a memory control hub (MCH) 130, a system memory 140, an input/output control hub (ICH) 150, a peripheral bus 155, a mass storage device/interface 170, and input/output devices 180 1 to 180 K, and 185. Note that the system 100 may include more or fewer elements than the above.
- The processor 110 represents a central processing unit of any type of architecture, such as embedded processors, mobile processors, micro-controllers, digital signal processors, superscalar processors, multi-threaded processors, multi-core processors, vector processors, single instruction multiple data (SIMD) computers, complex instruction set computers (CISC), reduced instruction set computers (RISC), very long instruction word (VLIW), or hybrid architecture.
- The processor bus 120 provides interface signals to allow the processor 110 to communicate with other processors or devices, e.g., MCH 130. The processor bus 120 may support a uni-processor or multiprocessor configuration. The processor bus 120 may be parallel, sequential, pipelined, asynchronous, synchronous, or any combination thereof.
- MCH 130 provides control and configuration of memory and input/output devices such as the system memory 140 and the ICH 150. MCH 130 may be integrated into a chipset that integrates multiple functionalities such as the isolated execution mode, host-to-peripheral bus interface, memory control. MCH 130 interfaces to the peripheral bus 155 directly or via the ICH 150. For clarity, not all the peripheral buses are shown. It is contemplated that the system 100 may also include peripheral buses such as Peripheral Component Interconnect (PCI), PCI Express, accelerated graphics port (AGP), Industry Standard Architecture (ISA) bus, and Universal Serial Bus (USB), etc.
- MCH 130 includes a direct memory access (DMA) remapping circuit 135. DMA remapping circuit 135 maps an I/O device (e.g., one of the I/O device 180 1 to 180 K and 185) into a domain in the system memory 140 in an I/O transaction. The I/O transaction is typically a DMA request. DMA remapping circuit 135 provides hardware support to facilitate or enhance I/O device assignment and/or management. DMA remapping circuit 135 may also be included in any chipset other than MCH 130, such as ICH 150. It may also be implemented, partly or wholly, in the processor 110, or as a separate processor or co-processor to other processors or devices.
- The system memory 140 stores system code and data. The system memory 140 is typically implemented with dynamic random access memory (DRAM) or static random access memory (SRAM). The system memory may include program code or code segments implementing one embodiment of the invention. The system memory includes an operating system (OS) 142, or a portion of the OS, or a kernel, and an I/O driver 145. Any one of the elements of the OS 142 or the I/O driver 145 may be implemented by hardware, software, firmware, microcode, or any combination thereof. The system memory 140 may also include other programs or data which are not shown.
- ICH 150 has a number of functionalities that are designed to support I/O functions. ICH 150 may also be integrated into a chipset together or separate from the MCH 130 to perform I/O functions. ICH 150 may include a number of interface and I/O functions such as PCI bus interface to interface to the peripheral bus 155, processor interface, interrupt controller, direct memory access (DMA) controller, power management logic, timer, system management bus (SMBus), universal serial bus (USB) interface, mass storage interface, low pin count (LPC) interface, etc.
- The mass storage device/interface 170 provides storage of archive information such as code, programs, files, data, applications, and operating systems. The mass storage device/interface 170 may interface to a compact disk (CD) ROM 172, a digital video/versatile disc (DVD) 173, a floppy drive 174, and a hard drive 176, and any other magnetic or optic storage devices. The mass storage device/interface 170 provides a mechanism to read machine-accessible media. The machine-accessible media may contain computer readable program code to perform tasks as described in the following.
- The I/O devices 180 1 to 180 K may include any I/O devices to perform I/O functions including DMA requests. They are interfaced to the peripheral bus 155. Examples of I/O devices 180 1 to 180 K include controller for input devices (e.g., keyboard, mouse, trackball, pointing device), media card (e.g., audio, video, graphics), network card, and any other peripheral controllers. The I/O device 185 is interfaced directly to the ICH 150. The peripheral bus 155 is any bus that supports I/O transactions. Examples of the peripheral bus 155 include the PCI bus, PCI Express, etc.
- Elements of one embodiment of the invention may be implemented by hardware, firmware, software or any combination thereof. The term hardware generally refers to an element having a physical structure such as electronic, electromagnetic, optical, electro-optical, mechanical, electro-mechanical parts, etc. The term software generally refers to a logical structure, a method, a procedure, a program, a routine, a process, an algorithm, a formula, a function, an expression, etc. The term firmware generally refers to a logical structure, a method, a procedure, a program, a routine, a process, an algorithm, a formula, a function, an expression, etc that is implemented or embodied in a hardware structure (e.g., flash memory, read only memory, erasable read only memory). Examples of firmware may include microcode, writable control store, micro-programmed structure. When implemented in software or firmware, the elements of an embodiment of the present invention are essentially the code segments to perform the necessary tasks. The software/firmware may include the actual code to carry out the operations described in one embodiment of the invention, or code that emulates or simulates the operations. The program or code segments can be stored in a processor or machine accessible medium or transmitted by a computer data signal embodied in a carrier wave, or a signal modulated by a carrier, over a transmission medium.
The “processor readable or accessible medium” or “machine readable or accessible medium” may include any medium that can store, transmit, or transfer information. Examples of the processor readable or machine accessible medium include an electronic circuit, a semiconductor memory device, a read only memory (ROM), a flash memory, an erasable ROM (EROM), a floppy diskette, a compact disk (CD) ROM, an optical disk, a hard disk, a fiber optic medium, a radio frequency (RF) link, etc. The computer data signal may include any signal that can propagate over a transmission medium such as electronic network channels, optical fibers, air, electromagnetic, RF links, etc. The code segments may be downloaded via computer networks such as the Internet, intranet, etc. The machine accessible medium may be embodied in an article of manufacture. The machine accessible medium may include data that, when accessed by a machine, cause the machine to perform the operations described in the following. The machine accessible medium may also include program code embedded therein. The program code may include machine readable code to perform the operations described in the following. The term “data” here refers to any type of information that is encoded for machine-readable purposes. Therefore, it may include program, code, data, file, etc.
- All or part of an embodiment of the invention may be implemented by hardware, software, or firmware, or any combination thereof. The hardware, software, or firmware element may have several modules coupled to one another. A hardware module is coupled to another module by mechanical, electrical, optical, electromagnetic or any physical connections. A software module is coupled to another module by a function, procedure, method, subprogram, or subroutine call, a jump, a link, a parameter, variable, and argument passing, a function return, etc. A software module is coupled to another module to receive variables, parameters, arguments, pointers, etc. and/or to generate or pass results, updated variables, pointers, etc. A firmware module is coupled to another module by any combination of hardware and software coupling methods above. A hardware, software, or firmware module may be coupled to any one of another hardware, software, or firmware module. A module may also be a software driver or interface to interact with the operating system running on the platform. A module may also be a hardware driver to configure, set up, initialize, send and receive data to and from a hardware device. An apparatus may include any combination of hardware, software, and firmware modules.
- One embodiment of the invention may be described as a process which is usually depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed. A process may correspond to a method, a program, a procedure, a method of manufacturing or fabrication, etc.
- In a standard computing platform, the I/O subsystem components function as part of a single domain and are managed by the operating-system software. One embodiment of the invention provides the hardware support required to assign I/O devices in a computing platform to multiple domains.
- A domain is abstractly defined as an isolated environment in the platform, to which a sub-set of the host-physical memory is allocated. The host-physical memory is included in the system memory 140. I/O devices that are allowed to directly access the physical memory that is allocated to a domain are referred to as the domain's assigned devices. The isolation property of a domain is achieved by blocking access to its physical memory from resources not assigned to it. Multiple isolated domains are supported by ensuring all I/O devices are assigned to some domain (possibly a default domain), and by restricting access from each assigned device only to the physical memory allocated to its domain. Domains may share resources (e.g., memory, I/O devices) or be completely isolated from each other at the discretion of the software or other entity performing the partitioning.
- Each domain has a view of physical memory, or a physical address space, that may be different than the system view of physical memory. An address used by a domain's resources to access its physical address space is referred to as a guest-physical address (GPA). The host-physical address (HPA) refers to the system physical address used to access memory. A domain is considered relocated if one or more of its GPAs must be translated to a new HPA which differs from the GPA to access its allocated system physical memory. A domain is referred to as non-relocated if all of its guest-physical addresses are the same as the host-physical addresses used to access its allocated system physical memory. Both relocated and non-relocated domains may be allocated a subset of the available system physical memory and may be prevented from accessing certain portions of the memory. Physical memory protection and partitioning requires a physical-address translation mechanism and a protection mechanism that can validate guest-physical addresses generated by a domain's assigned devices, including processors and I/O devices, and translate it to valid host-physical addresses. The DMA remapping circuit 135 provides this support.
- For assigning I/O devices to domains, physical-address translation and protection are applied for DMA requests from all I/O devices in the platform. For simplicity, the physical address translation functionality for I/O device DMA requests is referred to as DMA remapping. In discussions that follow, it should be understood that the term “remapping” also includes protection mechanisms in addition to the mapping of addresses from one address space to another (e.g., guest-physical addresses to host-physical addresses).
- FIG. 2 is a diagram illustrating one embodiment of I/O device assignment. The I/O device assignment is a mapping of an I/O device to a domain in the system memory 140. The mapping is supported by DMA remapping circuit 135. As an example, device A 210 is mapped into domain 1 240 in the system memory 140. The domain 1 may have two drivers for device A 210.
- DMA remapping circuit 135 includes a register set 220, a DMA remapping structure 222, and a logic circuit 224. The register set 220 includes a number of registers that provides control or status information used by the DMA remapping structure 222, the logic circuit 224, and the programs or drivers for the I/O devices. The DMA remapping structure 222 provides the basic structure, storage, or tables used in the remapping or address translation of the guest-physical address to the host-physical address in an appropriate domain. The logic circuit 224 includes circuitry that performs the remapping or address translation operations and other interfacing functions. The DMA remapping circuit 135 may have different implementations to support different configurations and to provide different capabilities for the remapping or address translation operations.
- I/O device assignment and/or management using the DMA remapping circuit 135 provides a number of usages or applications. Two useful applications are OS robustness applications and virtualization applications.
- OS Robustness applications: Domain isolation has multiple uses for operating-system software. For example, an OS may define a domain containing its critical code and data structures in memory, and restrict access to this domain from all I/O devices in the system. This allows the OS to limit erroneous or unintended corruption of data and code through incorrect programming of devices by device drivers, or certain classes of device failures, thereby improving its robustness. Alternatively, an OS may allow a subset of trusted devices to access critical code and data structures in memory but disallow access from other devices.
- In another usage, the OS may use domains to better manage DMA from legacy 32-bit PCI devices to high memory (above 4 GB). This is achieved by allocating 32-bit devices to one or more domains and programming the I/O-physical-address-translation mechanism to remap the DMA from these devices to high memory. Without such support, the software has to resort to data copying through OS bounce buffers.
- In a more involved usage, an OS may manage I/O by creating multiple domains and assigning one or more I/O devices to the individual domains. In this usage, the device drivers explicitly register their I/O buffers with the OS, and the OS assigns these I/O buffers to specific domains, using hardware to enforce the DMA domain protections. In this model, the OS uses the I/O address translation and protection mechanism as an I/O memory management unit (I/O MMU).
- Virtualization applications: The virtualization technology allows for the creation of one or more virtual machines (VMs) on a single system. Each VM may run simultaneously utilizing the underlying physical hardware resources. Virtual machines allow multiple operating system instances to run on the same processor offering benefits such as system consolidation, legacy migration, activity partitioning and security.
- Virtualization architectures typically involve two principal classes of software components: (a) Virtual machine monitors (VMMs) and (b) Virtual Machines (VMs). The VMM software layer runs at the highest privilege level and has complete ownership of the underlying system hardware. The VMM allows the VMs to share the underlying hardware and yet provides isolation between VMs.
- The limitations of software-only methods for I/O virtualization can be removed by direct assignment of I/O devices to VMs using
DMA remapping circuit 135. With direct assignment of devices, the driver for an assigned I/O device runs only in the VM to which it is assigned and is allowed to interact directly with the device hardware without trapping to the VMM. The hardware support enables DMA remapping without device specific knowledge in the VMM. - In this model, the VMM restricts itself to a controlling function where it explicitly does the set-up and tear-down of device assignment to VMs. Rather than trapping to the VMM for guest I/O accesses as in the case of software-only methods for I/O virtualization, the VMM requires the guest I/O access trapping only to protect specific resources such as device configuration space accesses, interrupt management etc., that impact system functionality.
- To support direct assignment of I/O devices to VMs, a VMM manages DMA from I/O devices. The VMM may map itself to a domain, and map each VM to an independent domain. The I/O devices can be assigned to domains, and the physical address translation hardware provided by the
DMA remapping circuit 135 may be used to allow the DMA from I/O devices only to the physical memory assigned to the assigned VM's domain. For VMs that may be relocated in physical memory (i.e., the GPA is not identical to the HPA), the DMA remapping circuit 135 can be programmed to do the necessary GPA-to-HPA translation. - With hardware support for I/O device assignment, VMM implementations can choose a combination of software-only I/O virtualization methods and direct device assignment for presenting I/O device resources to a VM.
-
FIG. 3 is a diagram illustrating one embodiment of virtualization using DMA remapping. The virtualization includes two devices A and B 310 and 312, the DMA remapping circuit 135, a VMM or hosting OS 320, VM 0 340 and VM n 360. The DMA remapping circuit 135 directly maps the two devices A and B 310 and 312 to the respective VMs 340 and 360 without specific knowledge of the VMM or hosting OS 320. More or fewer I/O devices and VMs may be supported. - The VMM or the hosting
OS 320 provides support for the underlying hardware of the platform or the system on which it is executing. VMs 340 and 360 go through the VMM or hosting OS 320 to access the system hardware. VM 340 includes applications, a guest OS 346, and a device A driver 350. The device A driver 350 is a driver that drives, controls, interfaces, or supports the device A 310. Similarly, VM 360 includes applications, a guest OS 366, and a device B driver 370. The guest OS 366 may be the same as or different from the guest OS 346 in the VM 340. The device B driver 370 is a driver that drives, controls, interfaces, or supports the device B 312. - The DMA remapping architecture provided by the
DMA remapping circuit 135 facilitates the assigning of I/O devices to an arbitrary number of domains. Each domain has a physical address space that may be different than the system physical address space. The DMA remapping provides the transformation of guest-physical address (GPA) in DMA requests from an I/O device to the corresponding host-physical address (HPA) allocated to its domain. - To support this, the platform may support one or more I/O physical address translation hardware units. Each translation hardware unit supports remapping of the I/O transactions originating from within its hardware scope. For example, a desktop chipset implementation may expose a single DMA remapping hardware unit that translates all I/O transactions at the memory controller hub (MCH) component. A server platform with one or more core chipset components may support independent translation hardware units in each component, each translating DMA requests originating within its I/O hierarchy. The architecture supports configurations where these hardware units may share the same translation data structures in system memory or use independent structures depending on software programming.
- The chipset
DMA remapping circuit 135 treats the addresses in DMA requests as guest-physical addresses (GPA). DMA remapping circuit 135 may apply the address translation function to the incoming address to convert it to a host-physical address (HPA) before further hardware processing, such as snooping of processor caches or forwarding to the memory controller. - In a virtualization context, the address translation function implemented by
DMA remapping circuit 135 depends on the physical-memory management supported by the VMM. For example, in usages where the software does host-physical memory allocations as contiguous regions, the DMA translation for converting GPA to HPA may be a simple offset addition. In usages where the VMM manages physical memory at page granularity, DMA remapping circuit 135 may use a memory-resident address translation data structure. -
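The two translation styles mentioned above can be contrasted with a short sketch. This is an illustration only, not part of the described embodiments: the function names, the dictionary used as a page table, and the 4 KB page size are assumptions made for the example.

```python
PAGE_SHIFT = 12  # assumed 4 KB pages for this illustration


def translate_offset(gpa, domain_base, domain_size):
    """Contiguous allocation: the HPA is the GPA plus the domain's base offset."""
    if not 0 <= gpa < domain_size:
        raise ValueError("GPA outside the domain")
    return domain_base + gpa


def translate_paged(gpa, page_table):
    """Page-granular allocation: map the guest page frame, keep the page offset.

    page_table is a hypothetical dict {guest frame number: host frame number}
    standing in for a memory-resident translation structure.
    """
    guest_frame = gpa >> PAGE_SHIFT
    offset = gpa & ((1 << PAGE_SHIFT) - 1)
    if guest_frame not in page_table:
        raise KeyError("no translation for this guest page")
    return (page_table[guest_frame] << PAGE_SHIFT) | offset
```

The offset form needs no memory lookup at all, while the paged form allows the host pages backing a domain to be scattered, at the cost of consulting a table.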
FIG. 4 is a diagram illustrating one embodiment of an I/O address translation. The I/O address translation includes two I/O devices, device 1 410 and device 2 412, the DMA remapping circuit 135, a physical memory 420, and a guest view 430. The
DMA remapping circuit 135 maps these two devices to corresponding domains allocated in the physical memory 420. The physical memory 420 is partitioned into memory segments. In the example illustrated in FIG. 4, some memory segments belong to domain 1 442 and correspond to device 1 410, and other memory segments belong to domain 2 444 and correspond to device 2 412. In the example illustrated in FIG. 4, device 1 410 is mapped to the domain 1 422 and the device 2 412 is mapped or assigned to the domain 2 428. - The
guest view 430 is a logical view from the guest I/O devices. It includes domain 1 442 and domain 2 444. The domain 1 442 corresponds to two memory segments in the physical memory 420. The domain 2 444 corresponds to two other memory segments. Each domain has its own address space in the guest view 430 of physical memory. Each of the domains may be assigned to one or more I/O devices. The DMA_ADR address from the device 1 410 is mapped to the DMA_ADR1 located within the address space from 0 to L of the domain 1 442. Similarly, the DMA_ADR address from the device 2 412 is mapped to the DMA_ADR2 located within the address space from 0 to K of the domain 2 444. - The software responsible for the creation and management of the domains allocates the
physical memory 420 for both domains and sets up the GPA-to-HPA address translation function in the DMA remapping circuit 135. The DMA remapping circuit 135 translates the GPAs generated by the devices 410 and 412 into the corresponding HPAs. -
FIG. 5 is a diagram illustrating one embodiment of a DMA remapping structure 222. DMA remapping structure 222 receives a source identifier 510 and includes a root table 520, a number of context tables, of which two are shown 530_0 and 530_m, and a number of address translation structures, of which two are shown 540_0 and 540_m. The remapping structure 222 receives the source identifier 510 and a guest-physical address from the I/O device, and translates the guest-physical address in an assigned domain to a host-physical address. The translation may be performed using translation tables arranged in a hierarchical manner. The translation mechanism starts from the root table 520 and traverses, or walks, through the context tables (e.g., 530_0 and 530_m) and the address translation structures (e.g., 540_0 and 540_m). - The requester identity of the I/O transactions appearing at
DMA remapping circuit 135 determines the originating device and the domain that the originating I/O device is assigned to. The source identifier 510 is the attribute identifying the originator of an I/O transaction. DMA remapping circuit 135 may determine the source identifier 510 of a transaction in implementation specific ways. For example, some I/O bus protocols may provide the originating device identity as part of each I/O transaction. In other cases, such as for chipset integrated devices, the source identifier 510 may be implied based on the chipset's architecture or implementation. - For PCI Express devices,
source identifier 510 is mapped to the requester identifier provided as part of the I/O transaction header. The requester identifier of a device includes its PCI Bus/Device/Function numbers assigned by the configuration software and uniquely identifies the hardware function that initiates the I/O request. In one embodiment, the source identifier 510 includes a function number 512, a device number 514, and a bus number 516. In the example illustrated in FIG. 5, the function number 512 is K bits wide, the device number 514 is L bits wide, and the bus number 516 is M bits wide. The bus number 516 identifies the bus on which the I/O transaction is generated. The device number 514 identifies the specific device on the identified bus. The function number 512 identifies the specific function of the I/O device. The source identifier 510 is used to index or look up the root table 520 and the context tables (e.g., 530_0 and 530_m). In the example illustrated in FIG. 5, the paths through the DMA remapping structure 222 are illustrated for two I/O transactions using bus 0 and bus m, respectively. - For PCI Express devices, the root table 520 stores root
entries 525_0 to 525_(2^M−1) indexed by the source identifier 510, or the bus number 516 of the source identifier 510. The root entries function as the top level structure to map devices on a specific bus to their respective parent domains. The root entry 0 525_0 corresponds to the I/O transaction using bus 0. The root entry m 525_m corresponds to the I/O transaction using bus m. The root entries 525_0 and 525_m point to the context tables 530_0 and 530_m, respectively. In one embodiment, these entries provide the base address for the corresponding context table. - The context tables 530 (e.g., 530_0 and 530_m) store context entries 535 (e.g., 535_0 and 535_m) referenced by the root entries. The context entries 535 map the I/O devices to their corresponding domain(s). The
device number 514 and the function number 512 are used to obtain the context entry corresponding to the I/O transaction. In one embodiment, they form an index into the context table referenced by the corresponding root entry. There are 2^M × 2^L × 2^K, or 2^(M+L+K), context entries in all context tables. In one embodiment, K=3, L=5, and M=8, resulting in a total of 64K entries, organized as 2^M (2^8 = 256) context tables of 2^(L+K) (2^8 = 256) entries each. In the example shown in FIG. 5, the two context entries for the two I/O transactions are the context entry 535_0 in the context table 530_0 and the context entry 535_m in the context table 530_m. The context entries 535_0 and 535_m point to the address translation structures 540_0 and 540_m, respectively. - The address translation structures 540 (e.g., 540_0 and 540_m) provide the address translation to the host-physical address using the guest-physical address corresponding to the I/O transaction. Each of the address translation structures may be a multi-table 550, a single table 560, or a base/bound 570 corresponding to the three translation mechanisms using multi tables, single table, and base/bound translations, respectively. In the following description, a regular page size of 4 KB is used. As is known by one skilled in the art, any other sizes may also be used.
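The indexing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the described hardware: the helper names are hypothetical, and the bit packing (function number in the low bits, then device, then bus) is assumed to follow the usual PCI Express requester ID layout, using the K=3, L=5, M=8 widths of the embodiment.

```python
K, L, M = 3, 5, 8  # function, device, and bus number widths from the embodiment


def split_source_id(source_id):
    """Split a source identifier into (bus, device, function) numbers.

    Assumes function occupies the low K bits, device the next L bits,
    and bus the top M bits.
    """
    function = source_id & ((1 << K) - 1)
    device = (source_id >> K) & ((1 << L) - 1)
    bus = (source_id >> (K + L)) & ((1 << M) - 1)
    return bus, device, function


def table_indices(source_id):
    """Return (root table index, context table index) for a source identifier."""
    bus, device, function = split_source_id(source_id)
    root_index = bus                        # one root entry per bus
    context_index = (device << K) | function  # device and function select the context entry
    return root_index, context_index


CONTEXT_TABLES = 2 ** M                 # one context table per root entry
ENTRIES_PER_TABLE = 2 ** (L + K)        # context entries in each context table
TOTAL_CONTEXT_ENTRIES = 2 ** (M + L + K)
```

With these widths, a source identifier of 0x0312 splits into bus 3, device 2, function 2, so it selects root entry 3 and context entry 18 within the context table that the root entry references.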
- To provide software flexible control of
DMA remapping circuit 135, DMA remapping circuit 135 has a number of registers included in register set 220 shown in FIG. 2. Register set 220 is located in the host-physical address space through a Base Address Register (BAR). The translation hardware BAR is exposed to software in an implementation dependent manner. This may be exposed as a PCI configuration space register in one of the chipset integrated devices, such as the memory controller device. In one embodiment, the BAR provides an address window of at least 4 KB. A register in the register set 220 may have a number of fields. A field may be asserted or negated. When a field consists of only a single bit, assertion implies that the bit is set to a defined logical state (e.g., TRUE, logical one) and negation implies that the bit is reset to a defined logic state that is complementary to the state of the assertion (e.g., FALSE, logical zero). In the following, the use of an asserted or negated state is arbitrary. A field may be asserted to indicate a first state and negated to indicate a second state, or vice versa. - A field in a register may be programmed, initialized, or configured by
DMA remapping circuit 135 and/or by the software. It may also correspond to a specialized hardware circuit or a functionality implemented by a data structure, a function, a routine, or a method. In the following, fields are grouped into registers. The grouping, formatting, or organization of these fields or bits in the following registers is for illustrative purposes. Other ways of grouping, formatting, or organizing these fields may be used. A field may also be duplicated in more than one register. A register may have more or fewer fields than described. In addition, registers may be implemented in a number of ways, including as storage elements or memory elements. - The DMA remapping architecture described above includes DMA that is translated using single or multiple level page tables (TLBs), as shown in
FIG. 5 . Such an architecture is suitable for legacy software usages (e.g., where the OS or VMM doesn't know about driver DMA usages). Further, single or multiple level page table translations may offer good-to-average DMA performance for most I/O devices (as measured by DMA throughput). However, such a system has limitations. - One limitation is that the worst case latency introduced by multiple sequential memory accesses for the page-walk on TLB misses is prohibitive for I/O devices whose performance depends on guaranteed worst case (isochronous) DMA performance. Examples of these types of devices include PCI Express devices supporting isochronous DMA (such as a high performance audio controller), display engines of graphics devices, and USB controller devices.
- Another limitation is that for non-legacy software usages (e.g., newer OSs and VMMs) that may know more about driver DMA usages, the current architecture does not provide any means for software to provide DMA usage hints to improve DMA-remapping performance.
- Finally, the memory access latencies for page-walks increase as, for example, platform configurations move to memory controllers implemented within the processor complex.
- Address Window Based DMA Address Translation
- Based on the above-described limitations of DMA remapping architecture,
DMA remapping circuit 135 is configured to support address window-based address translation in addition to the single and multi-level page-table based address translation. Thus, each DMA remapping circuit 135 may support a number of address windows, with the exact number of address windows supported being a function of hardware implementation. In one embodiment, the system firmware assigns an address window (AW) range (start and end AW numbers) for each DMA remapping circuit 135. - In a further embodiment, the chipset supports an additional caching structure in addition to existing
remapping circuit 135 caching structures. These additional structures are referred to as AWPTR tables implemented for address window translations. AWPTR tables will be discussed below in greater detail. - According to one embodiment, a device-physical address (DPA) refers to a target address specified by I/O devices in its DMA requests. In one embodiment, the DPA address space spans across all I/O devices in the computer system and is sub-divided into multiple AWs. In such an embodiment, each AW covers a contiguous 2 MB region of DPA space. For example, an AW0 may cover
DPA 0 to 2 MB, an AW1 may cover DPA 2 MB to 4 MB, etc. Given any DPA, the associated AW number is determined by examining the upper bits of the DPA (e.g., AW#:=DPA[63:21]). - In a further embodiment, each AW is described by a DPA-to-HPA translation structure in memory called an Address Window page-table (AWPT). The entries in an AWPT are called AW page-table-entries (AWPTE). Each AWPTE provides the translation for a 4 KB region (referred to as a slot) within the AW. Thus, there are 512 slots in an AW (2 MB divided by 4 KB), and the AWPT associated with each AW is 4 KB in size (512 AWPTEs of 8 bytes each). According to one embodiment, AWPTEs are 64 bits in size and have the format illustrated in
FIG. 6 . In one embodiment, an AWPTE includes access control bits such as bits (“read” and “write”) specifying if read accesses and/or write accesses are allowed to the DPA used to access the AWPTE. In one embodiment, the address field (“ADDR”) specifies the mapping of a subset of the bits in the DPA to HPA. In one embodiment, the remaining bits (e.g., bits [11:0]) may be passed unmodified from the DPA to the HPA. Many other configurations are possible and do not limit the scope of the invention. - The system software can bind one or more AWs to specific I/O devices. For this, the driver registers its DMA buffers (e.g., in the host-physical address space) with the AW bound to the device to generate a DPA mapping. The driver for a device identifies its target buffers to the device hardware using its DPA. Addresses in DMA requests from the device are DPAs that are translated by remapping
circuit 135 based on the AW bound to the device and the DPA-to-HPA translations for the address window. -
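The address-window arithmetic above can be made concrete with a short sketch. It is an illustration only: the constant and function names are hypothetical, and it assumes the 2 MB window, 4 KB slot, and 64-bit AWPTE sizes of the embodiment described.

```python
AW_SHIFT = 21     # each AW covers a contiguous 2 MB (2**21 bytes) of DPA space
SLOT_SHIFT = 12   # each slot covers 4 KB (2**12 bytes)
AWPTE_BYTES = 8   # AWPTEs are 64 bits in size

SLOTS_PER_AW = 1 << (AW_SHIFT - SLOT_SHIFT)  # number of AWPTEs in one AWPT
AWPT_BYTES = SLOTS_PER_AW * AWPTE_BYTES      # size of one AWPT in memory


def aw_number(dpa):
    """AW# is given by the DPA bits at and above bit 21."""
    return dpa >> AW_SHIFT


def slot_index(dpa):
    """Slot within the AW: the 9 DPA bits between the page offset and the AW#."""
    return (dpa >> SLOT_SHIFT) & (SLOTS_PER_AW - 1)


def page_offset(dpa):
    """Low 12 bits, which may pass unmodified from the DPA to the HPA."""
    return dpa & ((1 << SLOT_SHIFT) - 1)
```

For instance, DPA 0xA03123 falls in AW 5, slot 3, at offset 0x123 within the slot.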
FIG. 7 illustrates one embodiment of a process for translating DMA addresses in the presence of address windows. FIG. 7 illustrates the use of AWPTs and AWPTR tables to provide the translation. In one embodiment, the AWPTR table in a remapping circuit 135 has as many entries as the number of address windows supported by that remapping circuit 135. Each entry in an AWPTR table is associated with a specific AW configured on its remapping circuit 135. - Further, each AWPTR table entry includes the HPA of the base of the AWPT for the particular AW. Each AWPTR table entry is tagged with the device-id of the I/O device to which the associated AW is allocated. In one embodiment, the AWPTR table structure is memory-mapped to allow software to modify entries in it. The base address of the AWPTR table is referred to as AWPTR_TABLE_BASE. AWPTR table entries are called AWPTRs, and a specific entry at a particular index in the cache is notated as AWPTR[index].
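One way to picture the AWPTR table is the following sketch. It is a software model for illustration only, under stated assumptions: the class and method names are hypothetical, an unbound entry is modeled as None, and entries are assumed to be indexed relative to the first AW number in the range assigned to the remapping circuit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Awptr:
    device_id: int  # tag: the I/O device to which the AW is allocated
    awpt_base: int  # HPA of the base of the AWPT for this AW


class AwptrTable:
    """Model of one remapping circuit's AWPTR table for AWs start_aw..end_aw."""

    def __init__(self, start_aw: int, end_aw: int):
        self.start_aw = start_aw
        # One entry per address window supported by the circuit.
        self.entries: List[Optional[Awptr]] = [None] * (end_aw - start_aw + 1)

    def _index(self, aw: int) -> Optional[int]:
        index = aw - self.start_aw
        return index if 0 <= index < len(self.entries) else None

    def bind(self, aw: int, device_id: int, awpt_base: int) -> None:
        """Software binds an AW to a device and records its AWPT base."""
        index = self._index(aw)
        if index is None:
            raise ValueError("AW is outside this circuit's range")
        self.entries[index] = Awptr(device_id, awpt_base)

    def lookup(self, aw: int, device_id: int) -> Optional[int]:
        """Return the AWPT base if the AW is bound to device_id, else None."""
        index = self._index(aw)
        if index is None:
            return None
        entry = self.entries[index]
        if entry is None or entry.device_id != device_id:
            return None
        return entry.awpt_base
```

A table created as `AwptrTable(4, 5)` has two entries; after `bind(4, 11, 0x100000)`, a lookup for AW 4 succeeds only for device ID 11.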
-
FIG. 8 illustrates one embodiment of an AWPTR table structure for a chipset implementing two remapping circuits 135, with each configured to support two AWs (4 to 5, and 6 to 7, respectively). According to one embodiment, although the AWPTR table is stored in registers, the table appears to software to reside at an address specified by AWPTR_TABLE_BASE. - Further, in an embodiment, each entry in the AWPTR table includes Valid, Tag and Data fields. The Valid field indicates whether an entry is valid. In another embodiment, there is no valid bit and the
remapping circuit 135 treats all AWPTR table entries as being valid. The Tag field indicates the particular device ID with which the entry is associated. For example, in FIG. 8 software has bound AW 4 to an I/O device with device ID 11 and AW 6 to an I/O device with device ID 18. In one embodiment, identification of a device originating an access (e.g., determination of device ID) may include information on the bus, device and function within the device. The originator of a DMA request is referred to herein as a “device” or “requesting I/O device” and is identified by a “device ID”. However, it should be understood that in other embodiments a single physical device may be identified by one or more device-IDs. - According to a further embodiment, the
remapping circuit 135 supports a set of 16-bit memory-mapped registers, called AW_FLUSH registers. FIG. 9 illustrates one embodiment of AW_FLUSH registers. The AW_FLUSH registers are implemented as a mechanism for software to invalidate translations which may be cached by one or more elements of the remapping circuit 135. In one embodiment, one AW_FLUSH register is implemented for each AW supported by a remapping circuit 135. - In one embodiment, the base address of this memory-mapped register range (AW_FLUSH_BASE) is initialized by platform firmware. For example, as illustrated in
FIG. 9, if a chipset component supports two remapping circuits 135 and the remapping circuits 135 support AWs 4 to 5 and 6 to 7, respectively, the chipset supports a total of 4 AW_FLUSH registers. FIG. 10 illustrates one embodiment of a format for each AW_FLUSH register. - Referring back to
FIG. 7, address window based translation occurs if specified in the context entry for an I/O device. In one embodiment, the context entries may be cached by the remapping circuit 135, eliminating the need to access memory to make this determination. In a further embodiment, the context entry caching structure is directly accessible by software, allowing software to pre-populate the cache to reduce latencies for the first access to a context entry. In this way, software can guarantee that the worst-case memory access behavior for particular devices is limited to a single memory access, as described below. - If address window based translation is specified,
remapping circuit 135 checks to determine if the AW to which the DPA in the DMA request belongs is one of the AWs bound to the specified device. According to one embodiment, remapping circuit 135 performs this check by first finding the address window number (AW#) corresponding to the DPA in the DMA request (e.g., computed by AW#=DPA[(HAW−1):21], where HAW is the supported physical address width of the system). - Subsequently, it is determined if the AW# is allocated to the
remapping circuit 135 translating the DMA request. If AW# is not allocated to the remapping circuit 135, a translation fault occurs. In one embodiment, a translation fault may generate an interrupt to the processor. In another embodiment, software managing the remapping circuit 135 is responsible for periodically polling the remapping circuit 135 to determine if any translation faults have occurred. If AW# is valid, the associated AWPTR table entry index is found (computed by INDEX=AW#−START_AW). Next, the AWPTR table entry at AWPTR[INDEX] is accessed, and it is determined whether it is tagged with the device-id in the DMA request. If the check succeeds, the AWPTR value indicates the base of the AW page-table. The value in the DPA[20:12] field (9 bits, selecting one of the 512 slots in the AW) is used to fetch the appropriate AWPTE in the AW page-table. - Based on the programming of the AWPTE, the DMA request is either completed to the HPA specified in the AWPTE, or it is blocked. The AWPTE processing is similar to how the leaf PTEs are processed in the
remapping circuit 135 for the multi-level I/O page-tables described. If the translation succeeds, in one embodiment, it is cached by the remapping circuit 135 in an I/O translation-lookaside buffer (I/O TLB). - As discussed above, remapping
circuits 135 perform address window based translation in addition to single and multi-level page-table based address translation. FIG. 11 is a flow diagram illustrating one embodiment of the operation of a remapping circuit 135 performing address window based translation, and single and multi-level page-table based address translation. - At
processing block 1110, an I/O device generates a DMA request. At processing block 1120, the DMA request is processed conventionally via a remapping circuit 135. At decision block 1130, it is determined whether a translation for the address specified in the DMA request (e.g., tagged with the device-id in the transaction) is found in the I/O TLB. If the translation for the address specified in the DMA request is found in the I/O TLB, the translation is completed without any memory access, processing block 1140. This includes DMA that may be translated using single- or multi-level page-tables or through address windows. - If the translation for the address specified in the DMA request is not found in the I/O TLB (e.g., miss detected), the context-cache is looked up to determine the translation behavior for the device,
processing block 1150. At decision block 1160, it is determined whether the translation is to be blocked or processed through single-level or multi-level page-tables. If the translation is to be blocked or processed through single-level or multi-level page-tables, the request is processed as described in the conventional remapping circuit 135 architecture described above with respect to FIGS. 1-5, processing block 1170. However, if the translation is not to be blocked or processed through single-level or multi-level page-tables, the context-entry for the device specifies address window based translation. Consequently, address window based translation is performed as discussed above with respect to FIG. 7, processing block 1180. -
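The flow of FIG. 11 together with the address window walk of FIG. 7 can be sketched in software as follows. This is a simplified model for illustration, not the described circuit. All names are hypothetical; the I/O TLB and memory are modeled as dictionaries, an AWPTE is modeled as a (host frame, permission bits) pair because the FIG. 6 format is not reproduced here, and the context entry is assumed to be already cached so that only the AWPTE fetch counts as a memory access.

```python
READ, WRITE = 1, 2  # hypothetical encodings of the AWPTE read and write bits


class TranslationFault(Exception):
    pass


def translate_dma(dpa, device_id, access, ctx_mode, iotlb, start_aw, awptr, memory):
    """Translate one DMA request; return (hpa, number of memory accesses).

    awptr  : list of (device_id, awpt_base) tuples or None, one per AW
    memory : dict mapping the HPA of an AWPTE to (host_frame, permission_bits)
    """
    key = (device_id, dpa >> 12)
    cached = iotlb.get(key)
    if cached is not None:
        # I/O TLB hit: translation completes without any memory access.
        frame, perms = cached
        accesses = 0
    else:
        if ctx_mode != "aw":
            # Blocked or page-table based translation is outside this sketch.
            raise NotImplementedError("only address window translation is modeled")
        index = (dpa >> 21) - start_aw          # INDEX = AW# - START_AW
        if not 0 <= index < len(awptr):
            raise TranslationFault("AW not allocated to this remapping circuit")
        entry = awptr[index]
        if entry is None or entry[0] != device_id:
            raise TranslationFault("AW not bound to the requesting device")
        slot = (dpa >> 12) & 0x1FF              # one of 512 slots in the AW
        frame, perms = memory[entry[1] + 8 * slot]  # the single memory lookup
        accesses = 1
    if not perms & access:
        raise TranslationFault("access type not permitted by the AWPTE")
    iotlb[key] = (frame, perms)                 # cache the successful translation
    return (frame << 12) | (dpa & 0xFFF), accesses
```

In this model a first request to a slot costs one memory access and a repeated request to the same slot costs none, which mirrors the single-lookup worst case that the address window mechanism is meant to provide.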
FIG. 12 illustrates another embodiment of computer system 100. In this embodiment, the chipset includes a single control hub 1230 as opposed to a separate MCH and ICH. In addition, memory control is located in processor 110. Consequently, system memory 140 is coupled to processor 110. In one embodiment, the remapping circuit 135 is included in the control hub 1230. In another embodiment, remapping circuit 135 is included in processor 110 or in the system memory 140. - The above described remapping architecture enables DMA address translations at 4 KB granularity, similar to multi-level page-tables, and yet offers a worst case performance guarantee which is limited to the overheads associated with a single memory lookup.
- Whereas many alterations and modifications of the present invention will no doubt become apparent to a person of ordinary skill in the art after having read the foregoing description, it is to be understood that any particular embodiment shown and described by way of illustration is in no way intended to be considered limiting. Therefore, references to details of various embodiments are not intended to limit the scope of the claims, which in themselves recite only those features regarded as the invention.
Claims (27)
1. An apparatus comprising a remapping circuit to facilitate access of one or more input/output (I/O) devices to a memory device using direct memory access (DMA) transactions, the remapping circuit including a first translation mechanism to perform memory address translations for I/O DMA transactions via address window-based translations.
2. The apparatus of claim 1 further comprising a second translation mechanism to perform memory address translations for I/O DMA transactions via at least one of single-level page tables and multi-level page tables.
3. The apparatus of claim 1 wherein the first translation mechanism includes an address window pointer table (AWPTR) to perform the address window-based translations.
4. The apparatus of claim 3 wherein the AWPTR comprises at least one entry including a base address of an address window page table (AWPT) for at least one address window (AW).
5. The apparatus of claim 4 wherein each AWPTR entry is tagged with a device ID indicating an I/O device to which an associated AW is allocated.
6. The apparatus of claim 5 wherein the device ID further includes information indicating at least one of a bus, a device, and a function within the device.
7. The apparatus of claim 4 wherein each AWPT entry provides a translation for a 4 KB slot within the AW.
8. The apparatus of claim 4 wherein each AWPT entry includes access control bits specifying if read accesses or write accesses are allowed to a device-physical address used to access the AWPT entry.
9. A method comprising:
receiving a direct memory access (DMA) request at a remapping circuit from a requesting input/output (I/O) device;
determining if the DMA request is permitted to complete; and
translating a device-physical address (DPA) to a host-physical address (HPA) in memory if the access is permitted.
10. The method of claim 9 wherein determining if the DMA request is permitted to complete comprises:
calculating a requested address window (AW) associated with the DPA;
determining if the requested AW is bound to the remapping circuit; and
determining if the requested AW is bound to the requesting I/O device.
11. The method of claim 10 wherein a translation fault occurs if it is determined that the requested AW is not bound to the requesting I/O device.
12. The method of claim 10 wherein the translation fault occurs if it is determined that the requested AW is not bound to the remapping circuit.
13. The method of claim 9 further comprising:
finding an associated AW pointer table entry index for the DPA; and
looking up the AW pointer table entry at the index.
14. The method of claim 13 further comprising determining whether the AW pointer table entry is tagged with a device ID corresponding to the requesting I/O device.
15. The method of claim 13 further comprising accessing an AW page table entry (AWPTE) in memory associated with the AW pointer table entry and the DPA.
16. The method of claim 15 further comprising calculating the HPA associated with the DPA using the AWPTE.
17. The method of claim 16 further comprising:
determining if the DMA request is allowed to complete based on at least one permission bit in the AWPTE and a type of the DMA request; and
preventing the completion of the DMA request if the at least one permission bit does not allow the type of the DMA request.
18. The method of claim 9 further comprising caching the completed translation.
19. A computer system comprising:
a main memory device;
one or more input/output (I/O) devices to access the memory device via direct memory access (DMA); and
a memory controller, coupled to the memory device, having a DMA remapping circuit to facilitate the access of the one or more I/O devices to the memory device, the DMA remapping circuit comprising:
a first translation mechanism to perform memory address translations for I/O DMA transactions via address window-based translations.
20. The computer system of claim 19 further comprising a second translation mechanism to perform memory address translations for I/O DMA transactions via at least one of single-level page tables and multi-level page tables.
21. The computer system of claim 19 wherein the memory device is subdivided into at least one address windows (AWs).
22. The computer system of claim 21 wherein the memory device further comprises an AW page table (AWPT) that defines a device-physical address (DPA) to host-physical address (HPA) translation.
23. The computer system of claim 22 wherein the AWPT comprises at least one AW page table entry (AWPTE), said AWPTE providing a translation for at least one address within the AW.
24. The computer system of claim 21 wherein each of the at least one AWs is bound to an I/O device.
25. The computer system of claim 22 wherein the first translation mechanism includes a table (AWPTR) to perform the address window-based translations.
26. The computer system of claim 22 wherein the AWPTR comprises at least one entry, said entry including a base of the AWPT for a particular AW.
27. The computer system of claim 26 wherein each AWPTR entry is tagged with a device ID indicating an I/O device to which an associated AW is allocated.
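The table organization recited in claims 21 to 27 (one AWPTR entry per address window, each holding the base of that window's AWPT and tagged with a device ID) can be sketched as below. This is a hedged illustration only: the `AWPTREntry` type, the helper names, and the back-to-back placement of the AWPTs in memory are assumptions made for the example and are not taken from the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AWPTREntry:
    """Hypothetical AWPTR entry: AWPT base address plus owning device ID tag."""
    awpt_base: int
    device_id: int


def build_awptr(bindings, awpt_region_base, awpt_size):
    """Build an AWPTR with one entry per address window.

    bindings[i] is the device ID that window i is bound to (claim 24).
    Each window gets its own AWPT, placed back to back starting at
    awpt_region_base (an assumed layout, not part of the claims).
    """
    return [
        AWPTREntry(awpt_base=awpt_region_base + i * awpt_size, device_id=dev)
        for i, dev in enumerate(bindings)
    ]


def windows_for_device(awptr, device_id):
    """Return the indices of all address windows allocated to device_id."""
    return [i for i, entry in enumerate(awptr) if entry.device_id == device_id]
```

For example, `build_awptr([3, 5, 3], 0x100000, 0x1000)` describes three windows, of which windows 0 and 2 are allocated to device 3 and window 1 to device 5.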
Priority Applications (8)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/157,675 US20060288130A1 (en) | 2005-06-21 | 2005-06-21 | Address window support for direct memory access translation |
KR1020077029979A KR101060395B1 (en) | 2005-06-21 | 2006-06-20 | Address Window Support for Direct Memory Access Translation |
TW095122090A TWI363967B (en) | 2005-06-21 | 2006-06-20 | A computer hardware apparatus to utilize address window support for direct memory access translation |
GB0722953A GB2441084A (en) | 2005-06-21 | 2006-06-20 | Address window support for direct memory access translation |
CN2006800221864A CN101203838B (en) | 2005-06-21 | 2006-06-20 | Address window support for direct memory access translation |
PCT/US2006/024515 WO2007002425A1 (en) | 2005-06-21 | 2006-06-20 | Address window support for direct memory access translation |
DE112006001642T DE112006001642T5 (en) | 2005-06-21 | 2006-06-20 | Address window support for direct memory access translation |
US12/648,461 US7984203B2 (en) | 2005-06-21 | 2009-12-29 | Address window support for direct memory access translation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/157,675 US20060288130A1 (en) | 2005-06-21 | 2005-06-21 | Address window support for direct memory access translation |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/648,461 Continuation US7984203B2 (en) | 2005-06-21 | 2009-12-29 | Address window support for direct memory access translation |
Publications (1)
Publication Number | Publication Date |
---|---|
US20060288130A1 (en) | 2006-12-21 |
Family
ID=36992652
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/157,675 Abandoned US20060288130A1 (en) | 2005-06-21 | 2005-06-21 | Address window support for direct memory access translation |
US12/648,461 Expired - Fee Related US7984203B2 (en) | 2005-06-21 | 2009-12-29 | Address window support for direct memory access translation |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/648,461 Expired - Fee Related US7984203B2 (en) | 2005-06-21 | 2009-12-29 | Address window support for direct memory access translation |
Country Status (7)
Country | Link |
---|---|
US (2) | US20060288130A1 (en) |
KR (1) | KR101060395B1 (en) |
CN (1) | CN101203838B (en) |
DE (1) | DE112006001642T5 (en) |
GB (1) | GB2441084A (en) |
TW (1) | TWI363967B (en) |
WO (1) | WO2007002425A1 (en) |
Cited By (54)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070214339A1 (en) * | 2006-03-10 | 2007-09-13 | Microsoft Corporation | Selective address translation for a resource such as a hardware device |
US20070226450A1 (en) * | 2006-02-07 | 2007-09-27 | International Business Machines Corporation | Method and system for unifying memory access for CPU and IO operations |
US20070233455A1 (en) * | 2006-03-28 | 2007-10-04 | Zimmer Vincent J | Techniques for unified management communication for virtualization systems |
US20080104320A1 (en) * | 2006-10-26 | 2008-05-01 | Via Technologies, Inc. | Chipset and northbridge with raid access |
US20080114916A1 (en) * | 2006-11-13 | 2008-05-15 | Hummel Mark D | Filtering and Remapping Interrupts |
US20080114906A1 (en) * | 2006-11-13 | 2008-05-15 | Hummel Mark D | Efficiently Controlling Special Memory Mapped System Accesses |
US20090037614A1 (en) * | 2007-07-31 | 2009-02-05 | Ramakrishna Saripalli | Offloading input/output (I/O) virtualization operations to a processor |
US20090235249A1 (en) * | 2008-03-11 | 2009-09-17 | Yuji Kobayashi | Virtual computer system and method of controlling the same |
US7685371B1 (en) * | 2006-04-19 | 2010-03-23 | Nvidia Corporation | Hierarchical flush barrier mechanism with deadlock avoidance |
US20100100649A1 (en) * | 2004-12-29 | 2010-04-22 | Rajesh Madukkarumukumana | Direct memory access (DMA) address translation between peer input/output (I/O) devices |
US20100169673A1 (en) * | 2008-12-31 | 2010-07-01 | Ramakrishna Saripalli | Efficient remapping engine utilization |
US7756943B1 (en) * | 2006-01-26 | 2010-07-13 | Symantec Operating Corporation | Efficient data transfer between computers in a virtual NUMA system using RDMA |
US20100228943A1 (en) * | 2009-03-04 | 2010-09-09 | Freescale Semiconductor, Inc. | Access management technique for storage-efficient mapping between identifier domains |
US20100228945A1 (en) * | 2009-03-04 | 2010-09-09 | Freescale Semiconductor, Inc. | Access management technique with operation translation capability |
US20110126265A1 (en) * | 2007-02-09 | 2011-05-26 | Fullerton Mark N | Security for codes running in non-trusted domains in a processor core |
WO2011160708A1 (en) | 2010-06-23 | 2011-12-29 | International Business Machines Corporation | Multiple address spaces per adapter |
US20110320638A1 (en) * | 2010-06-23 | 2011-12-29 | International Business Machines Corporation | Enable/disable adapters of a computing environment |
US20120151471A1 (en) * | 2010-12-08 | 2012-06-14 | International Business Machines Corporation | Address translation table to enable access to virtualized functions |
US8417911B2 (en) | 2010-06-23 | 2013-04-09 | International Business Machines Corporation | Associating input/output device requests with memory associated with a logical partition |
US8416834B2 (en) | 2010-06-23 | 2013-04-09 | International Business Machines Corporation | Spread spectrum wireless communication code for data center environments |
US8458387B2 (en) | 2010-06-23 | 2013-06-04 | International Business Machines Corporation | Converting a message signaled interruption into an I/O adapter event notification to a guest operating system |
US8478922B2 (en) | 2010-06-23 | 2013-07-02 | International Business Machines Corporation | Controlling a rate at which adapter interruption requests are processed |
US8505032B2 (en) | 2010-06-23 | 2013-08-06 | International Business Machines Corporation | Operating system notification of actions to be taken responsive to adapter events |
US8504754B2 (en) | 2010-06-23 | 2013-08-06 | International Business Machines Corporation | Identification of types of sources of adapter interruptions |
US8510599B2 (en) | 2010-06-23 | 2013-08-13 | International Business Machines Corporation | Managing processing associated with hardware events |
US8549182B2 (en) | 2010-06-23 | 2013-10-01 | International Business Machines Corporation | Store/store block instructions for communicating with adapters |
US8566480B2 (en) | 2010-06-23 | 2013-10-22 | International Business Machines Corporation | Load instruction for communicating with adapters |
US8572635B2 (en) | 2010-06-23 | 2013-10-29 | International Business Machines Corporation | Converting a message signaled interruption into an I/O adapter event notification |
US8615622B2 (en) | 2010-06-23 | 2013-12-24 | International Business Machines Corporation | Non-standard I/O adapters in a standardized I/O architecture |
US8615645B2 (en) | 2010-06-23 | 2013-12-24 | International Business Machines Corporation | Controlling the selectively setting of operational parameters for an adapter |
US8621112B2 (en) | 2010-06-23 | 2013-12-31 | International Business Machines Corporation | Discovery by operating system of information relating to adapter functions accessible to the operating system |
US8626970B2 (en) | 2010-06-23 | 2014-01-07 | International Business Machines Corporation | Controlling access by a configuration to an adapter function |
US8631212B2 (en) | 2011-09-25 | 2014-01-14 | Advanced Micro Devices, Inc. | Input/output memory management unit with protection mode for preventing memory access by I/O devices |
US8631222B2 (en) | 2010-06-23 | 2014-01-14 | International Business Machines Corporation | Translation of input/output addresses to memory addresses |
US8639858B2 (en) * | 2010-06-23 | 2014-01-28 | International Business Machines Corporation | Resizing address spaces concurrent to accessing the address spaces |
US8645767B2 (en) | 2010-06-23 | 2014-02-04 | International Business Machines Corporation | Scalable I/O adapter function level error detection, isolation, and reporting |
US8645606B2 (en) | 2010-06-23 | 2014-02-04 | International Business Machines Corporation | Upbound input/output expansion request and response processing in a PCIe architecture |
US8650337B2 (en) | 2010-06-23 | 2014-02-11 | International Business Machines Corporation | Runtime determination of translation formats for adapter functions |
US8650335B2 (en) | 2010-06-23 | 2014-02-11 | International Business Machines Corporation | Measurement facility for adapter functions |
US8656228B2 (en) | 2010-06-23 | 2014-02-18 | International Business Machines Corporation | Memory error isolation and recovery in a multiprocessor computer system |
US8671287B2 (en) | 2010-06-23 | 2014-03-11 | International Business Machines Corporation | Redundant power supply configuration for a data center |
US8677180B2 (en) | 2010-06-23 | 2014-03-18 | International Business Machines Corporation | Switch failover control in a multiprocessor computer system |
US8683108B2 (en) | 2010-06-23 | 2014-03-25 | International Business Machines Corporation | Connected input/output hub management |
US20140089631A1 (en) * | 2012-09-25 | 2014-03-27 | International Business Machines Corporation | Power savings via dynamic page type selection |
US20140089621A1 (en) * | 2012-09-21 | 2014-03-27 | International Business Machines Corporation | Input/output traffic backpressure prediction |
US8745292B2 (en) | 2010-06-23 | 2014-06-03 | International Business Machines Corporation | System and method for routing I/O expansion requests and responses in a PCIE architecture |
US8918657B2 (en) | 2008-09-08 | 2014-12-23 | Virginia Tech Intellectual Properties | Systems, devices, and/or methods for managing energy usage |
US8918573B2 (en) | 2010-06-23 | 2014-12-23 | International Business Machines Corporation | Input/output (I/O) expansion response processing in a peripheral component interconnect express (PCIe) environment |
US9342352B2 (en) | 2010-06-23 | 2016-05-17 | International Business Machines Corporation | Guest access to address spaces of adapter |
US20170277530A1 (en) * | 2016-03-24 | 2017-09-28 | Intel Corporation | Technologies for securing a firmware update |
DE102007062744B4 (en) | 2006-12-27 | 2018-09-06 | Intel Corporation | Guest-to-host address translation for accessing devices on storage in a partitioned system |
US10394711B2 (en) * | 2016-11-30 | 2019-08-27 | International Business Machines Corporation | Managing lowest point of coherency (LPC) memory using a service layer adapter |
CN117331861A (en) * | 2023-11-28 | 2024-01-02 | 珠海星云智联科技有限公司 | Direct memory mapping method, device, equipment, cluster and medium |
US12072813B2 (en) * | 2021-10-22 | 2024-08-27 | Shanghai Zhaoxin Semiconductor Co., Ltd. | Method for remapping virtual address to physical address and address remapping unit |
Families Citing this family (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7707383B2 (en) | 2006-11-21 | 2010-04-27 | Intel Corporation | Address translation performance in virtualized environments |
US8161243B1 (en) | 2007-09-28 | 2012-04-17 | Intel Corporation | Address translation caching and I/O cache performance improvement in virtualized environments |
US8307180B2 (en) | 2008-02-28 | 2012-11-06 | Nokia Corporation | Extended utilization area for a memory device |
US8874824B2 (en) | 2009-06-04 | 2014-10-28 | Memory Technologies, LLC | Apparatus and method to share host system RAM with mass storage memory RAM |
US9535849B2 (en) * | 2009-07-24 | 2017-01-03 | Advanced Micro Devices, Inc. | IOMMU using two-level address translation for I/O and computation offload devices on a peripheral interconnect |
US8392628B2 (en) * | 2010-07-16 | 2013-03-05 | Hewlett-Packard Development Company, L.P. | Sharing memory spaces for access by hardware and software in a virtual machine environment |
US20120036302A1 (en) * | 2010-08-04 | 2012-02-09 | International Business Machines Corporation | Determination of one or more partitionable endpoints affected by an i/o message |
WO2013048943A1 (en) | 2011-09-30 | 2013-04-04 | Intel Corporation | Active state power management (aspm) to reduce power consumption by pci express components |
US8881145B2 (en) * | 2011-12-15 | 2014-11-04 | Industrial Technology Research Institute | System and method for generating application-level dependencies in one or more virtual machines |
US9417998B2 (en) * | 2012-01-26 | 2016-08-16 | Memory Technologies Llc | Apparatus and method to provide cache move with non-volatile mass memory system |
US9311226B2 (en) | 2012-04-20 | 2016-04-12 | Memory Technologies Llc | Managing operational state data of a memory module using host memory in association with state change |
US9256531B2 (en) | 2012-06-19 | 2016-02-09 | Samsung Electronics Co., Ltd. | Memory system and SoC including linear address remapping logic |
US9164804B2 (en) | 2012-06-20 | 2015-10-20 | Memory Technologies Llc | Virtual memory module |
US9116820B2 (en) | 2012-08-28 | 2015-08-25 | Memory Technologies Llc | Dynamic central cache memory |
CN104021127A (en) * | 2013-03-01 | 2014-09-03 | 联想(北京)有限公司 | Information processing method and electronic device |
US9754870B2 (en) * | 2013-07-10 | 2017-09-05 | Kinsus Interconnect Technology Corp. | Compound carrier board structure of flip-chip chip-scale package and manufacturing method thereof |
US9645934B2 (en) | 2013-09-13 | 2017-05-09 | Samsung Electronics Co., Ltd. | System-on-chip and address translation method thereof using a translation lookaside buffer and a prefetch buffer |
US9798567B2 (en) | 2014-11-25 | 2017-10-24 | The Research Foundation For The State University Of New York | Multi-hypervisor virtual machines |
US9563572B2 (en) | 2014-12-10 | 2017-02-07 | International Business Machines Corporation | Migrating buffer for direct memory access in a computer system |
JP6763307B2 (en) * | 2015-01-16 | 2020-09-30 | 日本電気株式会社 | Calculator, device control system and device control method |
US9720838B2 (en) | 2015-03-27 | 2017-08-01 | Intel Corporation | Shared buffered memory routing |
US9824015B2 (en) * | 2015-05-29 | 2017-11-21 | Qualcomm Incorporated | Providing memory management unit (MMU) partitioned translation caches, and related apparatuses, methods, and computer-readable media |
US10120709B2 (en) * | 2016-02-29 | 2018-11-06 | Red Hat Israel, Ltd. | Guest initiated atomic instructions for shared memory page host copy on write |
US10095620B2 (en) | 2016-06-29 | 2018-10-09 | International Business Machines Corporation | Computer system including synchronous input/output and hardware assisted purge of address translation cache entries of synchronous input/output transactions |
US11200183B2 (en) * | 2017-03-31 | 2021-12-14 | Intel Corporation | Scalable interrupt virtualization for input/output devices |
US11016798B2 (en) | 2018-06-01 | 2021-05-25 | The Research Foundation for the State University | Multi-hypervisor virtual machines that run on multiple co-located hypervisors |
CN110941565B (en) * | 2018-09-25 | 2022-04-15 | 北京算能科技有限公司 | Memory management method and device for chip storage access |
CN109947671B (en) * | 2019-03-05 | 2021-12-03 | 龙芯中科技术股份有限公司 | Address translation method and device, electronic equipment and storage medium |
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5890220A (en) * | 1991-02-05 | 1999-03-30 | Hitachi, Ltd. | Address conversion apparatus accessible to both I/O devices and processor and having a reduced number of index buffers |
US6665759B2 (en) * | 2001-03-01 | 2003-12-16 | International Business Machines Corporation | Method and apparatus to implement logical partitioning of PCI I/O slots |
US6725284B2 (en) * | 2002-04-25 | 2004-04-20 | International Business Machines Corporation | Logical partition hosted virtual input/output using shared translation control entries |
US6804741B2 (en) * | 2002-01-16 | 2004-10-12 | Hewlett-Packard Development Company, L.P. | Coherent memory mapping tables for host I/O bridge |
US6820207B2 (en) * | 2001-03-01 | 2004-11-16 | International Business Machines Corporation | Method for rebooting only a specific logical partition in a data processing system as per a request for reboot |
US6941436B2 (en) * | 2002-05-09 | 2005-09-06 | International Business Machines Corporation | Method and apparatus for managing memory blocks in a logical partitioned data processing system |
US6986006B2 (en) * | 2002-04-17 | 2006-01-10 | Microsoft Corporation | Page granular curtained memory via mapping control |
US20060069899A1 (en) * | 2004-09-30 | 2006-03-30 | Ioannis Schoinas | Performance enhancement of address translation using translation tables covering large address spaces |
US7058768B2 (en) * | 2002-04-17 | 2006-06-06 | Microsoft Corporation | Memory isolation through address translation data edit control |
US7069413B1 (en) * | 2003-01-29 | 2006-06-27 | Vmware, Inc. | Method and system for performing virtual to physical address translations in a virtual machine monitor |
US20060206687A1 (en) * | 2005-03-08 | 2006-09-14 | Microsoft Corporation | Method and system for a second level address translation in a virtual machine environment |
US20060206658A1 (en) * | 2005-03-08 | 2006-09-14 | Microsoft Corporation | Method and system for a guest physical address virtualization in a virtual machine environment |
US7117385B2 (en) * | 2003-04-21 | 2006-10-03 | International Business Machines Corporation | Method and apparatus for recovery of partitions in a logical partitioned data processing system |
US7225287B2 (en) * | 2005-06-01 | 2007-05-29 | Microsoft Corporation | Scalable DMA remapping on a computer bus |
US7308551B2 (en) * | 2005-02-25 | 2007-12-11 | International Business Machines Corporation | System and method for managing metrics table per virtual port in a logically partitioned data processing system |
US7353360B1 (en) * | 2005-04-05 | 2008-04-01 | Sun Microsystems, Inc. | Method for maximizing page locality |
US7467381B2 (en) * | 2003-12-16 | 2008-12-16 | Intel Corporation | Resource partitioning and direct access utilizing hardware support for virtualization |
Family Cites Families (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4891752A (en) * | 1987-03-03 | 1990-01-02 | Tandon Corporation | Multimode expanded memory space addressing system using independently generated DMA channel selection and DMA page address signals |
US5522075A (en) | 1991-06-28 | 1996-05-28 | Digital Equipment Corporation | Protection ring extension for computers having distinct virtual machine monitor and virtual machine address spaces |
JP3264319B2 (en) * | 1997-06-30 | 2002-03-11 | 日本電気株式会社 | Bus bridge |
US20020108025A1 (en) | 1998-10-21 | 2002-08-08 | Nicholas Shaylor | Memory management unit for java environment computers |
US7114056B2 (en) * | 1998-12-03 | 2006-09-26 | Sun Microsystems, Inc. | Local and global register partitioning in a VLIW processor |
US7117342B2 (en) * | 1998-12-03 | 2006-10-03 | Sun Microsystems, Inc. | Implicitly derived register specifiers in a processor |
US6339803B1 (en) * | 1999-02-19 | 2002-01-15 | International Business Machines Corporation | Computer program product used for exchange and transfer of data having a queuing mechanism and utilizing a queued direct input-output device |
US6549959B1 (en) * | 1999-08-30 | 2003-04-15 | Ati International Srl | Detecting modification to computer memory by a DMA device |
US20020103889A1 (en) | 2000-02-11 | 2002-08-01 | Thomas Markson | Virtual storage layer approach for dynamically associating computer storage with processing hosts |
US6678825B1 (en) * | 2000-03-31 | 2004-01-13 | Intel Corporation | Controlling access to multiple isolated memories in an isolated execution environment |
US6907600B2 (en) * | 2000-12-27 | 2005-06-14 | Intel Corporation | Virtual translation lookaside buffer |
US6839892B2 (en) | 2001-07-12 | 2005-01-04 | International Business Machines Corporation | Operating system debugger extensions for hypervisor debugging |
US7158972B2 (en) | 2001-12-11 | 2007-01-02 | Sun Microsystems, Inc. | Methods and apparatus for managing multiple user systems |
US7089377B1 (en) | 2002-09-06 | 2006-08-08 | Vmware, Inc. | Virtualization system for computers with a region-based memory architecture |
US6895491B2 (en) * | 2002-09-26 | 2005-05-17 | Hewlett-Packard Development Company, L.P. | Memory addressing for a virtual machine implementation on a computer processor supporting virtual hash-page-table searching |
US20040098544A1 (en) | 2002-11-15 | 2004-05-20 | Gaither Blaine D. | Method and apparatus for managing a memory system |
US7900017B2 (en) * | 2002-12-27 | 2011-03-01 | Intel Corporation | Mechanism for remapping post virtual machine memory pages |
US7111145B1 (en) | 2003-03-25 | 2006-09-19 | Vmware, Inc. | TLB miss fault handler and method for accessing multiple page tables |
US7103808B2 (en) * | 2003-04-10 | 2006-09-05 | International Business Machines Corporation | Apparatus for reporting and isolating errors below a host bridge |
US9020801B2 (en) * | 2003-08-11 | 2015-04-28 | Scalemp Inc. | Cluster-based operating system-agnostic virtual computing system |
US20050044301A1 (en) | 2003-08-20 | 2005-02-24 | Vasilevsky Alexander David | Method and apparatus for providing virtual computing services |
EP1678583A4 (en) | 2003-10-08 | 2008-04-30 | Unisys Corp | Virtual data center that allocates and manages system resources across multiple nodes |
US20060010276A1 (en) * | 2004-07-08 | 2006-01-12 | International Business Machines Corporation | Isolation of input/output adapter direct memory access addressing domains |
US7398427B2 (en) * | 2004-07-08 | 2008-07-08 | International Business Machines Corporation | Isolation of input/output adapter error domains |
US7266631B2 (en) * | 2004-07-29 | 2007-09-04 | International Business Machines Corporation | Isolation of input/output adapter traffic class/virtual channel and input/output ordering domains |
US7009871B1 (en) * | 2004-08-18 | 2006-03-07 | Kabushiki Kaisha Toshiba | Stable memory cell |
US7340582B2 (en) * | 2004-09-30 | 2008-03-04 | Intel Corporation | Fault processing for direct memory access address translation |
US7334107B2 (en) * | 2004-09-30 | 2008-02-19 | Intel Corporation | Caching support for direct memory access address translation |
US7444493B2 (en) * | 2004-09-30 | 2008-10-28 | Intel Corporation | Address translation for input/output devices using hierarchical translation tables |
DE602004027516D1 (en) | 2004-12-03 | 2010-07-15 | St Microelectronics Srl | A method for managing virtual machines in a physical processing machine, a corresponding processor system and computer program product therefor |
US8706942B2 (en) * | 2004-12-29 | 2014-04-22 | Intel Corporation | Direct memory access (DMA) address translation between peer-to-peer input/output (I/O) devices |
US7415035B1 (en) * | 2005-04-04 | 2008-08-19 | Sun Microsystems, Inc. | Device driver access method into a virtualized network interface |
Date | Country | Application | Publication | Status |
---|---|---|---|---|
2005-06-21 | US | US11/157,675 | US20060288130A1 | Not active (Abandoned) |
2006-06-20 | WO | PCT/US2006/024515 | WO2007002425A1 | Active (Application Filing) |
2006-06-20 | DE | DE112006001642T | DE112006001642T5 | Not active (Ceased) |
2006-06-20 | KR | KR1020077029979A | KR101060395B1 | Active (IP Right Grant) |
2006-06-20 | TW | TW095122090A | TWI363967B | Active |
2006-06-20 | GB | GB0722953A | GB2441084A | Not active (Withdrawn) |
2006-06-20 | CN | CN2006800221864A | CN101203838B | Active |
2009-12-29 | US | US12/648,461 | US7984203B2 | Not active (Expired - Fee Related) |
Cited By (88)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8850098B2 (en) | 2004-12-29 | 2014-09-30 | Intel Corporation | Direct memory access (DMA) address translation between peer input/output (I/O) devices |
US20100100649A1 (en) * | 2004-12-29 | 2010-04-22 | Rajesh Madukkarumukumana | Direct memory access (DMA) address translation between peer input/output (I/O) devices |
US7756943B1 (en) * | 2006-01-26 | 2010-07-13 | Symantec Operating Corporation | Efficient data transfer between computers in a virtual NUMA system using RDMA |
US20070226450A1 (en) * | 2006-02-07 | 2007-09-27 | International Business Machines Corporation | Method and system for unifying memory access for CPU and IO operations |
US7739474B2 (en) * | 2006-02-07 | 2010-06-15 | International Business Machines Corporation | Method and system for unifying memory access for CPU and IO operations |
US20070214339A1 (en) * | 2006-03-10 | 2007-09-13 | Microsoft Corporation | Selective address translation for a resource such as a hardware device |
US20070233455A1 (en) * | 2006-03-28 | 2007-10-04 | Zimmer Vincent J | Techniques for unified management communication for virtualization systems |
US7840398B2 (en) * | 2006-03-28 | 2010-11-23 | Intel Corporation | Techniques for unified management communication for virtualization systems |
US7685371B1 (en) * | 2006-04-19 | 2010-03-23 | Nvidia Corporation | Hierarchical flush barrier mechanism with deadlock avoidance |
US20080104320A1 (en) * | 2006-10-26 | 2008-05-01 | Via Technologies, Inc. | Chipset and northbridge with raid access |
US7805567B2 (en) * | 2006-10-26 | 2010-09-28 | Via Technologies, Inc. | Chipset and northbridge with raid access |
US20080114906A1 (en) * | 2006-11-13 | 2008-05-15 | Hummel Mark D | Efficiently Controlling Special Memory Mapped System Accesses |
US7849287B2 (en) * | 2006-11-13 | 2010-12-07 | Advanced Micro Devices, Inc. | Efficiently controlling special memory mapped system accesses |
US7873770B2 (en) * | 2006-11-13 | 2011-01-18 | Globalfoundries Inc. | Filtering and remapping interrupts |
US20080114916A1 (en) * | 2006-11-13 | 2008-05-15 | Hummel Mark D | Filtering and Remapping Interrupts |
DE102007062744B4 (en) | 2006-12-27 | 2018-09-06 | Intel Corporation | Guest-to-host address translation for accessing devices on storage in a partitioned system |
DE102007063946B4 (en) | 2006-12-27 | 2024-04-25 | Intel Corporation | Guest-host address translation for device access to storage in a partitioned system |
US8955062B2 (en) | 2007-02-09 | 2015-02-10 | Marvell World Trade Ltd. | Method and system for permitting access to resources based on instructions of a code tagged with an identifier assigned to a domain |
US8677457B2 (en) * | 2007-02-09 | 2014-03-18 | Marvell World Trade Ltd. | Security for codes running in non-trusted domains in a processor core |
US20110126265A1 (en) * | 2007-02-09 | 2011-05-26 | Fullerton Mark N | Security for codes running in non-trusted domains in a processor core |
TWI386811B (en) * | 2007-07-31 | 2013-02-21 | Intel Corp | Technology for offloading input/output virtualization operations to processors |
US8250254B2 (en) | 2007-07-31 | 2012-08-21 | Intel Corporation | Offloading input/output (I/O) virtualization operations to a processor |
US20090037614A1 (en) * | 2007-07-31 | 2009-02-05 | Ramakrishna Saripalli | Offloading input/output (I/O) virtualization operations to a processor |
US20090235249A1 (en) * | 2008-03-11 | 2009-09-17 | Yuji Kobayashi | Virtual computer system and method of controlling the same |
US8893122B2 (en) * | 2008-03-11 | 2014-11-18 | Hitachi, Ltd. | Virtual computer system and a method of controlling a virtual computer system on movement of a virtual computer |
US8918657B2 (en) | 2008-09-08 | 2014-12-23 | Virginia Tech Intellectual Properties | Systems, devices, and/or methods for managing energy usage |
GB2466711A (en) * | 2008-12-31 | 2010-07-07 | Intel Corp | Efficient guest physical address to host physical address remapping engine utilization |
US20100169673A1 (en) * | 2008-12-31 | 2010-07-01 | Ramakrishna Saripalli | Efficient remapping engine utilization |
US20100228945A1 (en) * | 2009-03-04 | 2010-09-09 | Freescale Semiconductor, Inc. | Access management technique with operation translation capability |
US20100228943A1 (en) * | 2009-03-04 | 2010-09-09 | Freescale Semiconductor, Inc. | Access management technique for storage-efficient mapping between identifier domains |
US8473644B2 (en) | 2009-03-04 | 2013-06-25 | Freescale Semiconductor, Inc. | Access management technique with operation translation capability |
US20110320638A1 (en) * | 2010-06-23 | 2011-12-29 | International Business Machines Corporation | Enable/disable adapters of a computing environment |
US8745292B2 (en) | 2010-06-23 | 2014-06-03 | International Business Machines Corporation | System and method for routing I/O expansion requests and responses in a PCIE architecture |
US8505032B2 (en) | 2010-06-23 | 2013-08-06 | International Business Machines Corporation | Operating system notification of actions to be taken responsive to adapter events |
US8504754B2 (en) | 2010-06-23 | 2013-08-06 | International Business Machines Corporation | Identification of types of sources of adapter interruptions |
US8510599B2 (en) | 2010-06-23 | 2013-08-13 | International Business Machines Corporation | Managing processing associated with hardware events |
US8549182B2 (en) | 2010-06-23 | 2013-10-01 | International Business Machines Corporation | Store/store block instructions for communicating with adapters |
US8566480B2 (en) | 2010-06-23 | 2013-10-22 | International Business Machines Corporation | Load instruction for communicating with adapters |
US8572635B2 (en) | 2010-06-23 | 2013-10-29 | International Business Machines Corporation | Converting a message signaled interruption into an I/O adapter event notification |
US8601497B2 (en) | 2010-06-23 | 2013-12-03 | International Business Machines Corporation | Converting a message signaled interruption into an I/O adapter event notification |
US8615622B2 (en) | 2010-06-23 | 2013-12-24 | International Business Machines Corporation | Non-standard I/O adapters in a standardized I/O architecture |
US8615645B2 (en) | 2010-06-23 | 2013-12-24 | International Business Machines Corporation | Controlling the selectively setting of operational parameters for an adapter |
US8621112B2 (en) | 2010-06-23 | 2013-12-31 | International Business Machines Corporation | Discovery by operating system of information relating to adapter functions accessible to the operating system |
US8626970B2 (en) | 2010-06-23 | 2014-01-07 | International Business Machines Corporation | Controlling access by a configuration to an adapter function |
WO2011160708A1 (en) | 2010-06-23 | 2011-12-29 | International Business Machines Corporation | Multiple address spaces per adapter |
US8631222B2 (en) | 2010-06-23 | 2014-01-14 | International Business Machines Corporation | Translation of input/output addresses to memory addresses |
US8635430B2 (en) | 2010-06-23 | 2014-01-21 | International Business Machines Corporation | Translation of input/output addresses to memory addresses |
US8639858B2 (en) * | 2010-06-23 | 2014-01-28 | International Business Machines Corporation | Resizing address spaces concurrent to accessing the address spaces |
US8645767B2 (en) | 2010-06-23 | 2014-02-04 | International Business Machines Corporation | Scalable I/O adapter function level error detection, isolation, and reporting |
US8645606B2 (en) | 2010-06-23 | 2014-02-04 | International Business Machines Corporation | Upbound input/output expansion request and response processing in a PCIe architecture |
US8650337B2 (en) | 2010-06-23 | 2014-02-11 | International Business Machines Corporation | Runtime determination of translation formats for adapter functions |
US8650335B2 (en) | 2010-06-23 | 2014-02-11 | International Business Machines Corporation | Measurement facility for adapter functions |
US8656228B2 (en) | 2010-06-23 | 2014-02-18 | International Business Machines Corporation | Memory error isolation and recovery in a multiprocessor computer system |
US8671287B2 (en) | 2010-06-23 | 2014-03-11 | International Business Machines Corporation | Redundant power supply configuration for a data center |
US8677180B2 (en) | 2010-06-23 | 2014-03-18 | International Business Machines Corporation | Switch failover control in a multiprocessor computer system |
US8468284B2 (en) | 2010-06-23 | 2013-06-18 | International Business Machines Corporation | Converting a message signaled interruption into an I/O adapter event notification to a guest operating system |
US8683108B2 (en) | 2010-06-23 | 2014-03-25 | International Business Machines Corporation | Connected input/output hub management |
US9626298B2 (en) | 2010-06-23 | 2017-04-18 | International Business Machines Corporation | Translation of input/output addresses to memory addresses |
US9383931B2 (en) | 2010-06-23 | 2016-07-05 | International Business Machines Corporation | Controlling the selectively setting of operational parameters for an adapter |
US9342352B2 (en) | 2010-06-23 | 2016-05-17 | International Business Machines Corporation | Guest access to address spaces of adapter |
US8700959B2 (en) | 2010-06-23 | 2014-04-15 | International Business Machines Corporation | Scalable I/O adapter function level error detection, isolation, and reporting |
US8478922B2 (en) | 2010-06-23 | 2013-07-02 | International Business Machines Corporation | Controlling a rate at which adapter interruption requests are processed |
US8769180B2 (en) | 2010-06-23 | 2014-07-01 | International Business Machines Corporation | Upbound input/output expansion request and response processing in a PCIe architecture |
US8458387B2 (en) | 2010-06-23 | 2013-06-04 | International Business Machines Corporation | Converting a message signaled interruption into an I/O adapter event notification to a guest operating system |
US8457174B2 (en) | 2010-06-23 | 2013-06-04 | International Business Machines Corporation | Spread spectrum wireless communication code for data center environments |
US8416834B2 (en) | 2010-06-23 | 2013-04-09 | International Business Machines Corporation | Spread spectrum wireless communication code for data center environments |
US8918573B2 (en) | 2010-06-23 | 2014-12-23 | International Business Machines Corporation | Input/output (I/O) expansion response processing in a peripheral component interconnect express (PCIe) environment |
US8417911B2 (en) | 2010-06-23 | 2013-04-09 | International Business Machines Corporation | Associating input/output device requests with memory associated with a logical partition |
US9134911B2 (en) | 2010-06-23 | 2015-09-15 | International Business Machines Corporation | Store peripheral component interconnect (PCI) function controls instruction |
US9298659B2 (en) | 2010-06-23 | 2016-03-29 | International Business Machines Corporation | Input/output (I/O) expansion response processing in a peripheral component interconnect express (PCIE) environment |
US9213661B2 (en) * | 2010-06-23 | 2015-12-15 | International Business Machines Corporation | Enable/disable adapters of a computing environment |
US9201830B2 (en) | 2010-06-23 | 2015-12-01 | International Business Machines Corporation | Input/output (I/O) expansion response processing in a peripheral component interconnect express (PCIe) environment |
US9195623B2 (en) | 2010-06-23 | 2015-11-24 | International Business Machines Corporation | Multiple address spaces per adapter with address translation |
US9146863B2 (en) * | 2010-12-08 | 2015-09-29 | International Business Machines Corporation | Address translation table to enable access to virtualized functions |
US20120151471A1 (en) * | 2010-12-08 | 2012-06-14 | International Business Machines Corporation | Address translation table to enable access to virtualized functions |
US8631212B2 (en) | 2011-09-25 | 2014-01-14 | Advanced Micro Devices, Inc. | Input/output memory management unit with protection mode for preventing memory access by I/O devices |
US9183041B2 (en) * | 2012-09-21 | 2015-11-10 | International Business Machines Corporation | Input/output traffic backpressure prediction |
US20140089607A1 (en) * | 2012-09-21 | 2014-03-27 | International Business Machines Corporation | Input/output traffic backpressure prediction |
US20140089621A1 (en) * | 2012-09-21 | 2014-03-27 | International Business Machines Corporation | Input/output traffic backpressure prediction |
US9183042B2 (en) * | 2012-09-21 | 2015-11-10 | International Business Machines Corporation | Input/output traffic backpressure prediction |
US10430347B2 (en) * | 2012-09-25 | 2019-10-01 | International Business Machines Corporation | Power savings via dynamic page type selection |
US20140089631A1 (en) * | 2012-09-25 | 2014-03-27 | International Business Machines Corporation | Power savings via dynamic page type selection |
US10303618B2 (en) | 2012-09-25 | 2019-05-28 | International Business Machines Corporation | Power savings via dynamic page type selection |
US20170277530A1 (en) * | 2016-03-24 | 2017-09-28 | Intel Corporation | Technologies for securing a firmware update |
US10496388B2 (en) * | 2016-03-24 | 2019-12-03 | Intel Corporation | Technologies for securing a firmware update |
US10394711B2 (en) * | 2016-11-30 | 2019-08-27 | International Business Machines Corporation | Managing lowest point of coherency (LPC) memory using a service layer adapter |
US12072813B2 (en) * | 2021-10-22 | 2024-08-27 | Shanghai Zhaoxin Semiconductor Co., Ltd. | Method for remapping virtual address to physical address and address remapping unit |
CN117331861A (en) * | 2023-11-28 | 2024-01-02 | 珠海星云智联科技有限公司 | Direct memory mapping method, device, equipment, cluster and medium |
Also Published As
Publication number | Publication date |
---|---|
WO2007002425A1 (en) | 2007-01-04 |
GB0722953D0 (en) | 2008-01-02 |
US7984203B2 (en) | 2011-07-19 |
US20100100648A1 (en) | 2010-04-22 |
DE112006001642T5 (en) | 2008-05-08 |
CN101203838A (en) | 2008-06-18 |
KR101060395B1 (en) | 2011-08-29 |
TWI363967B (en) | 2012-05-11 |
TW200712893A (en) | 2007-04-01 |
KR20080012988A (en) | 2008-02-12 |
CN101203838B (en) | 2010-06-23 |
GB2441084A (en) | 2008-02-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7984203B2 (en) | | Address window support for direct memory access translation |
US7444493B2 (en) | | Address translation for input/output devices using hierarchical translation tables |
US8843727B2 (en) | | Performance enhancement of address translation using translation tables covering large address spaces |
US7334107B2 (en) | | Caching support for direct memory access address translation |
US7340582B2 (en) | | Fault processing for direct memory access address translation |
US7613898B2 (en) | | Virtualizing an IOMMU |
EP2457166B1 (en) | | I/o memory management unit including multilevel address translation for i/o and computation offload |
EP2457165B1 (en) | | Iommu using two-level address translation for i/o and computation offload devices on a peripheral interconnect |
US7937534B2 (en) | | Performing direct cache access transactions based on a memory access data structure |
US20080162864A1 (en) | | Guest to host address translation for devices to access memory in a partitioned system |
JP2007183952A (en) | | Method by which guest is accessing memory converted device and apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: INTEL CORPORATION, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: MADUKKARUMUKUMANA, RAJESH; STEINBERG, UDO A.; BENNETT, STEVEN M.; AND OTHERS; REEL/FRAME: 016718/0107; SIGNING DATES FROM 20050610 TO 20050617 |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |