CN101216932A - Graphics processing apparatus, unit and method for performing triangle configuration and attribute configuration
- Publication number: CN101216932A (other identifiers: CNA2008100018156A, CN200810001815A)
- Authority: CN (China)
- Prior art keywords: thread, execution unit, attribute configuration, configuration, triangle configuration
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Description
Technical Field
The present invention relates to computer graphics systems and, more particularly, to systems and methods for the triangle configuration and attribute configuration stages of a graphics pipeline.
Background
As is well known, the art and science of three-dimensional ("3-D") computer graphics concerns the generation, or rendering, of two-dimensional ("2-D") images of 3-D objects for display on a display device or monitor, such as a cathode ray tube (CRT) or a liquid crystal display (LCD). An object may be a simple geometric primitive such as a point, a line segment, a triangle, or a polygon. More complex objects can be rendered on a display device by representing them as a series of connected planar polygons, for example a series of connected planar triangles. All geometric primitives can ultimately be described in terms of one vertex or a set of vertices, for example the coordinates (X, Y, Z) that define a point such as the endpoint of a line segment or a corner of a polygon.
To generate a data set for display as a 2-D projection representing a 3-D primitive on a computer monitor or other display device, the vertices of the primitive are processed through a series of operations, or processing stages, in a graphics rendering pipeline. A graphics pipeline is simply a series of processing units, or stages, in which the output of a prior stage serves as the input to a subsequent stage. In a graphics processor, for example, these stages include per-vertex operations, primitive assembly operations, pixel operations, texture assembly operations, rasterization operations, and fragment operations.
In a typical graphics display system, an image database (for example, a command list) stores descriptions of the objects in a scene. Objects are described by a number of small polygons that cover the surface of the object in the same way that small tiles cover a wall or other surface. Each polygon is described as a list of vertex coordinates (X, Y, Z in "model" coordinates), some specification of material surface properties (color, texture, shininess, and so on), and the normal vector to the surface at each vertex. For 3-D objects with complex curved surfaces, the polygons generally must be triangles or quadrilaterals, and the latter can always be decomposed into pairs of triangles.
A transformation engine transforms the object coordinates in response to the viewing angle selected by a user through user input. In addition, the user may specify the field of view, the size of the image to be produced, and the back end of the viewing volume so as to include or eliminate background as desired.
Once the viewing area has been selected, clipping logic eliminates the polygons (that is, triangles) that lie outside the viewing area and "clips" the polygons that lie partly inside and partly outside it. These clipped polygons correspond to the portion of the polygon inside the viewing area, with new edges corresponding to the edges of the viewing area. The polygon vertices are then transmitted to the next stage in coordinates corresponding to the viewing screen (X, Y coordinates), with an associated depth value (Z coordinate) for each vertex. In a typical system, a lighting model is applied next, taking the light sources into account, and the polygons with their color values are then transmitted to a rasterizer.
For each polygon, the rasterizer determines which pixel positions the polygon covers and attempts to write the associated color values and depth values (Z values) into a frame buffer. The rasterizer compares the depth value (Z) of the polygon being processed with the depth value of any pixel that may already have been written into the frame buffer. If the depth value of the new polygon pixel is smaller, indicating that it lies in front of the polygon already written into the frame buffer, its value replaces the value in the frame buffer, because the new polygon obscures the polygon previously processed and written there. This process is repeated until all polygons have been rasterized. At that point, a video controller displays the contents of the frame buffer on the display, one scan line at a time in raster order.
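The depth comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `draw_fragments`, the list-of-lists buffers, and the fragment tuples are assumptions made for the example, not structures taken from the patent.

```python
def draw_fragments(frame, depth, fragments):
    """Write each fragment only if it is nearer than what the buffers hold.

    Each fragment is a hypothetical (x, y, z, color) tuple; a smaller z
    means the fragment is closer to the viewer.
    """
    for x, y, z, color in fragments:
        if z < depth[y][x]:
            depth[y][x] = z      # remember the new nearest depth
            frame[y][x] = color  # the nearer polygon hides the earlier one


WIDTH, HEIGHT = 4, 2
frame = [[None] * WIDTH for _ in range(HEIGHT)]
depth = [[float("inf")] * WIDTH for _ in range(HEIGHT)]

# Three polygons cover the same pixel; only the nearest one survives.
draw_fragments(frame, depth, [(1, 0, 0.8, "red"),
                              (1, 0, 0.3, "blue"),
                              (1, 0, 0.5, "green")])
```

The result does not depend on the order in which the fragments arrive, which is why the polygons can be processed one at a time.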
The default methods of performing real-time rendering typically display polygons as pixels located either inside or outside the polygon. The edges that define the polygon may therefore appear jagged in a static display and appear to crawl in an animated display. The underlying problem producing this effect is called aliasing, and the methods applied to reduce or eliminate it are called anti-aliasing techniques.
Screen-based anti-aliasing methods do not require knowledge of the objects being rendered because they use only the pipeline output samples. One typical anti-aliasing method uses a linear anti-aliasing technique called Multi-Sample Anti-Aliasing (MSAA), which takes more than one sample per pixel in a single pass. The number of samples, or sub-pixels, taken per pixel is called the sampling rate and, in theory, the associated amount of memory data increases as the sampling rate increases.
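To make the relationship between the MSAA sampling rate and memory concrete, the following sketch estimates the size of a multisampled color and depth buffer. It is a simplified model under stated assumptions: the function name, the 4 bytes per color sample, and the 4 bytes per depth sample are illustrative choices, and real hardware may compress samples.

```python
def msaa_buffer_bytes(width, height, samples, bytes_color=4, bytes_depth=4):
    """Estimate storage for a multisampled color + depth buffer.

    Assumes every sample stores its own color and depth value,
    with no compression (a simplification).
    """
    return width * height * samples * (bytes_color + bytes_depth)


for rate in (1, 2, 4, 8):
    size_mib = msaa_buffer_bytes(1920, 1080, rate) / (1024 * 1024)
    print(f"{rate}x sampling: {size_mib:.1f} MiB")
```

Under this model the storage grows linearly with the sampling rate, which is the trend the paragraph above describes.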
Although the foregoing has only briefly summarized the operation of the various processing components, persons skilled in the art will recognize that the processing of graphics data is quite intensive. It is therefore desirable to improve processing, design, and manufacturing efficiency wherever possible. Fixed-function stages of the graphics pipeline, such as triangle configuration and attribute configuration, are necessary for processing geometric primitives and pixels in the pipeline. In known graphics processing units, these fixed-function stages are performed in fixed-function hardware components or dedicated hardware. The separate triangle configuration and attribute configuration units commonly used require a considerable number of gates and communication lines and carry a considerable hardware cost. In addition, altering the triangle configuration and attribute configuration stages of the graphics pipeline requires changes to these expensive hardware components. A heretofore unaddressed need therefore exists to overcome these deficiencies of the prior art.
Summary of the Invention
The present invention relates to systems and methods for implementing the triangle configuration and attribute configuration stages of a graphics pipeline. Briefly described, one embodiment of a system can be implemented as follows. The system includes at least one execution unit capable of multi-threaded operation, where the execution unit can execute at least one thread for triangle configuration operations and attribute configuration operations. The execution unit is programmable to execute at least one thread for at least one of the following: vertex shader operations, pixel shader operations, and geometry shader operations. The execution unit can further suspend the at least one thread created for the triangle configuration and attribute configuration operations. The execution unit can further output data from the programmable triangle configuration operations of the at least one thread to at least one hardware component outside the execution unit. The execution unit can further resume the suspended thread upon receiving data corresponding to the at least one thread. Finally, the execution unit can store result data from the thread in a buffer within the execution unit for use by subsequent threads created by that execution unit.
The present invention further provides a graphics processing unit that includes: at least one execution unit capable of multi-threaded operation, where the at least one execution unit can execute at least one thread for triangle configuration operations and attribute configuration operations and can execute programmable shader operations; and an execution unit pool control system that schedules and manages the at least one thread of the at least one execution unit. The execution unit pool control system can simultaneously launch the at least one thread for the triangle configuration operations, the attribute configuration operations, and a programmable shader operation.
One embodiment of a method of the invention includes the step of receiving vertex data corresponding to a geometric primitive. The embodiment further includes creating a thread within an execution unit capable of multi-threaded operation, where the execution unit can execute programmable shader operations. The embodiment further includes performing triangle configuration operations on the vertex data within the thread. Finally, the embodiment includes performing attribute configuration operations within the thread to generate pixel attributes associated with the vertex data, and terminating the thread.
The graphics processing apparatus, unit, and method for performing triangle configuration and attribute configuration described herein can eliminate at least some hardware components, which reduces the number of gates in the system and leads to a more efficient graphics pipeline. They also provide flexibility and scalability for fixing bugs, adding new features, or adjusting algorithms.
Brief Description of the Drawings
FIG. 1 is a functional flow diagram of certain components within a graphics pipeline in a computer graphics system.
FIG. 2 is a block diagram illustrating fixed-function and programmable components of a graphics system.
FIG. 3 is a functional block diagram illustrating a graphics processing unit and certain of its internal components.
FIG. 4 is a block diagram illustrating certain fixed-function and programmable components of a graphics system.
FIG. 5 is a functional block diagram illustrating a graphics processing unit and certain of its internal components.
FIG. 6 is a flowchart of a method according to an embodiment of the present disclosure.
Detailed Description
Various embodiments of the invention, as illustrated in the drawings, are described in detail below. Although the drawings depict several embodiments, there is no intent to limit the disclosure to the embodiment or embodiments disclosed herein. On the contrary, the scope of the invention covers all alternatives, modifications, and equivalents.
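As noted above, the present invention relates to systems and methods for integrating triangle configuration and attribute configuration operations into a programmable execution unit. Before discussing implementation details of the various embodiments, reference is first made to FIG. 1, a block diagram of certain components of a graphics pipeline 100 that may be used by or in embodiments of the invention. The principal components shown in FIG. 1 are a vertex shader 110, a geometry shader 120, a triangle configuration unit 130, a span and tile generator 140, an attribute configuration unit 150, a pixel shader 160, and a frame buffer 170. Persons skilled in the art know and understand the general function and operation of these components, so they need not be described in detail here. Briefly, however, graphics primitives can be defined by position data (for example, X, Y, Z, and W coordinates) together with lighting and texture information. All of this information can be passed to the vertex shader 110. As is known, the vertex shader 110 can perform various transformations on the graphics data received from the command list. In this regard, the data can be transformed from world coordinates into model view coordinates, then into projection coordinates, and finally into screen coordinates. The functional processing performed by the vertex shader 110 is known to persons skilled in the art and needs no further description here. The vertex shader 110 outputs geometric primitives to the geometry shader 120.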
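Geometry data and other graphics data generated by the geometry shader 120 are sent to the triangle configuration unit 130, which performs triangle configuration operations. The specific functions and implementation details of the triangle configuration unit 130 may differ from one embodiment to another. In general, vertex information associated with triangle primitives can be passed to the triangle configuration unit 130, and operations can be performed on the various primitives defined by the graphics data passed to it. Among other operations, certain geometric transformations can be performed within the triangle configuration unit 130.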
For a given vertex, geometry data such as x, y, z, and w information may be provided, where x, y, and z are geometric coordinates and w is the homogeneous coordinate. As persons skilled in the art know, various transformations can be made, for example from model space to world space, to eye space, to projection space, to homogeneous space, to normalized device coordinates (NDC), and finally to screen space (performed by the viewport transformation). It should be appreciated that this description omits certain components of the graphics pipeline for ease of description and clarity, although persons skilled in the art will know them. As a non-limiting example, certain stages of the rasterization pipeline within the graphics pipeline have been omitted for clarity, but persons of ordinary skill in the art will appreciate that the graphics pipeline may include other stages.
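The last two steps of the transformation chain mentioned above, from homogeneous coordinates to normalized device coordinates and then to screen space, can be sketched as follows. This is an illustrative sketch only: the function name `clip_to_screen`, the NDC range of -1 to 1, and the top-left screen origin are assumptions, and conventions differ between graphics APIs.

```python
def clip_to_screen(x, y, z, w, width, height):
    """Map a homogeneous vertex (x, y, z, w) to screen space.

    Assumes NDC x and y span [-1, 1] and that the screen origin is the
    top-left corner, so the y axis is flipped.
    """
    if w == 0:
        raise ValueError("w must be non-zero for the perspective divide")
    # Perspective divide: homogeneous coordinates -> NDC.
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w
    # Viewport transformation: NDC -> pixel coordinates.
    screen_x = (ndc_x + 1.0) * 0.5 * width
    screen_y = (1.0 - ndc_y) * 0.5 * height
    return screen_x, screen_y, ndc_z


center = clip_to_screen(0.0, 0.0, 0.5, 1.0, 640, 480)
corner = clip_to_screen(2.0, 2.0, 1.0, 2.0, 640, 480)
```

Here `center` lands in the middle of a 640 by 480 viewport and `corner` lands on its top-right corner.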
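Reference is now made to FIG. 2, a block diagram of certain components or stages of a graphics pipeline 200. The first component is a command stream processor 252, which essentially receives or reads vertices from memory 250; these vertices are used to form geometric primitives and to create work items for the pipeline. In this regard, the command stream processor 252 reads data from memory and generates from that data the triangles, lines, points, or other primitives to be introduced into the pipeline. Once assembled, this geometry information is passed to the vertex shader 254. The vertex shader 254 is drawn with rounded edges, a convention used here to denote stages of the graphics pipeline that are implemented by executing instructions in a programmable execution unit or execution unit pool (as depicted in FIG. 3). As is known, the vertex shader 254 processes vertices by performing operations such as transformation, scanning, and lighting. The vertex shader 254 then passes the data to the geometry shader 256. The geometry shader 256 receives the vertices of a complete primitive as input and can output multiple vertices that form a single topology, such as a triangle strip, a line strip, or a point list. The geometry shader 256 can also execute various algorithms, such as tessellation and shadow volume generation.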
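The geometry shader 256 outputs information to the triangle configuration unit 257 which, as is known, performs operations such as triangle trivial rejection, determinant calculation, culling, pre-attribute-configuration KLMN, edge function calculation, and guardband clipping. Persons of ordinary skill in the art understand the operations required of a triangle configuration unit, so they need no further elaboration. The triangle configuration unit 257 outputs information to the span and tile generator 258. This stage of the graphics pipeline is known in the art and need not be discussed in further detail. In summary, however, the span and tile generator 258 rejects a triangle if it does not need to be rendered to the screen. It should be appreciated that other elements of the rasterization pipeline may also operate, such as a Z test or other fixed-function elements of the graphics pipeline. For example, a Z test can be performed to determine the depth of a triangle and thereby further determine whether the triangle should be rejected as not needing to be rendered to the screen. These elements are not discussed further here because persons of ordinary skill in the art understand them.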
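Several of the triangle configuration operations named above (determinant calculation, culling, bounding box calculation, trivial rejection, and edge functions) can be sketched for a screen-space triangle as follows. This is a textbook-style illustration, not the patent's implementation: the function names, the choice to cull triangles with a negative determinant, and the dictionary layout are assumptions, and KLMN and guardband clipping are left out.

```python
def triangle_setup(v0, v1, v2, width, height, cull_negative=True):
    """Return setup data for a 2-D screen-space triangle, or None if rejected."""
    (x0, y0), (x1, y1), (x2, y2) = v0, v1, v2
    # Determinant: twice the signed area of the triangle.
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        return None                      # degenerate triangle
    if cull_negative and det < 0:
        return None                      # culled by winding order (assumed convention)
    # Bounding box clamped to the screen.
    min_x, max_x = max(min(x0, x1, x2), 0), min(max(x0, x1, x2), width - 1)
    min_y, max_y = max(min(y0, y1, y2), 0), min(max(y0, y1, y2), height - 1)
    if min_x > max_x or min_y > max_y:
        return None                      # trivial rejection: entirely off-screen

    def edge(ax, ay, bx, by):
        # Coefficients (a, b, c) of E(x, y) = a*x + b*y + c for edge A -> B.
        return (ay - by, bx - ax, ax * by - ay * bx)

    edges = [edge(x0, y0, x1, y1), edge(x1, y1, x2, y2), edge(x2, y2, x0, y0)]
    return {"det": det, "bbox": (min_x, min_y, max_x, max_y), "edges": edges}


def covers(setup, x, y):
    """A point is inside when every edge function is non-negative."""
    return all(a * x + b * y + c >= 0 for a, b, c in setup["edges"])


setup = triangle_setup((0, 0), (4, 0), (0, 4), 8, 8)
```

A downstream stage such as a span and tile generator would then only need to test the pixels inside `setup["bbox"]` against the three edge functions.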
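If a triangle processed by the triangle configuration unit 257 is not rejected by the span and tile generator 258 or by another stage of the graphics pipeline, the attribute configuration unit 259 of the graphics pipeline performs attribute configuration operations. The attribute configuration unit 259 generates a list of interpolation variables for the known and required attributes to be determined in subsequent stages of the pipeline. In addition, as is known, the attribute configuration unit 259 processes the various attributes associated with the geometric primitives being processed by the graphics pipeline.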
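Every pixel covered by a primitive output by the attribute configuration unit 259 must be processed by the pixel shader 260. As is well known, the pixel shader 260 performs interpolation and other operations that determine the pixel colors output to the frame buffer 262. The operation of the various components illustrated in FIG. 2 is well known to persons skilled in the art, so the specific implementation and internal operation of these units need not be described here.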
Reference is now made to FIG. 3, which depicts a graphics processing unit (GPU) 300 according to one embodiment. This graphics system can create programmable shaders such as geometry shaders, pixel shaders, vertex shaders, or other known shaders. The shaders are created programmatically and can be executed by at least one of a plurality of pools of programmable execution units 306 (hereinafter, execution unit pool 306). It should be appreciated that the execution unit pool 306 can include processing cores capable of multi-threaded operation. The execution unit pool 306 can therefore launch more than one thread assigned to a particular type of shader. For example, the execution unit pool 306 can launch and execute a thread for the geometry shader 310 on one set of data while launching another thread for the vertex shader 308 on another set. For an example of the structure and operation of an execution unit pool, see co-pending U.S. application Ser. No. 11/406,543, filed April 19, 2006.
To summarize that structure, each execution unit in the execution unit pool 306 can process multiple instructions within a single clock cycle. Each execution unit can therefore process multiple threads simultaneously. For example, as mentioned above, an execution unit can simultaneously process a thread for geometry shader operations and a thread for pixel shader operations. A scheduler receives incoming tasks from the multiple shader stages and assigns them to execution units able to perform the shader-related computations. Threads within the execution units of the execution unit pool 306 are scheduled individually, so a given thread may be scheduled over time to perform shader operations for different shader stages. In addition, within a given execution unit, some threads may be assigned to the tasks of one shader while other threads are assigned to the tasks of other shaders. In this way, the load can be balanced among the execution units in the system to optimize throughput. Similarly, the load can be balanced among the available threads within the execution unit pool to maximize the throughput of the system. Because prior-art graphics systems use dedicated shader hardware, they cannot apply robust, dynamic thread management of this kind, and therefore cannot achieve the flexibility and scalability of a graphics system built on this structure.
The execution unit pool control and cache subsystem 304 contains a level-2 (L2) cache for use by the execution unit pool 306 and a system (not shown) for scheduling the execution unit pool 306. In this graphics processing unit, communication between the execution unit pool 306 and components outside it passes through the execution unit pool control and cache subsystem 304; as is known, however, other lines or communication links can be connected directly to the execution unit pool to facilitate execution of the graphics pipeline. Specifically, the triangle configuration unit 314, the attribute configuration unit 316, and the span and tile generator 318 are fixed-function hardware logic components that communicate with the execution unit pool 306 through the execution unit pool control and cache subsystem 304.
As mentioned above with reference to FIG. 2, certain components of the graphics pipeline have been omitted from the drawings for clarity. Similarly, FIG. 3 omits certain components of the graphics processing unit 300 for clarity; persons of ordinary skill in the art will appreciate that other components may be needed. The operations of triangle configuration, attribute configuration, and the span and tile generator are known to persons of ordinary skill in the art and need not be discussed in further detail. In one embodiment, the triangle configuration unit 314 performs operations such as triangle trivial rejection, determinant calculation, bounding box calculation, culling, pre-attribute-configuration KLMN, edge function generation, clipping, and guardband clipping. Similarly, the attribute configuration unit 316 performs operations such as processing the attributes associated with pixels in preparation for the pixel shader and for pixel shader operations.
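Reference is now made to FIG. 4, which depicts a graphics pipeline 400 according to an embodiment of the invention. The graphics pipeline 400 depicted in FIG. 4 differs from prior-art graphics pipelines in several innovative respects. Data generally flows down the pipeline from the command stream processor 452. As mentioned above, the vertex shader 454 is drawn with rounded edges, indicating that it is a stage of the graphics pipeline implemented by executing instructions in a programmable execution unit or execution unit pool. Similarly, the geometry shader 456 is also a programmable stage of the graphics pipeline and is therefore implemented by executing instructions in a programmable execution unit or execution unit pool.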
As mentioned above, the triangle configuration stage 457 of a graphics pipeline is typically a fixed-function stage, meaning that it is not user programmable: the stage accepts data, performs predetermined operations on the data, and outputs the results. Previous implementations of the triangle configuration stage 457 typically used a hardware component separate from the programmable execution units that serve the programmable stages of the graphics pipeline 400 (such as the geometry shader 456 or the vertex shader 454). According to embodiments of the invention, the triangle configuration stage 457 can be implemented within a programmable execution unit or execution unit pool, even though it is typically not a user-programmable stage of the graphics pipeline. As mentioned above, triangle configuration operations can include triangle trivial rejection, determinant calculation, bounding box calculation, culling, pre-attribute-configuration KLMN, edge function generation, clipping, and guardband clipping.
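Similarly, according to this embodiment, the attribute configuration stage 459 can also be implemented within a programmable execution unit, even though it is typically not a user-programmable stage of the graphics pipeline 400. Attribute configuration operations can include processing the attributes associated with pixels in preparation for the pixel shader and for pixel shader operations. According to the invention, the operations of the triangle configuration stage 457 and the attribute configuration stage 459 can be implemented in software rather than in fixed-function hardware components. In other words, software that interacts with the execution unit pool can issue a set of instructions that operates on a set of data to complete the triangle configuration or attribute configuration operations.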
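As an illustration of what an attribute configuration step might compute in software, the sketch below derives the screen-space gradients of one per-vertex attribute so that a later pixel shader can evaluate the attribute at any pixel. The formulation is a common textbook one and is an assumption here; the patent does not specify the arithmetic, and the function names are hypothetical. Perspective correction is ignored.

```python
def attribute_setup(v0, v1, v2, a0, a1, a2):
    """Compute the plane a(x, y) through three (vertex, attribute) pairs."""
    (x0, y0), (x1, y1), (x2, y2) = v0, v1, v2
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("degenerate triangle has no attribute plane")
    # Change of the attribute per pixel step in x and in y.
    dadx = ((a1 - a0) * (y2 - y0) - (a2 - a0) * (y1 - y0)) / det
    dady = ((a2 - a0) * (x1 - x0) - (a1 - a0) * (x2 - x0)) / det
    return {"origin": (x0, y0), "base": a0, "dadx": dadx, "dady": dady}


def interpolate(plane, x, y):
    """Evaluate the attribute at pixel (x, y) from the setup data."""
    x0, y0 = plane["origin"]
    return plane["base"] + plane["dadx"] * (x - x0) + plane["dady"] * (y - y0)


plane = attribute_setup((0, 0), (4, 0), (0, 4), 0.0, 4.0, 8.0)
value = interpolate(plane, 1, 1)
```

Because the gradients are computed once per triangle, the per-pixel work reduces to two multiplications and two additions per attribute.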
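In FIG. 4, the span and tile generator 458 is a fixed-function hardware component rather than a stage of the graphics pipeline implemented within a programmable execution unit. Persons of ordinary skill in the art will appreciate, however, that the span and tile generator or other stages of the graphics pipeline, including but not limited to fixed-function stages of the rasterization pipeline that are not shown, could also be implemented by executing software instructions in a programmable execution unit.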
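Reference is now made to FIG. 5, which depicts a graphics processing unit 500 according to an embodiment of the invention. As mentioned above, certain components of the graphics processing unit 500 are omitted for clarity; persons of ordinary skill in the art will appreciate that other hardware and logic components not depicted may be present. The graphics processing unit 500 includes a plurality of pools of programmable execution units 506 (hereinafter, execution unit pool 506) and an execution unit pool control and cache subsystem 504. The execution unit pool control and cache subsystem 504 can control thread management for the processing cores of the execution unit pool 506 as well as communication between users of the system and other components within the graphics processing unit 500. A cache subsystem comprising one or more caches used by the execution unit pool can also reside in the execution unit pool control and cache subsystem 504. For example, the cache subsystem can be used by a vertex shader thread 508 to store data for a subsequent thread that performs triangle configuration operations, or for typical memory transfers. Alternatively, each execution unit in the execution unit pool 506 can include an execution unit buffer that stores data for use by subsequent threads executing within the same execution unit.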
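As mentioned above, user-programmable stages of the graphics pipeline (such as the geometry shader 510, vertex shader 508, or pixel shader 512) can execute within the execution unit pool 506. Because the execution unit pool 506 generally comprises processing cores capable of multi-threaded operation, the execution unit pool control and cache subsystem 504 is generally responsible for scheduling threads within the execution unit pool 506. When the execution unit pool control and cache subsystem 504 receives a request to execute a programmable shader, it instructs an execution unit in the execution unit pool 506 to create a new thread to execute the shader. The execution unit pool control and cache subsystem 504 can manage the load on the execution unit pool 506 and shift resources from one type of shader to another in order to manage the throughput of the graphics pipeline efficiently. Such thread management techniques are known and need not be discussed in further detail. For example, if the pixel shader 512 is the bottleneck limiting the throughput of the GPU 500, the execution unit pool control and cache subsystem 504 can allocate more execution unit resources to the pixel shader 512 to improve throughput.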
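According to one embodiment of the invention, when execution of the graphics pipeline requires triangle configuration 520 or attribute configuration 522 operations, additional threads can be created to perform them. In contrast to the graphics processing unit of FIG. 3, where the triangle configuration unit and attribute configuration unit are separate hardware components within the GPU, the triangle configuration 520 and attribute configuration 522 stages of this embodiment can be implemented in software executed within the execution unit pool 506. In other words, in addition to the threads that perform the programmable shader operations mentioned above, the execution unit pool 506 can perform triangle configuration and attribute configuration operations by creating threads within the execution units that are capable of performing those operations.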
The software instructions that perform the triangle configuration and attribute configuration operations can be stored in, and sourced from, the execution unit itself or the execution unit pool control and cache subsystem 504. Alternatively, they can be sourced from a software device driver or from other locations that persons of ordinary skill in the art will appreciate.
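To perform the triangle configuration 520 and attribute configuration 522 operations, threads can be created within the execution unit pool 506. The operations then execute within a thread rather than within a hardware component separate from the execution unit pool 506. Because the execution unit pool 506 is capable of multi-threaded operation, a thread that performs the triangle configuration 520 and attribute configuration 522 operations can be created while other threads simultaneously perform other shader operations, or even additional triangle configuration and attribute configuration operations.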
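In the graphics processing unit 500 of this embodiment, the span and tile generator 518 can be implemented as a hardware component external to the execution unit pool 506. As is known, after the triangle configuration 520 operations complete, at least some of the resulting data (including edge functions, calculated determinants, bounding boxes, and Z difference values) can be output to the span and tile generator 518 and possibly to other stages of the graphics pipeline that are not shown, such as a Z test. After the triangle configuration 520 operations complete, and while the span and tile generator 518 performs its operations, the thread that performed the triangle configuration 520 operations can be suspended. When the span and tile generator 518 or other graphics pipeline operations finish, the thread can be terminated if the geometric primitive being processed by the graphics pipeline has been rejected.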
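In other words, if a geometric primitive does not need to be rendered to the screen, for example because it is covered by other primitives, there may be no need to continue processing it in the graphics pipeline. If the geometric primitive is not rejected in this part of the graphics pipeline, the thread can continue by performing the attribute configuration 522 operations. As is known, attribute configuration 522 operations in a graphics pipeline can include processing a plurality of attributes corresponding to a plurality of pixels, each pixel having a portion of those attributes, before a user-programmable pixel shader 512 thread executes. Once the attribute configuration 522 operations complete within the thread, the resulting data can be stored in the level-2 (L2) cache within the execution unit pool control and cache subsystem 504 for use by subsequent threads, including pixel shader threads. Alternatively, the resulting data can be stored in a buffer within the individual execution unit and made available to the next thread created within that execution unit, if that thread needs the data. For example, after a thread that performs the triangle configuration 520 and attribute configuration 522 operations terminates, a pixel shader 512 thread corresponding to the pixel attributes processed by the attribute configuration 522 stage can be created within the execution unit, with the pixel attributes and other data needed by the pixel shader thread still resident in the buffer from the previous thread. Other embodiments can include specialized logic modules within the execution unit to enhance the performance of certain triangle configuration or attribute configuration operations. For example, dedicated logic can be incorporated into the execution unit to perform triangle configuration tasks such as trivial triangle rejection.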
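Embodiments of the invention offer advantages over graphics processing units that implement the triangle configuration and attribute configuration stages as separate hardware components. Specifically, implementing the triangle configuration 520 and attribute configuration 522 stages of the graphics pipeline in software instructions executed within the execution unit pool, rather than as triangle configuration and attribute configuration units built as hardware components separate from the execution unit pool, can reduce the gate count of the graphics processing unit 500. As is known, graphics application programming interfaces require the execution unit pool 506 so that the GPU can execute the various programmable stages of the graphics pipeline, such as the geometry shader, vertex shader, or pixel shader. Implementing at least the triangle configuration and attribute configuration stages within the execution unit pool 506 that already exists in the GPU removes at least those hardware components and so reduces the number of gates in the system. Reducing the gate count of a graphics processing unit in this way can lower the cost of designing and producing the GPU. Cost can also be lowered by eliminating the hardware lines the GPU would otherwise need to carry data to and from a triangle configuration unit or attribute configuration unit implemented as a separate hardware component. This is especially useful in low-end graphics processing units or computer systems, where cost is an important consideration in the design and manufacture of hardware components.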
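In addition, embodiments of the invention can yield a more efficient graphics pipeline because triangle configuration 520 and attribute configuration 522 execute within the execution unit pool 506, which is capable of multi-threaded operation. Efficient execution of the graphics pipeline can be achieved through thread control and scheduling in the execution unit pool. For example, if triangle configuration operations are causing a bottleneck in the graphics pipeline, more execution unit pool resources can be allocated to triangle configuration operations to relieve the bottleneck or mitigate the loss of performance. Alternatively, if another stage of the graphics pipeline, such as the pixel shader, is the cause of a bottleneck in the GPU, more execution unit pool resources can be allocated to pixel shader threads to increase the throughput of the system. Moreover, implementing the attribute configuration and triangle configuration operations in threads in the execution unit pool 506 produces a system that is less dependent on a single bottleneck point. The graphics pipeline can be made more efficient by managing the load on the execution unit pool 506 with thread management and scheduling protocols known in the art.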
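One simple way to picture this kind of rebalancing is to hand out execution unit threads in proportion to the work pending at each stage. The sketch below is purely illustrative: the proportional policy, the function name `allocate_threads`, and the stage names are assumptions, since the patent leaves the scheduling policy to known thread management techniques.

```python
def allocate_threads(pending, total_threads):
    """Split total_threads across stages in proportion to pending work.

    pending maps a stage name to its number of queued work items.
    """
    total = sum(pending.values())
    if total == 0:
        return {stage: 0 for stage in pending}
    # Integer share for every stage, rounded down.
    shares = {stage: total_threads * n // total for stage, n in pending.items()}
    # Give any leftover threads to the stages with the most pending work.
    leftover = total_threads - sum(shares.values())
    for stage in sorted(pending, key=pending.get, reverse=True)[:leftover]:
        shares[stage] += 1
    return shares


# The pixel shader is the bottleneck, so it receives most of the threads.
shares = allocate_threads({"vertex": 10, "triangle_setup": 10, "pixel": 60}, 16)
```

With dedicated triangle configuration hardware, by contrast, capacity for that stage is fixed at design time and cannot follow the workload.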
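Another advantage offered by embodiments of the invention is the flexibility and scalability that result from eliminating separate hardware components for triangle configuration and attribute configuration operations. For example, an embodiment can alter the triangle configuration 520 or attribute configuration 522 stage of the graphics processing unit by changing the software instructions that perform the triangle configuration or attribute configuration operations within the execution unit. In contrast, triangle configuration and attribute configuration hardware components separate from the execution unit pool may require new hardware components in order to alter those stages of the graphics pipeline. This flexibility can be useful for fixing bugs, adding new features, or adjusting the algorithms used to implement the triangle configuration 520 or attribute configuration 522 stage.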
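Reference is now made to FIG. 6, a flowchart of a method embodiment 600 of the invention. In step 602, vertex data representing a geometric primitive is received for processing by the triangle configuration and attribute configuration stages of the graphics pipeline. The vertex data of a geometric primitive being processed by the graphics pipeline is typically output from the geometry shader for processing by the triangle configuration stage. In step 604, a thread is created within an execution unit by software instructions, and in step 606 the thread performs the triangle configuration operations. As mentioned above, triangle configuration operations in a graphics pipeline can include, but are not limited to, triangle trivial rejection, determinant calculation, bounding box calculation, culling, pre-attribute-configuration KLMN, edge function generation, clipping, and guardband clipping.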
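In step 608, after the triangle configuration operations complete, the bounding box is output to the span and tile generator. Z difference values are also output to the Z test stages (ZL1, ZL2) of the graphics pipeline. Other elements of the graphics pipeline linked to the output of the triangle configuration stage are not discussed here but are known to persons of ordinary skill in the art. For example, the triangle configuration stage can output data to other elements of the rasterization pipeline for processing. After the triangle configuration operations complete and at least the outputs above have been produced, the thread is suspended (step 610) until data returns to the execution unit. For example, if the thread outputs data to the span and tile generator, the Z test, or other stages of the rasterization pipeline, it must wait until the operations in those stages have completed before continuing with the attribute configuration operations.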
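In step 612, if the triangle or geometric primitive has not been rejected by the span and tile generator or the Z test, the thread is resumed (step 614), and in step 616 attribute configuration operations are performed within the thread to generate the pixel attributes associated with the vertex data. A triangle or geometric primitive can be rejected if, for example, another element of the graphics pipeline (such as the Z test) determines that the triangle does not need to be output to the frame buffer in a later stage of the graphics pipeline. In that case, the attribute configuration operations are unnecessary. After the attribute configuration operations have been performed, the data from the thread is stored in step 618. As noted above for the embodiment of FIG. 5, the data from the thread can be stored in a buffer within the execution unit for use by a subsequent thread created by the execution unit. Alternatively, the data can be stored in a cache subsystem accessible to other execution units, for use by threads created in those execution units. The subsequent thread is at least one of the following: a pixel shader thread, a vertex shader thread, or a thread that can perform the triangle configuration and attribute configuration operations. In step 620, the thread is terminated, and the execution unit can then be allocated to threads dedicated to other stages of the graphics pipeline.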
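The thread life cycle of steps 604 through 620 can be mimicked with a Python generator, which can pause at a `yield` and be resumed later with a value. This is only an analogy under stated assumptions: the function names, the vertex dictionaries, the `is_visible` callback standing in for the span and tile generator or Z test, and the list standing in for the execution unit buffer are all hypothetical, and the triangle and attribute work is reduced to placeholders.

```python
def setup_thread(vertices, eu_buffer):
    """One thread: triangle configuration, suspend, attribute configuration."""
    # Step 606: triangle configuration (reduced here to a bounding box).
    xs = [v["x"] for v in vertices]
    ys = [v["y"] for v in vertices]
    bbox = (min(xs), min(ys), max(xs), max(ys))
    # Steps 608 and 610: output the data, then suspend until a verdict returns.
    accepted = yield bbox
    # Step 612: continue only if the primitive was not rejected.
    if accepted:
        # Steps 614 and 616: resume and do the attribute configuration
        # (a placeholder that just gathers the per-vertex colors).
        attributes = [v["color"] for v in vertices]
        # Step 618: store the results for a subsequent thread.
        eu_buffer.append(attributes)
    # Step 620: returning from the generator terminates the thread.


def run_primitive(vertices, is_visible, eu_buffer):
    thread = setup_thread(vertices, eu_buffer)  # step 604: create the thread
    bbox = next(thread)                         # run until the thread suspends
    verdict = is_visible(bbox)                  # fixed-function stage decides
    try:
        thread.send(verdict)                    # resume the suspended thread
    except StopIteration:
        pass                                    # the thread has terminated
    return verdict


triangle = [{"x": 0, "y": 0, "color": "r"},
            {"x": 4, "y": 0, "color": "g"},
            {"x": 0, "y": 4, "color": "b"}]
eu_buffer = []
run_primitive(triangle, lambda bbox: True, eu_buffer)   # accepted primitive
run_primitive(triangle, lambda bbox: False, eu_buffer)  # rejected primitive
```

Only the accepted primitive leaves data in `eu_buffer`; the rejected one terminates without performing the attribute step.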
Embodiments of the invention can be implemented in hardware, software, firmware, or a combination thereof. In some embodiments, the compression of color data can be implemented in software or firmware that is stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in an alternative embodiment, the triangle configuration and attribute configuration stages can be implemented with any one or a combination of the following known technologies: discrete logic circuits having logic gates for implementing logic functions on data signals, an application-specific integrated circuit (ASIC) having appropriately combined logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
As those reasonably skilled in the art of the present invention will understand, any process description or block in the flowcharts should be understood as representing a module, segment, or portion of program code that includes one or more executable instructions for implementing specific logical functions or steps in the process. Alternative implementations are included within the scope of the preferred embodiments of the present invention, in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit its scope. Any person skilled in the art may make further improvements and changes on this basis without departing from the spirit and scope of the present invention; the scope of protection of the present invention is therefore defined by the claims of this application.
A brief description of the reference numerals in the drawings follows:
110: Vertex shader
120: Geometry shader
130: Triangle configuration unit
140: Span and tile generator
150: Attribute configuration unit
160: Pixel shader
170: Frame buffer
200: Graphics pipeline
250: Memory
252: Command stream processor
254: Vertex shader
256: Geometry shader
257: Triangle configuration unit
258: Span and tile generator
259: Attribute configuration unit
260: Pixel shader
262: Frame buffer
300: Graphics processing unit (GPU)
304: Execution unit pool control and cache subsystem
306: Pool of multiple programmable execution units
310: Geometry shader
312: Pixel shader
314: Triangle configuration unit
316: Attribute configuration unit
318: Span and tile generator
400: Graphics pipeline
450: Memory
452: Command stream processor
454: Vertex shader
456: Geometry shader
457: Triangle configuration
458: Span and tile generator
460: Pixel shader
462: Frame buffer
500: Graphics processing unit
504: Execution unit pool control and cache subsystem
506: Pool of multiple programmable execution units
508: Vertex shader
510: Geometry shader
512: Pixel shader
518: Span and tile generator
520: Triangle configuration
522: Attribute configuration
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2008100018156A CN101216932B (en) | 2008-01-03 | 2008-01-03 | Graphics processing device, unit, and method for executing triangle configuration and attribute configuration |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2008100018156A CN101216932B (en) | 2008-01-03 | 2008-01-03 | Graphics processing device, unit, and method for executing triangle configuration and attribute configuration |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101216932A true CN101216932A (en) | 2008-07-09 |
CN101216932B CN101216932B (en) | 2010-08-18 |
Family
ID=39623361
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2008100018156A Active CN101216932B (en) | 2008-01-03 | 2008-01-03 | Graphics processing device, unit, and method for executing triangle configuration and attribute configuration |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101216932B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102184522A (en) * | 2010-06-17 | 2011-09-14 | 威盛电子股份有限公司 | Vertex data storage method, graphic processing unit and refiner |
CN102254297A (en) * | 2010-10-15 | 2011-11-23 | 威盛电子股份有限公司 | Multi-shader system and processing method thereof |
CN101908200B (en) * | 2009-06-05 | 2012-08-08 | 财团法人资讯工业策进会 | Drawing processing system and method with power gating function |
CN104067309A (en) * | 2011-12-28 | 2014-09-24 | 英特尔公司 | Pipelined image processing sequencer |
CN109409388A (en) * | 2018-11-07 | 2019-03-01 | 安徽师范大学 | A kind of bimodulus deep learning based on graphic primitive describes sub- building method |
CN109993760A (en) * | 2017-12-29 | 2019-07-09 | 北京京东尚科信息技术有限公司 | A kind of edge detection method and device of picture |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101908200B (en) * | 2009-06-05 | 2012-08-08 | 财团法人资讯工业策进会 | Drawing processing system and method with power gating function |
CN102184522A (en) * | 2010-06-17 | 2011-09-14 | 威盛电子股份有限公司 | Vertex data storage method, graphic processing unit and refiner |
CN102254297A (en) * | 2010-10-15 | 2011-11-23 | 威盛电子股份有限公司 | Multi-shader system and processing method thereof |
CN102254297B (en) * | 2010-10-15 | 2013-11-27 | 威盛电子股份有限公司 | Multi-shader system and its processing method |
US8681162B2 (en) | 2010-10-15 | 2014-03-25 | Via Technologies, Inc. | Systems and methods for video processing |
CN104067309A (en) * | 2011-12-28 | 2014-09-24 | 英特尔公司 | Pipelined image processing sequencer |
CN109993760A (en) * | 2017-12-29 | 2019-07-09 | 北京京东尚科信息技术有限公司 | A kind of edge detection method and device of picture |
CN109409388A (en) * | 2018-11-07 | 2019-03-01 | 安徽师范大学 | A kind of bimodulus deep learning based on graphic primitive describes sub- building method |
CN109409388B (en) * | 2018-11-07 | 2021-08-27 | 安徽师范大学 | Dual-mode deep learning descriptor construction method based on graphic primitives |
Also Published As
Publication number | Publication date |
---|---|
CN101216932B (en) | 2010-08-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |