
CN117670642B - Image rendering method, graphics processor, graphics processing system, device and equipment - Google Patents

Info

Publication number
CN117670642B
CN117670642B CN202211088150.3A CN202211088150A CN117670642B CN 117670642 B CN117670642 B CN 117670642B CN 202211088150 A CN202211088150 A CN 202211088150A CN 117670642 B CN117670642 B CN 117670642B
Authority
CN
China
Prior art keywords
state information
texture
information
video memory
sampling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211088150.3A
Other languages
Chinese (zh)
Other versions
CN117670642A (en
Inventor
姜莹
王海洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiangdixian Computing Technology Chongqing Co ltd
Original Assignee
Xiangdixian Computing Technology Chongqing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiangdixian Computing Technology Chongqing Co ltd
Priority to CN202211088150.3A
Publication of CN117670642A
Application granted
Publication of CN117670642B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/60 Memory management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/005 General purpose rendering architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/04 Texture mapping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/10 Geometric effects
    • G06T15/20 Perspective computation
    • G06T15/205 Image-based rendering
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • Computing Systems (AREA)
  • Geometry (AREA)
  • Image Generation (AREA)

Abstract

The present disclosure provides an image rendering method, a graphics processor, a graphics processing system, a device and equipment, which aim to avoid page faults and reduce the waste of video memory space. In the image rendering method, a GPU receives a drawing command submitted by a CPU. When the state information corresponding to the drawing command indicates that texture sampling does not need to be performed, the GPU performs geometric processing and attribute interpolation calculation on the object to be rendered to determine a plurality of attribute information of each fragment, stores the plurality of attribute information of each fragment in the video memory, and sends feedback information to the CPU so that the CPU loads the corresponding texture data into the video memory according to texture index information; the plurality of attribute information includes the texture index information. When the state information corresponding to the drawing command indicates that texture sampling needs to be performed, the GPU reads the texture data and the plurality of attribute information of each fragment from the video memory and inputs them into the fragment shader for processing.

Description

Image rendering method, graphics processor, graphics processing system, device and equipment
Technical Field
The present disclosure relates to the field of image rendering technologies, and in particular, to an image rendering method, a graphics processor, a graphics processing system, a device, and equipment.
Background
Texture sampling is an important step in the image rendering process. Because texture data is large, it is generally difficult to load all of it into the video memory. In the related art, in order to load the texture data actually required by the GPU into the video memory as far as possible, an application running on the CPU predicts the camera position and, according to the predicted position, loads into the video memory the texture data corresponding to objects that may need to be rendered.
However, if the camera position predicted by the application is inaccurate, then on the one hand the GPU may, when performing texture sampling, sample texture data that has not been loaded into the video memory, thereby generating a page fault. On the other hand, during rendering the GPU performs coordinate conversion on the vertices of the object to be rendered, and the application cannot know the result of that conversion. The application therefore has to load into the video memory the texture data that the GPU might use, which wastes a certain amount of video memory space.
Disclosure of Invention
The purpose of the present disclosure is to provide an image rendering method, a graphics processor, a graphics processing system, a device and equipment, which aim to avoid page faults and reduce the waste of video memory space.
According to one aspect of the present disclosure, there is provided an image rendering method, applied to a GPU, the method including:
receiving a drawing command submitted by a CPU (central processing unit), and acquiring state information corresponding to the drawing command;
in the case where the state information indicates that texture sampling does not need to be performed, performing geometric processing and attribute interpolation calculation on an object to be rendered to determine a plurality of attribute information of each fragment, storing the plurality of attribute information of each fragment into a video memory, and sending feedback information to the CPU so that the CPU loads corresponding texture data into the video memory according to texture index information included in the attribute information; wherein the plurality of attribute information includes the texture index information;
in the case where the state information indicates that texture sampling needs to be performed, reading the texture data and the plurality of attribute information of each fragment from the video memory, and inputting the read texture data and the plurality of attribute information of each fragment into a fragment shader for processing.
In one possible implementation of the present disclosure, the state information includes a sampling identifier, where the sampling identifier is used to indicate whether texture sampling needs to be performed; after the state information corresponding to the drawing command is acquired, the method further comprises the following steps:
determining, according to the sampling identifier in the state information, whether texture sampling needs to be performed.
In one possible implementation of the present disclosure, before sending the feedback information to the CPU, the method further includes:
writing the storage addresses, in the video memory, of the plurality of attribute information of each fragment into the state information;
and storing the state information containing those storage addresses into the video memory according to the target address carried by the drawing command.
In one possible implementation of the present disclosure, reading texture data and a plurality of attribute information of each fragment from a video memory includes:
Reading texture data from the video memory according to the texture video memory address in the state information; wherein, the texture video memory address is written into the state information by the CPU according to the address of the texture data in the video memory;
And reading a plurality of attribute information of each fragment from the video memory according to the attribute information storage address in the state information.
In one possible implementation of the present disclosure, a drawing command corresponds to a plurality of pieces of state information, each piece of state information corresponds to one object to be rendered and includes a sampling identifier, and the sampling identifier indicates whether texture sampling needs to be performed on the corresponding object to be rendered;
after the state information corresponding to the drawing command is acquired, the method further includes:
determining, according to the sampling identifier in each piece of state information, whether texture sampling needs to be performed on the corresponding object to be rendered;
in the case where the state information indicates that texture sampling does not need to be performed, performing geometric processing and attribute interpolation calculation on an object to be rendered to determine a plurality of attribute information of each fragment includes:
in the case where the drawing command corresponds to state information indicating that texture sampling does not need to be performed, performing geometric processing and attribute interpolation calculation on the object to be rendered corresponding to that state information to determine a plurality of attribute information of each fragment.
In one possible implementation of the present disclosure, in the case where the state information indicates that texture sampling needs to be performed, reading the texture data and the plurality of attribute information of each fragment from the video memory includes:
in the case where the drawing command corresponds to state information indicating that texture sampling needs to be performed, reading from the video memory, for the object to be rendered corresponding to that state information, the texture data corresponding to the object and the plurality of attribute information of each fragment.
In one possible implementation of the present disclosure, in the case where the state information indicates that texture sampling does not need to be performed, performing geometric processing and attribute interpolation calculation on an object to be rendered to determine a plurality of attribute information of each fragment, and storing the plurality of attribute information of each fragment into the video memory, includes:
performing geometric processing and attribute interpolation calculation on the object to be rendered to determine a plurality of attribute information of each fragment in the case where the state information indicates that texture sampling does not need to be performed;
performing depth detection on the plurality of fragments so as to retain the fragments that are not occluded;
and storing the attribute information of the non-occluded fragments into the video memory.
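The depth detection step above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each fragment is a dictionary with hypothetical keys `x`, `y`, and `depth`, and that a smaller depth value means closer to the camera; the source does not specify the data layout or the depth convention.

```python
from typing import Dict, List, Tuple


def keep_visible(fragments: List[dict]) -> List[dict]:
    """Retain, for each pixel position, only the fragment nearest the camera.

    Assumption (not from the source): smaller `depth` means closer.
    """
    nearest: Dict[Tuple[int, int], dict] = {}
    for frag in fragments:
        key = (frag["x"], frag["y"])
        # Replace the stored fragment only when the new one is closer.
        if key not in nearest or frag["depth"] < nearest[key]["depth"]:
            nearest[key] = frag
    return list(nearest.values())
```

Only the fragments returned by such a filter would then be written to video memory, which keeps the stored attribute data limited to fragments that are not occluded.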
According to another aspect of the present disclosure, there is provided a graphics processor including:
a command receiving module, configured to receive a drawing command submitted by a CPU;
a state information acquisition module, configured to acquire state information corresponding to the drawing command;
a geometric processing module, configured to perform geometric processing on an object to be rendered in the case where the state information indicates that texture sampling does not need to be performed;
an interpolation calculation module, configured to, in the case where the state information indicates that texture sampling does not need to be performed, perform attribute interpolation calculation on the object to be rendered to determine a plurality of attribute information of each fragment, store the plurality of attribute information of each fragment into a video memory, and send feedback information to the CPU so that the CPU loads corresponding texture data into the video memory according to texture index information; wherein the plurality of attribute information includes the texture index information;
and a texture sampler, configured to, in the case where the state information indicates that texture sampling needs to be performed, read the texture data and the plurality of attribute information of each fragment from the video memory, and input the read texture data and the plurality of attribute information of each fragment into a fragment shader for processing.
In one possible implementation of the present disclosure, the state information includes a sampling identifier, where the sampling identifier is used to indicate whether texture sampling needs to be performed, and the texture sampler is further configured to determine, according to the sampling identifier in the state information, whether texture sampling needs to be performed.
In one possible implementation of the present disclosure, before sending the feedback information to the CPU, the interpolation calculation module is further configured to write the storage addresses, in the video memory, of the plurality of attribute information of each fragment into the state information, and to store the state information containing those storage addresses into the video memory according to the target address carried by the drawing command.
In one possible implementation of the present disclosure, when the texture sampler reads texture data and a plurality of attribute information of each fragment from the video memory, the texture sampler is specifically configured to:
Reading texture data from the video memory according to the texture video memory address in the state information; wherein, the texture video memory address is written into the state information by the CPU according to the address of the texture data in the video memory;
And reading a plurality of attribute information of each fragment from the video memory according to the attribute information storage address in the state information.
In one possible implementation of the present disclosure, a drawing command corresponds to a plurality of pieces of state information, each piece of state information corresponds to one object to be rendered and includes a sampling identifier, and the sampling identifier indicates whether texture sampling needs to be performed on the corresponding object to be rendered;
the texture sampler is further configured to determine, according to the sampling identifier in each piece of state information, whether texture sampling needs to be performed on the corresponding object to be rendered;
when performing geometric processing on the object to be rendered, the geometric processing module is specifically configured to, in the case where the drawing command corresponds to state information indicating that texture sampling does not need to be performed, perform geometric processing on the object to be rendered corresponding to that state information;
when performing attribute interpolation calculation on the object to be rendered, the interpolation calculation module is specifically configured to, in the case where the drawing command corresponds to state information indicating that texture sampling does not need to be performed, perform attribute interpolation calculation on the object to be rendered corresponding to that state information.
In one possible implementation of the present disclosure, when reading the texture data and the plurality of attribute information of each fragment from the video memory, the texture sampler is specifically configured to, in the case where the drawing command corresponds to state information indicating that texture sampling needs to be performed, read from the video memory, for the object to be rendered corresponding to that state information, the texture data corresponding to the object and the plurality of attribute information of each fragment.
In one possible implementation of the present disclosure, the graphics processor further includes:
a depth detection module, configured to perform depth detection on the plurality of fragments so as to retain the fragments that are not occluded;
when storing the plurality of attribute information of each fragment into the video memory, the interpolation calculation module is specifically configured to store the attribute information of the non-occluded fragments into the video memory.
According to another aspect of the present disclosure, there is also provided a graphics processing system including the graphics processor of any of the above embodiments.
According to another aspect of the present disclosure, there is also provided an electronic device including the graphics processing system described above. In some use scenarios, the product form of the electronic device is a graphics card; in other use scenarios, the product form of the electronic device is a CPU motherboard.
According to another aspect of the present disclosure, there is also provided electronic equipment including the above-described electronic device. In some use scenarios, the product form of the electronic equipment is a portable electronic device, such as a smart phone, a tablet computer, or a VR device; in other use scenarios, the product form of the electronic equipment is a personal computer, a game console, or the like.
According to another aspect of the present disclosure, there is also provided an image rendering method including:
submitting a drawing command to a GPU, wherein the drawing command corresponds to first state information indicating that texture sampling does not need to be performed, so that the GPU performs geometric processing and attribute interpolation calculation on an object to be rendered to determine a plurality of attribute information of each fragment and stores the plurality of attribute information of each fragment into a video memory; wherein the plurality of attribute information includes texture index information;
in response to feedback information from the GPU, loading corresponding texture data into the video memory according to the texture index information;
and submitting a drawing command to the GPU again, wherein the drawing command corresponds to second state information indicating that texture sampling needs to be performed, so that the GPU reads the texture data and the plurality of attribute information of each fragment from the video memory and inputs the read texture data and the plurality of attribute information of each fragment into a fragment shader for processing.
In one possible implementation of the present disclosure, before submitting the drawing command to the GPU, the method further includes:
configuring the first state information, wherein the first state information contains a first sampling identifier, and the first sampling identifier is used to indicate that texture sampling does not need to be performed.
In one possible implementation of the present disclosure, before resubmitting the drawing command to the GPU, the method further includes:
replacing a first sampling identifier in the first state information with a second sampling identifier to obtain second state information, wherein the second sampling identifier is used for indicating that texture sampling needs to be performed; or configuring second state information, wherein the second state information comprises a second sampling identifier, and the second sampling identifier is used for indicating that texture sampling needs to be performed.
In one possible implementation of the present disclosure, responding to feedback information of a GPU, loading corresponding texture data to a video memory according to texture index information, includes:
Responding to the feedback information of the GPU, reading a plurality of attribute information of each fragment from the video memory according to the attribute information storage address recorded in the first state information, and loading corresponding texture data to the video memory according to the texture index information in the attribute information.
In one possible implementation of the present disclosure, before resubmitting the drawing command to the GPU, the method further includes:
And writing the address of the texture data in the video memory into the second state information.
In one possible implementation of the present disclosure, each drawing command submitted to the GPU corresponds to a plurality of pieces of state information, which include first state information and second state information; the first state information includes a first sampling identifier indicating that texture sampling does not need to be performed, the second state information includes a second sampling identifier indicating that texture sampling needs to be performed, and each piece of state information corresponds to one object to be rendered;
Before each submission of a drawing command to the GPU, the method further comprises:
replacing a first sampling identifier in the first state information corresponding to the previous drawing command with a second sampling identifier to obtain second state information;
configuring first state information for a new object to be rendered, wherein a sampling identifier in the first state information is a first sampling identifier;
and taking the obtained second state information and the configured first state information as a plurality of state information corresponding to the current drawing command.
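The way the state information rolls forward from one drawing command to the next can be sketched as follows. The `ObjectState` type, its field names, and the use of a boolean as the sampling identifier are illustrative assumptions; the source only states that first state information is converted to second state information and that new first state information is configured for the new object.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class ObjectState:
    # Hypothetical layout: one piece of state information per object.
    object_id: str
    sample_texture: bool  # False = first identifier, True = second identifier


def states_for_next_draw(previous: List[ObjectState], new_object_id: str) -> List[ObjectState]:
    """Build the state list for the current drawing command."""
    # First state information from the previous command becomes second
    # state information by swapping the sampling identifier.
    promoted = [replace(s, sample_texture=True) for s in previous if not s.sample_texture]
    # The new object starts in the first stage (no texture sampling yet).
    return promoted + [ObjectState(new_object_id, sample_texture=False)]
```

Under this sketch, each drawing command carries one object in its first stage and the previous object in its second stage, so texture loading for one object overlaps with shading of the one before it.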
Drawings
FIG. 1 is a flow chart of an image rendering method according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of an image rendering method at a first stage according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of an image rendering method at a second stage according to an embodiment of the disclosure;
FIG. 4 is a schematic diagram of a graphics processor provided in an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of an image rendering method according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of an image rendering method according to an embodiment of the present disclosure.
Detailed Description
Before describing embodiments of the present disclosure, it should be noted that: some embodiments of the disclosure are described as process flows, in which the various operational steps of the flows may be numbered sequentially, but may be performed in parallel, concurrently, or simultaneously.
The terms "first," "second," and the like may be used in embodiments of the present disclosure to describe various features, but these features should not be limited by these terms. These terms are only used to distinguish one feature from another.
The term "and/or" may be used in embodiments of the present disclosure to include any and all combinations of one or more of the associated listed features.
It will be understood that when two elements are described in a connected or communicating relationship, unless a direct connection or direct communication between the two elements is explicitly stated, connection or communication between the two elements may be understood as direct connection or communication, as well as indirect connection or communication via intermediate elements.
In order to make the technical solutions and advantages of the embodiments of the present disclosure more apparent, exemplary embodiments of the present disclosure are described in detail below in conjunction with the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present disclosure. It should be noted that, where no conflict arises, the embodiments of the present disclosure and the features of the embodiments may be combined with each other.
In the related art, in order to load the texture data actually required by the GPU into the video memory as far as possible, an application running on the CPU predicts the camera position and, according to the predicted position, loads into the video memory the texture data corresponding to objects that may need to be rendered. However, if the camera position predicted by the application is inaccurate, then on the one hand the GPU may, when performing texture sampling, sample texture data that has not been loaded into the video memory, thereby generating a page fault. On the other hand, during rendering the GPU performs coordinate conversion on the vertices of the object to be rendered, and the application cannot know the result of that conversion. The application therefore has to load into the video memory the texture data that the GPU might use, which wastes a certain amount of video memory space.
The purpose of the present disclosure is to provide an image rendering method, a graphics processor, a graphics processing system, a device and equipment, which avoid page faults and reduce the waste of video memory space.
Referring to fig. 1, fig. 1 is a flowchart of an image rendering method according to an embodiment of the disclosure, where the method is applied to a GPU. As shown in fig. 1, the method comprises the steps of:
S110: receiving a drawing command submitted by the CPU, and acquiring state information corresponding to the drawing command.
The drawing command may specifically be a draw call command. In some embodiments, the state information is configured by the CPU, and the CPU may send the state information, or the address of the state information, to the GPU when submitting the draw call command. The GPU may obtain the state information carried directly by the draw call command, or may read the corresponding state information from the video memory according to the address carried by the draw call command.
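The two ways of obtaining the state information in step S110 can be sketched as follows. The `StateInfo` and `DrawCommand` types, their field names, and the dictionary standing in for video memory are all illustrative assumptions, not structures defined by the source.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StateInfo:
    # Hypothetical layout; the source only says the state information holds
    # a sampling identifier and, later, storage addresses.
    sample_texture: bool
    attribute_address: Optional[int] = None
    texture_address: Optional[int] = None


@dataclass
class DrawCommand:
    # Either the state itself or its address in video memory is carried.
    inline_state: Optional[StateInfo] = None
    state_address: Optional[int] = None


def acquire_state(cmd: DrawCommand, video_memory: Dict[int, StateInfo]) -> StateInfo:
    """Return the state information for a drawing command (step S110)."""
    if cmd.inline_state is not None:
        return cmd.inline_state
    if cmd.state_address is None:
        raise ValueError("draw command carries neither state nor an address")
    # Read the state information from (simulated) video memory.
    return video_memory[cmd.state_address]
```

The returned `sample_texture` flag is what would select between the first-stage path (S120) and the second-stage path (S130).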
S120: in the case where the state information indicates that texture sampling does not need to be performed, performing geometric processing and attribute interpolation calculation on an object to be rendered to determine a plurality of attribute information of each fragment, storing the plurality of attribute information of each fragment into a video memory, and sending feedback information to the CPU so that the CPU loads corresponding texture data into the video memory according to texture index information included in the attribute information; wherein the plurality of attribute information includes the texture index information.
In some embodiments, the GPU may perform geometric processing on the object to be rendered through programmable shaders and/or custom hardware; the geometric processing includes, but is not limited to, vertex coordinate conversion, clipping, back-face culling, and primitive assembly. The GPU may perform attribute interpolation calculation on each fragment in a primitive through custom hardware (e.g., an interpolation calculation module) to determine a plurality of attribute information of each fragment.
For each predefined attribute, the interpolation calculation module interpolates the corresponding attribute information of each fragment from the attribute information of the primitive's vertices. For example, if the predefined attributes include a depth value, texture index information, and a normal, the interpolation calculation module interpolates a depth value for each fragment in the primitive from the depth values of the primitive's vertices, interpolates texture index information for each fragment from the texture index information of the vertices, and interpolates a normal for each fragment from the normals of the vertices.
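As a rough illustration of per-attribute interpolation, the sketch below blends the vertex attributes of a triangle using barycentric weights. The choice of barycentric weighting, the function name, and the dictionary-of-tuples layout are assumptions made for this example; the source does not state which interpolation scheme the hardware uses, and perspective correction is omitted here.

```python
from typing import Dict, Sequence, Tuple

Vec = Tuple[float, ...]


def interpolate_attributes(
    vertex_attrs: Sequence[Dict[str, Vec]],
    weights: Tuple[float, float, float],
) -> Dict[str, Vec]:
    """Blend every predefined attribute of three vertices for one fragment."""
    if len(vertex_attrs) != 3:
        raise ValueError("expected a triangle with three vertices")
    result: Dict[str, Vec] = {}
    for name in vertex_attrs[0]:
        # Group the i-th component of this attribute across the three vertices.
        per_component = zip(*(v[name] for v in vertex_attrs))
        result[name] = tuple(
            sum(w * c for w, c in zip(weights, comp)) for comp in per_component
        )
    return result
```

The same loop handles depth, texture index information (represented here as texture coordinates), and normals, since each is treated as a tuple of components.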
In some embodiments, after the GPU stores the plurality of attribute information of each fragment in the video memory, it may send feedback information to the CPU by generating an interrupt, so that the CPU loads the corresponding texture data into the video memory according to the texture index information among the attribute information of each fragment. The texture index information may be a texture virtual address, the block number of a texture block (tile), or a texture coordinate; the present disclosure does not limit the specific type of the texture index information.
In the present disclosure, the GPU clips away, through geometric processing, the parts of the object to be rendered that do not need to be displayed and culls its hidden surfaces, so the primitives obtained after geometric processing are more likely to be primitives that need to be displayed. By performing attribute interpolation calculation on each fragment in these primitives, the GPU obtains the attribute information of the fragments that are more likely to be displayed. The GPU then sends feedback information to the CPU, so that the CPU can load the corresponding texture data into the video memory according to the texture index information among the attribute information of each fragment. The CPU thus loads into the video memory the texture data corresponding to the fragments that are more likely to be displayed. On the one hand, this reduces the waste of video memory space. On the other hand, because the CPU loads the texture data according to fragment attribute information determined by the GPU rather than according to its own prediction of the camera, the GPU can avoid sampling texture data that has not been loaded into the video memory, that is, page faults are avoided.
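The CPU-side reaction to the feedback information can be sketched as follows. The example assumes the texture index information is a tile number stored under a hypothetical `texture_index` key, which is only one of the forms the source allows, and it assumes the CPU tracks which tiles are already resident.

```python
from typing import Iterable, List


def textures_to_load(fragments: Iterable[dict], resident: Iterable[int]) -> List[int]:
    """Return the texture tiles referenced by stored fragments but not yet loaded.

    `texture_index` as a tile number and the `resident` bookkeeping are
    illustrative assumptions, not details given by the source.
    """
    needed = {frag["texture_index"] for frag in fragments}
    return sorted(needed - set(resident))
```

Because the request list is derived from fragments the GPU actually produced, tiles that no fragment references are never requested, which is the source of the video memory saving described above.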
S130: in the case where the state information indicates that texture sampling needs to be performed, the texture data and the plurality of attribute information of each fragment are read from the video memory, and the read texture data and the plurality of attribute information of each fragment are input to the fragment shader for processing.
In some embodiments, in the case where the state information indicates that texture sampling needs to be performed, the GPU skips (i.e., bypasses) geometric processing and attribute interpolation calculation, reads the texture data and the plurality of attribute information of each fragment directly from the video memory, and inputs them to the fragment shader for processing, so that the fragment shader determines the color information of each fragment according to the texture data and the plurality of attribute information of that fragment.
In the present disclosure, rendering is divided into two stages. In the first stage, the state information corresponding to the drawing command indicates that texture sampling does not need to be performed, and the GPU performs geometric processing and attribute interpolation calculation on the object to be rendered to determine the plurality of attribute information of each fragment, thereby obtaining the attribute information of the fragments that are more likely to be displayed. The GPU then sends feedback information to the CPU, so that the CPU can load the corresponding texture data into the video memory according to the texture index information in the plurality of attribute information of each fragment. This reduces wasted video memory space and avoids page faults. In the second stage, the state information corresponding to the drawing command indicates that texture sampling needs to be performed. Because the GPU has already interpolated the plurality of attributes of each fragment in the first stage, in the second stage it only needs to read the texture data and the plurality of attribute information corresponding to each fragment from the video memory and send them to the fragment shader for processing. It does not need to perform geometric processing and attribute interpolation calculation again, which simplifies the flow of the GPU in the second stage.
In some embodiments, the state information includes a sampling identifier that indicates whether texture sampling needs to be performed. After the GPU acquires the state information corresponding to the drawing command, it determines whether texture sampling needs to be performed according to the sampling identifier in the state information. If texture sampling is not required, the GPU performs step S120 above; if texture sampling is required, the GPU performs step S130 above.
In a specific implementation, the CPU may configure the state information in advance and store it in the video memory. The state information includes either a first sampling identifier, which indicates that texture sampling is not required, or a second sampling identifier, which indicates that texture sampling is required. The drawing command submitted by the CPU to the GPU carries the storage address of the state information in the video memory. After receiving the drawing command, the GPU can read the state information from the video memory according to the storage address carried by the drawing command and determine whether texture sampling needs to be performed according to the sampling identifier in the state information.
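The dispatch on the sampling identifier can be sketched as follows. This is only an illustration under stated assumptions: the video memory is modeled as a Python dictionary keyed by address, and `SamplingIdentifier`, `StateInfo`, and `select_step` are hypothetical names that do not come from the disclosure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SamplingIdentifier(Enum):
    FIRST = 0   # texture sampling is not required (first stage)
    SECOND = 1  # texture sampling is required (second stage)


@dataclass
class StateInfo:
    """Hypothetical layout of the state information stored in video memory."""
    sampling_identifier: SamplingIdentifier
    texture_address: Optional[int] = None    # texture video memory address
    attribute_address: Optional[int] = None  # attribute information address


def select_step(video_memory, state_address):
    """Read the state information at the address carried by the drawing
    command and choose which step of the method the GPU performs."""
    state = video_memory[state_address]
    if state.sampling_identifier is SamplingIdentifier.FIRST:
        return "S120"  # geometric processing and attribute interpolation
    return "S130"      # read texture data and attributes, run fragment shader


video_memory = {0x1000: StateInfo(SamplingIdentifier.FIRST)}
print(select_step(video_memory, 0x1000))  # prints S120
```

Keeping the identifier inside the state information, rather than in the drawing command itself, means the same command format serves both stages; only the contents at the referenced address change.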
In some embodiments, after the GPU calculates the plurality of attribute information of each fragment by interpolation and stores it in the video memory, the GPU may write the storage address of that attribute information in the video memory into the state information, and store the state information containing the storage address in the video memory according to the target address carried by the drawing command. Thus, after receiving the feedback information, the CPU can read the new state information from the video memory according to the target address, look up the storage address of the attribute information in the new state information, and then read the plurality of attribute information of each fragment from the video memory according to that storage address. The CPU can then load the corresponding texture data into the video memory according to the texture index information in the plurality of attribute information.
In some embodiments, when the GPU reads texture data and the plurality of attribute information of each fragment from the video memory, the GPU may specifically read the texture data from the video memory according to the texture video memory address in the state information; and reading a plurality of attribute information of each fragment from the video memory according to the attribute information storage address in the state information. Wherein, the texture video memory address is written into the state information by the CPU according to the address of the texture data in the video memory.
When the CPU receives the feedback information from the GPU, it reads the new state information from the video memory according to the target address, and reads the plurality of attribute information of each fragment from the video memory according to the video memory address of the attribute information recorded in the new state information. The CPU loads the corresponding texture data into the video memory according to the texture index information in the attribute information of each fragment. The CPU takes the address of the texture data in the video memory as the texture video memory address and writes that address into the state information. Alternatively, the CPU configures new state information and writes the texture video memory address into it. The CPU then submits a drawing command to the GPU again; the state information corresponding to this drawing command includes the texture video memory address and the second sampling identifier.
After receiving the drawing command, the GPU determines from the second sampling identifier contained in the state information that texture sampling needs to be performed. In this case, the GPU reads the texture data from the video memory according to the texture video memory address in the state information, and reads the plurality of attribute information of each fragment from the video memory according to the video memory address of the attribute information recorded in the state information. The GPU submits the read texture data and attribute information to the fragment shader, so that the fragment shader determines the color information of each fragment according to the texture data and the attribute information.
Referring to fig. 2, fig. 2 is a flowchart of the first stage of an image rendering method according to an embodiment of the disclosure. As shown in fig. 2, the flow of the first stage is as follows:
S201: an application program running in the CPU allocates a virtual address of the video memory for the texture data, and temporarily does not allocate a physical address of the video memory, namely the application program does not load the texture data to the video memory; the application program configures the state information and writes the state information into the video memory.
The state information of the first stage includes at least the following information:
1) The address of the texture data in the video memory. Because the texture has not been loaded into the video memory in the first stage, this address is a virtual address.
2) A sampling identifier, i.e., the first sampling identifier or the second sampling identifier, which indicates whether texture sampling is performed and allows the GPU to identify whether texture data is to be read from the video memory. If the sampling identifier is the first sampling identifier, the GPU does not read texture data from the video memory, i.e., does not perform texture sampling. If the sampling identifier is the second sampling identifier, the GPU reads texture data from the video memory, i.e., performs texture sampling. In the first stage, the sampling identifier in the state information is the first sampling identifier, which indicates that texture sampling is not performed.
3) The video memory address of the attribute information. This field is empty, i.e., no video memory address of the attribute information is recorded in the state information, because the GPU has not yet performed attribute interpolation calculation when the application program configures the state information.
S202: when the application program submits a drawing command to the GPU, the address of the state information in the video memory is sent to the GPU.
S203: after the GPU receives the drawing command, it reads the state information from the video memory according to the address of the state information in the video memory, and determines from the first sampling identifier contained in the state information that texture sampling is not required at this stage.
S204: the GPU determines a plurality of attribute information of each fragment by performing geometric processing and attribute interpolation calculation on the object to be rendered, and stores the plurality of attribute information of each fragment in the video memory.
S205: the GPU writes the address of the attribute information in the video memory (namely the video memory address of the attribute information) into the state information so as to update the state information, and writes the updated state information into the video memory according to the target address provided by the application program. The target address may or may not be equal to the state information address carried by the drawing command. If the two addresses are equal, the updated state information is stored at the location of the original state information in the video memory. If they are not equal, the updated state information is stored at the new location corresponding to the target address rather than at the location of the original state information.
S206: the GPU sends feedback information to the CPU through an interrupt mechanism.
S207: the application program running in the CPU reads the updated state information from the video memory according to the target address.
S208: the application program reads the attribute information of each fragment from the video memory according to the video memory address of the attribute information recorded in the updated state information, and loads corresponding texture data to the video memory according to the texture index information in the attribute information.
S209: the application program records the address of the texture data in the video memory (namely the video memory physical address of the texture data) into the state information, replaces the first sampling identification in the state information with the second sampling identification, and stores the latest state information into the video memory.
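The two write-back cases of step S205 can be sketched as follows. The dictionary standing in for the video memory, the field names, and the helper names `write_back_state` and `make_memory` are assumptions made only for illustration; the addresses are arbitrary example values.

```python
import copy


def write_back_state(video_memory, state_address, target_address, attribute_address):
    """Sketch of S205: record the attribute address in the state information
    and store the updated copy at target_address, which may or may not equal
    state_address."""
    updated = copy.deepcopy(video_memory[state_address])
    updated["attribute_address"] = attribute_address
    video_memory[target_address] = updated
    return updated


def make_memory():
    # First-stage state: virtual texture address, first sampling identifier,
    # and an empty attribute address (all values are illustrative).
    return {0x100: {"sampling_identifier": "first",
                    "texture_address": 0x7000,
                    "attribute_address": None}}


# Case 1: the target address equals the state information address,
# so the original state information is overwritten in place.
same = make_memory()
write_back_state(same, 0x100, 0x100, 0x8000)
print(same[0x100]["attribute_address"] == 0x8000)      # prints True

# Case 2: the target address differs, so the original is left untouched
# and the updated copy lives at the new location.
separate = make_memory()
write_back_state(separate, 0x100, 0x200, 0x8000)
print(separate[0x100]["attribute_address"])            # prints None
print(separate[0x200]["attribute_address"] == 0x8000)  # prints True
```

In either case the application program reads the updated state information from the target address in S207, so it never needs to know which of the two cases applied.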
Referring to fig. 3, fig. 3 is a flow chart of an image rendering method in a second stage according to an embodiment of the disclosure. As shown in fig. 3, the flow of the second stage is as follows:
S301: when the application program submits a drawing command to the GPU, the address of the latest state information in the video memory is sent to the GPU.
S302: after the GPU receives the drawing command, it reads the state information from the video memory according to the address of the state information in the video memory, and determines from the second sampling identifier contained in the state information that texture sampling is required at this stage.
S303: the GPU skips the steps of geometric processing, attribute interpolation calculation and the like, and directly calls a texture sampler, and the texture sampler reads texture data from a video memory according to the address of the texture data recorded in the state information in the video memory. Or the texture sampler reads the attribute information of each fragment from the video memory according to the attribute information storage address recorded in the state information, and reads the texture data from the video memory according to the texture index information in the attribute information.
S304: the texture sampler submits the read texture data and the attribute information of each fragment to a fragment shader running in the GPU, so that the fragment shader determines the color information of each fragment according to the texture data and the attribute information of each fragment.
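The second-stage flow S301 to S304 can be sketched as follows, again modeling the video memory as a dictionary. The toy `fragment_shader`, the `brightness` attribute, and all field names are hypothetical and serve only to make the data flow concrete; a real fragment shader is a compiled GPU program.

```python
def fragment_shader(texel, attributes):
    """Toy shader (assumption): scale the texel by a per-fragment brightness."""
    return tuple(round(channel * attributes["brightness"]) for channel in texel)


def second_stage(video_memory, state_address):
    """Sketch of S302 to S304: no geometric processing or interpolation is
    performed; everything is read back from video memory."""
    state = video_memory[state_address]
    if state["sampling_identifier"] != "second":
        raise ValueError("state information does not request texture sampling")
    texture = video_memory[state["texture_address"]]      # texture data
    fragments = video_memory[state["attribute_address"]]  # stored attributes
    return {(f["x"], f["y"]): fragment_shader(texture[f["texel"]], f)
            for f in fragments}


video_memory = {
    0x100: {"sampling_identifier": "second",
            "texture_address": 0x9000,
            "attribute_address": 0x8000},
    0x9000: {0: (200, 100, 0), 1: (0, 255, 0)},
    0x8000: [{"x": 0, "y": 0, "texel": 0, "brightness": 0.5},
             {"x": 1, "y": 0, "texel": 1, "brightness": 1.0}],
}
print(second_stage(video_memory, 0x100))
# prints {(0, 0): (100, 50, 0), (1, 0): (0, 255, 0)}
```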
In some embodiments, when performing attribute interpolation calculation for each fragment, the GPU first interpolates the texture index information of each fragment from the texture index information of the primitive vertices. After calculating the texture index information of each fragment, the GPU stores it at a first address in the video memory, records the first address in the state information, and then sends the feedback information to the CPU. After sending the feedback information, the GPU performs attribute interpolation calculation on the other attributes of each fragment (such as the depth value, normal, and tangent) and stores the interpolated results at a second address in the video memory. After receiving the feedback information, the CPU reads the texture index information of each fragment from the video memory according to the first address recorded in the state information, and loads the corresponding texture data into the video memory according to the texture index information. After loading the texture data, the CPU submits the drawing command to the GPU again. In this embodiment, the GPU interpolates the remaining attribute information while the CPU loads the texture data, which helps improve image rendering efficiency.
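The overlap described in this embodiment can be sketched with a worker thread standing in for the CPU. This is a simplified model under several assumptions: attributes are interpolated linearly between two vertices rather than across a triangle, the texture index is an integer block number, and the names `interpolate`, `cpu_load_textures`, and `first_stage_overlapped` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def interpolate(v0, v1, t):
    """Linear interpolation between two vertex values (simplified model)."""
    return v0 + (v1 - v0) * t


def cpu_load_textures(texture_indices):
    """Stand-in for the CPU loading texture data into video memory."""
    return {index: f"texture-{index}" for index in sorted(set(texture_indices))}


def first_stage_overlapped(vertex_attrs, ts):
    # Step 1: interpolate only the texture index information first.
    i0, i1 = vertex_attrs["texture_index"]
    texture_indices = [round(interpolate(i0, i1, t)) for t in ts]
    with ThreadPoolExecutor(max_workers=1) as cpu:
        # Step 2: "feedback" to the CPU, which starts loading textures.
        loading = cpu.submit(cpu_load_textures, texture_indices)
        # Step 3: meanwhile, the GPU interpolates the remaining attributes.
        d0, d1 = vertex_attrs["depth"]
        depths = [interpolate(d0, d1, t) for t in ts]
        textures = loading.result()  # wait for the CPU before stage two
    return texture_indices, depths, textures


indices, depths, textures = first_stage_overlapped(
    {"texture_index": (0, 4), "depth": (0.0, 1.0)}, ts=[0.0, 0.5, 1.0])
print(indices)  # prints [0, 2, 4]
print(depths)   # prints [0.0, 0.5, 1.0]
```

The design point is the ordering: the texture index is the only attribute the CPU needs, so producing it first lets texture loading start before the rest of the interpolation work has finished.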
In some embodiments, the drawing command corresponds to a plurality of pieces of state information, each of which corresponds to one object to be rendered. Each piece of state information includes a sampling identifier that indicates whether texture sampling needs to be performed on the corresponding object to be rendered. After the GPU receives the drawing command submitted by the CPU, it may determine, according to the sampling identifier in each piece of state information, whether texture sampling needs to be performed on the corresponding object to be rendered.
The object to be rendered may be a model or a primitive to be rendered. A model to be rendered is a model that needs to be displayed under the current view angle. Taking a game rendering scene as an example, if the game protagonist, an opponent character, a referee character, and an arena need to be rendered under the current view angle, then the game protagonist, the opponent character, the referee character, and the arena are the models to be rendered.
For example, in some cases, each frame of image includes a plurality of models that are rendered in sequence according to a preset rendering order, and each drawing command corresponds to the second-stage state information of the previously rendered model and the first-stage state information of the currently rendered model. For ease of understanding, assume that the current frame includes three models A, B, and C. The first drawing command submitted by the CPU corresponds to the first-stage state information of model A; after receiving it, the GPU determines from that state information that texture sampling does not need to be performed on model A. The second drawing command corresponds to the second-stage state information of model A and the first-stage state information of model B; after receiving it, the GPU determines that texture sampling needs to be performed on model A and does not need to be performed on model B. The third drawing command corresponds to the second-stage state information of model B and the first-stage state information of model C; after receiving it, the GPU determines that texture sampling needs to be performed on model B and does not need to be performed on model C. The fourth drawing command corresponds to the second-stage state information of model C; after receiving it, the GPU determines that texture sampling needs to be performed on model C.
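The pairing of state information in this example can be generated mechanically. The sketch below, with the hypothetical helper name `build_draw_commands`, lists for each drawing command which model and stage each piece of state information refers to; it models only the schedule, not the rendering itself.

```python
def build_draw_commands(models):
    """Pair the second stage of the previous model with the first stage of
    the current model, as in the A, B, C example."""
    commands = []
    previous = None
    for model in models:
        command = []
        if previous is not None:
            command.append((previous, "second stage"))
        command.append((model, "first stage"))
        commands.append(command)
        previous = model
    if previous is not None:
        # The last model still needs a command for its second stage.
        commands.append([(previous, "second stage")])
    return commands


for number, command in enumerate(build_draw_commands(["A", "B", "C"]), start=1):
    print(number, command)
# prints:
# 1 [('A', 'first stage')]
# 2 [('A', 'second stage'), ('B', 'first stage')]
# 3 [('B', 'second stage'), ('C', 'first stage')]
# 4 [('C', 'second stage')]
```

With N models this schedule issues N + 1 drawing commands, instead of the 2N that would be needed if each stage of each model were submitted separately.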
Alternatively, in other cases, one drawing command is used to render a plurality of primitives, each of which corresponds to its own piece of state information. Each piece of state information contains a sampling identifier that indicates whether texture sampling needs to be performed on the corresponding primitive.
In the case where the drawing command corresponds to a plurality of pieces of state information, when the GPU executes step S120 above, specifically, for each piece of state information indicating that texture sampling does not need to be performed, the GPU performs geometric processing and attribute interpolation calculation on the object to be rendered corresponding to that state information, so as to determine the plurality of attribute information of each fragment.
Similarly, when the GPU executes step S130 above, for each piece of state information indicating that texture sampling needs to be performed, the GPU reads from the video memory the texture data corresponding to the object to be rendered that corresponds to that state information, together with the plurality of attribute information of each fragment of that object.
In some embodiments, the GPU may specifically perform the following sub-steps when performing step S120:
S120-1: in the case where the state information indicates that texture sampling does not need to be performed, geometric processing and attribute interpolation computation are performed on the object to be rendered to determine a plurality of attribute information for each fragment.
S120-2: depth detection is performed on the plurality of fragments to retain the fragments that are not occluded.
S120-3: the plurality of attribute information of the non-occluded fragments is stored in the video memory.
In sub-step S120-2, depth detection may be performed on the plurality of fragments based on the Early-Z depth test method, so as to filter out the occluded fragments and retain only the non-occluded ones. In this embodiment, the occluded fragments are filtered out by depth detection before the CPU loads the texture data, so the CPU does not load the texture data corresponding to occluded fragments into the video memory, which further reduces wasted video memory space.
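The filtering effect of sub-step S120-2 can be sketched as follows. The sketch assumes opaque fragments and that a smaller depth value is closer to the camera; real Early-Z hardware tests each fragment against a depth buffer as it arrives, whereas this version simply keeps the nearest fragment per pixel. The function name `keep_unoccluded` and the field names are hypothetical.

```python
def keep_unoccluded(fragments):
    """Keep, for each pixel, only the fragment nearest to the camera
    (assumption: smaller depth means closer, all fragments are opaque)."""
    nearest = {}
    for fragment in fragments:
        pixel = (fragment["x"], fragment["y"])
        if pixel not in nearest or fragment["depth"] < nearest[pixel]["depth"]:
            nearest[pixel] = fragment
    return list(nearest.values())


fragments = [
    {"x": 0, "y": 0, "depth": 0.8, "texture_block": 1},  # occluded
    {"x": 0, "y": 0, "depth": 0.3, "texture_block": 2},  # visible
    {"x": 1, "y": 0, "depth": 0.5, "texture_block": 3},  # visible
]
visible = keep_unoccluded(fragments)
print([f["texture_block"] for f in visible])  # prints [2, 3]
```

In this example texture block 1 is referenced only by an occluded fragment, so it never appears in the stored attribute information and the CPU never loads it.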
Referring to fig. 4, fig. 4 is a schematic structural diagram of a graphics processor according to an embodiment of the present disclosure. The graphics processor shown in fig. 4 and the image rendering method shown in fig. 1 are based on the same inventive concept, and in order to avoid repetition, the graphics processor is only briefly described below. For a specific implementation of the graphics processor, reference may be made to the corresponding specific implementation of the image rendering method. As shown in fig. 4, the graphics processor includes:
The command receiving module 410 is configured to receive a drawing command submitted by the CPU.
The status information obtaining module 420 is configured to obtain status information corresponding to the drawing command.
The geometry processing module 430 is configured to perform geometry processing on the object to be rendered in a case where the state information indicates that texture sampling is not required to be performed.
The interpolation calculation module 440 is configured to, when the state information indicates that texture sampling is not required, perform attribute interpolation calculation on the object to be rendered to determine a plurality of attribute information of each fragment, store the plurality of attribute information of each fragment in the video memory, and send feedback information to the CPU, so that the CPU loads the corresponding texture data into the video memory according to the texture index information; wherein the plurality of attribute information includes the texture index information.
The texture sampler 450 is configured to, when the state information indicates that texture sampling needs to be performed, read the texture data and the plurality of attribute information of each fragment from the video memory and input the read texture data and the plurality of attribute information of each fragment to the fragment shader for processing.
Optionally, the state information includes a sampling identifier, where the sampling identifier is used to characterize whether texture sampling needs to be performed, and the texture sampler is further used to determine whether texture sampling needs to be performed according to the sampling identifier in the state information.
Optionally, before sending the feedback information to the CPU, the interpolation calculation module is further configured to write the storage address of the plurality of attribute information of each fragment in the video memory into the state information, and store the state information containing the storage address in the video memory according to the target address carried by the drawing command.
Optionally, when reading the texture data and the plurality of attribute information of each fragment from the video memory, the texture sampler is specifically configured to: read the texture data from the video memory according to the texture video memory address in the state information, wherein the texture video memory address is written into the state information by the CPU according to the address of the texture data in the video memory; and read the plurality of attribute information of each fragment from the video memory according to the attribute information storage address in the state information.
Optionally, the drawing command corresponds to a plurality of pieces of state information, each of which corresponds to one object to be rendered and contains a sampling identifier that indicates whether texture sampling needs to be performed on the corresponding object to be rendered.
The texture sampler is further configured to determine, according to the sampling identifier in each state information, whether texture sampling needs to be performed on the corresponding object to be rendered.
When performing geometric processing on the object to be rendered, the geometry processing module is specifically configured to, for each piece of state information that corresponds to the drawing command and indicates that texture sampling does not need to be performed, perform geometric processing on the object to be rendered corresponding to that state information.
When performing attribute interpolation calculation on the object to be rendered, the interpolation calculation module is specifically configured to, for each piece of state information that corresponds to the drawing command and indicates that texture sampling does not need to be performed, perform attribute interpolation calculation on the object to be rendered corresponding to that state information.
Optionally, when reading the texture data and the plurality of attribute information of each fragment from the video memory, the texture sampler is specifically configured to, for each piece of state information that corresponds to the drawing command and indicates that texture sampling needs to be performed, read from the video memory the texture data corresponding to the object to be rendered that corresponds to that state information, together with the plurality of attribute information of each fragment of that object.
The graphics processor further includes a depth detection module, configured to perform depth detection on the plurality of fragments so as to retain the fragments that are not occluded.
The interpolation calculation module is specifically configured to store the attribute information of the non-occluded fragment into the video memory when storing the attribute information of each fragment into the video memory.
Based on the same inventive concept, the embodiments of the present disclosure also provide a graphics processing system, which may be a single die, an SOC (System on Chip) in which multiple dies are interconnected, or another form of organization.
The architecture and the working principle of the graphics processing system provided in the present disclosure are described below by taking one die as an example.
In one embodiment shown in FIG. 5, a single die graphics processing system includes multiple GPU cores (i.e., the graphics processor described in any of the embodiments above).
Each GPU core is used for processing drawing commands and executing the image rendering pipeline according to the drawing commands, and can also be used for executing other operation commands; the multiple GPU cores as a whole perform drawing or other computing tasks. Each GPU core further includes: a computing unit, which is a programmable module consisting of a large number of ALUs and is used for executing instructions compiled from shaders; a cache, used for caching GPU core data to reduce accesses to memory; a rasterization module, which is a fixed stage of the 3D rendering pipeline; a tiling module, used for dividing a frame into tiles in TBR and TBDR GPU architectures; a geometry processing module, which is a fixed stage of the 3D rendering pipeline and is used for performing coordinate transformation on vertex data and removing primitives that are outside the viewing range or are back-facing and therefore not displayed; a texture sampler, used for reading texture data from the video memory and sending the read texture data to the fragment shader for processing; a post-processing module, used for performing operations such as scaling, cropping, and rotating on the drawn image; and a microcore, used for scheduling among the pipeline hardware modules on a GPU core or for task scheduling among multiple GPU cores.
As shown in fig. 5, the graphics processing system may further include:
The network on chip is used for data exchange among all IP cores on the graphics processing system;
A universal DMA (Direct Memory Access), used for moving data between the host side and the graphics processing system memory (e.g., graphics card memory), for example moving vertex data of a 3D drawing from the host side to the graphics processing system memory via DMA;
The PCIe controller is used for realizing PCIe protocol through the interface communicated with the host, so that the graphics processing system is connected to the host through the PCIe interface, and programs such as a graphics API, a driver of a display card and the like are run on the host;
The application processor is used for scheduling the tasks of each module on the graphics processing system; for example, the GPU notifies the application processor after rendering a frame of image, and the application processor then starts the display controller to display the image drawn by the GPU on the screen;
The memory controller is used for connecting the system memory and storing data on the SOC;
A display controller, used for controlling the frame buffer in the system memory to be output to the display through a display interface (HDMI, DP, etc.);
Video decoding, which can decode encoded video on the host hard disk into pictures that can be displayed;
Video encoding, which can encode an original video stream on the host hard disk into a specified format and return it to the host.
Based on the graphics processing system shown in fig. 5, in one embodiment, the GPU receives a drawing command submitted by the CPU through the PCIe interface and obtains the state information corresponding to the drawing command from the video memory. In the case where the state information indicates that texture sampling is not required, the geometry processing module performs geometric processing on the object to be rendered, including vertex coordinate transformation, model clipping, back-face culling, primitive assembly, and the like. Then, for each primitive, the interpolation calculation module in the rasterization module performs attribute interpolation calculation on each fragment in the primitive, where the attributes of each fragment include the depth value, texture index information, normal, tangent, and the like. The interpolation calculation module stores the attribute information of each fragment in the video memory and records the video memory address of the attribute information in the state information. The GPU gives feedback to the CPU through an interrupt mechanism, so that the CPU obtains the video memory address of the attribute information of each fragment from the state information, reads the texture index information from the video memory according to that address, and loads the texture data to be sampled into the video memory according to the texture index information. The CPU also writes the texture video memory address of the texture data into the state information, and then sends the drawing command to the GPU again.
The GPU receives a drawing command submitted by the CPU based on the PCIe interface, and acquires state information corresponding to the drawing command from the video memory. In the case that the state information indicates that texture sampling is required, the geometry processing module does not perform geometry processing operation, the interpolation calculation module does not perform attribute information interpolation calculation operation, texture data is directly read from a video memory by a texture sampler according to a texture video memory address in the state information, and the texture sampler also reads a plurality of attribute information of each fragment from the video memory according to an attribute information storage address in the state information. The texture sampler sends the read texture data and attribute information to a fragment shader for processing, thereby determining the color of each fragment.
It should be noted that, the implementation of the graphics processing system provided in the embodiments of the present disclosure is described above by taking the specific structure shown in fig. 5 as an example. In practical applications, the specific implementation of the graphics processor that may be included in the graphics processing system may refer to any of the foregoing embodiments, which are not described herein again.
The embodiment of the disclosure also provides an electronic device, which comprises the graphics processing system. In some use cases, the product form of the electronic device is embodied as a graphics card; in other use scenarios, the product form of the electronic device is embodied as a CPU motherboard.
The embodiment of the disclosure also provides electronic equipment, which comprises the electronic device. In some use scenarios, the product form of the electronic equipment is a portable electronic device, such as a smart phone, a tablet computer, a VR device, etc.; in other use scenarios, the product form of the electronic equipment is a personal computer, game console, workstation, server, etc.
Referring to fig. 6, fig. 6 is a schematic diagram of an image rendering method provided by an embodiment of the present disclosure, which may be performed by a CPU or an application running on the CPU. The image rendering method shown in fig. 6 is based on the same inventive concept as the image rendering method shown in fig. 1, and the image rendering method shown in fig. 6 is briefly described below in order to avoid repetition. For a specific implementation of the image rendering method, reference may be made to a corresponding specific implementation of the image rendering method in fig. 1. As shown in fig. 6, the image rendering method includes:
S610: Submit a drawing command to the GPU, wherein the drawing command corresponds to first state information indicating that texture sampling does not need to be performed, so that the GPU performs geometry processing and attribute interpolation calculation on the object to be rendered to determine a plurality of attribute information of each fragment, and stores the plurality of attribute information of each fragment into the video memory; wherein the plurality of attribute information includes texture index information.
S620: In response to the feedback information of the GPU, load the corresponding texture data into the video memory according to the texture index information.
S630: Submit a drawing command to the GPU again, wherein the drawing command corresponds to second state information indicating that texture sampling needs to be performed, so that the GPU reads the texture data and the plurality of attribute information of each fragment from the video memory, and inputs the read texture data and the plurality of attribute information of each fragment into a fragment shader for processing.
Optionally, before submitting the drawing command to the GPU, the method further comprises: configuring the first state information, wherein the first state information contains a first sampling identifier, and the first sampling identifier is used to indicate that texture sampling does not need to be performed.
Optionally, before resubmitting the drawing command to the GPU, the method further comprises: replacing a first sampling identifier in the first state information with a second sampling identifier to obtain second state information, wherein the second sampling identifier is used for indicating that texture sampling needs to be performed; or configuring second state information, wherein the second state information comprises a second sampling identifier, and the second sampling identifier is used for indicating that texture sampling needs to be performed.
Optionally, when executing step S620, in response to the feedback information of the GPU, the plurality of attribute information of each fragment may be read from the video memory according to the attribute information storage address recorded in the first state information, and the corresponding texture data may be loaded into the video memory according to the texture index information in the plurality of attribute information.
Optionally, before resubmitting the drawing command to the GPU, the method further comprises: writing the address of the texture data in the video memory into the second state information.
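Steps S610 to S630, together with the optional refinements above, can be sketched from the CPU side as follows. This is a minimal illustration under assumptions, not the method as actually implemented: `FakeGpu` is a hypothetical stand-in rather than a real driver API, the addresses `0x1000` and `0x2000`, the flag values, and the `texture_library` dictionary are invented for the example, and the second state information is obtained by replacing the sampling identifier in the first state information (one of the two options described).

```python
from dataclasses import dataclass
from typing import Optional

FIRST_FLAG = 0   # assumed value: texture sampling not required
SECOND_FLAG = 1  # assumed value: texture sampling required


@dataclass
class StateInfo:
    sampling_flag: int
    attr_addr: Optional[int] = None
    texture_addr: Optional[int] = None


class FakeGpu:
    """Hypothetical GPU stand-in that owns a dict acting as video memory."""

    def __init__(self, fragments):
        self.vram = {}
        self._fragments = fragments  # pretend result of geometry + interpolation
        self.shaded = None

    def submit(self, state):
        if state.sampling_flag == FIRST_FLAG:
            state.attr_addr = 0x1000
            self.vram[state.attr_addr] = self._fragments
            return "feedback"
        texture = self.vram[state.texture_addr]
        frags = self.vram[state.attr_addr]
        self.shaded = [texture[f["texture_index"]] for f in frags]
        return "done"


def render(gpu, texture_library):
    # Configure the first state information and submit (S610).
    state = StateInfo(sampling_flag=FIRST_FLAG)
    if gpu.submit(state) != "feedback":
        raise RuntimeError("expected feedback after the first submission")
    # S620: read the stored attributes and load only the textures they index.
    fragments = gpu.vram[state.attr_addr]
    needed = {f["texture_index"] for f in fragments}
    texture_addr = 0x2000
    gpu.vram[texture_addr] = {i: texture_library[i] for i in needed}
    # Write the texture address and replace the sampling identifier, then resubmit (S630).
    state.texture_addr = texture_addr
    state.sampling_flag = SECOND_FLAG
    gpu.submit(state)
    return gpu.shaded
```

With a library of three textures and fragments that index only two of them, only those two entries are copied into the simulated video memory, which is the effect the two-submission scheme is aiming for.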
Optionally, each drawing command submitted to the GPU corresponds to a plurality of state information, where the plurality of state information includes first state information and second state information, the first state information includes a first sampling identifier for indicating that texture sampling does not need to be performed, the second state information includes a second sampling identifier for indicating that texture sampling needs to be performed, and each state information corresponds to one object to be rendered.
Before each submission of a drawing command to the GPU, the method further comprises: replacing a first sampling identifier in the first state information corresponding to the previous drawing command with a second sampling identifier to obtain second state information; configuring first state information for a new object to be rendered, wherein a sampling identifier in the first state information is a first sampling identifier; and taking the obtained second state information and the configured first state information as a plurality of state information corresponding to the current drawing command.
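The per-submission bookkeeping described above can be sketched as a small function. This is an illustrative sketch only: the dictionary layout, the flag values, and the name `next_state_list` are assumptions, and the choice to drop state information that already carried the second sampling identifier (on the reading that its object was shaded by the previous command) is an interpretation rather than something the text states.

```python
FIRST_FLAG = 0   # assumed value: texture sampling not required
SECOND_FLAG = 1  # assumed value: texture sampling required


def next_state_list(previous_states, new_object):
    """Build the state information list for the current drawing command."""
    # Objects that went through the first pass in the previous command get
    # their sampling identifier replaced, yielding second state information.
    promoted = [
        {**s, "sampling_flag": SECOND_FLAG}
        for s in previous_states
        if s["sampling_flag"] == FIRST_FLAG
    ]
    # A new object to be rendered starts with the first sampling identifier.
    fresh = {"object": new_object, "sampling_flag": FIRST_FLAG}
    return promoted + [fresh]
```

Calling this once per submission means each drawing command carries one object awaiting texture sampling and one object entering its first pass, so the texture loading for one object overlaps with the geometry work for the next.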
While the preferred embodiments of the present disclosure have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the disclosure.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present disclosure without departing from the spirit or scope of the disclosure. Thus, the present disclosure is intended to include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (23)

1. An image rendering method, applied to a GPU, the method comprising:
receiving a drawing command submitted by a CPU, and obtaining state information corresponding to the drawing command;
when the state information indicates that texture sampling does not need to be performed, performing geometry processing and attribute interpolation calculation on an object to be rendered to determine a plurality of attribute information of each fragment, storing the plurality of attribute information of each fragment in a video memory, and sending feedback information to the CPU so that the CPU loads corresponding texture data into the video memory according to texture index information included in the attribute information; wherein the plurality of attribute information includes the texture index information;
when the state information indicates that texture sampling needs to be performed, reading texture data and the plurality of attribute information of each fragment from the video memory, and inputting the read texture data and the plurality of attribute information of each fragment into a fragment shader for processing.

2. The method according to claim 1, wherein the state information includes a sampling identifier, and the sampling identifier is used to indicate whether texture sampling needs to be performed; after obtaining the state information corresponding to the drawing command, the method further comprises:
determining, according to the sampling identifier in the state information, whether texture sampling needs to be performed.

3. The method according to claim 1 or 2, wherein before sending the feedback information to the CPU, the method further comprises:
writing the storage address, in the video memory, of the plurality of attribute information of each fragment into the state information;
storing the state information, with the storage address written into it, in the video memory according to a target address carried by the drawing command.

4. The method according to claim 3, wherein reading the texture data and the plurality of attribute information of each fragment from the video memory comprises:
reading the texture data from the video memory according to a texture video memory address in the state information; wherein the texture video memory address is written into the state information by the CPU according to the address of the texture data in the video memory;
reading the plurality of attribute information of each fragment from the video memory according to an attribute information storage address in the state information.

5. The method according to claim 1, wherein the drawing command corresponds to a plurality of pieces of state information, each piece of state information corresponds to one object to be rendered, each piece of state information includes a sampling identifier, and the sampling identifier included in each piece of state information is used to indicate whether texture sampling needs to be performed on the corresponding object to be rendered;
after obtaining the state information corresponding to the drawing command, the method further comprises:
determining, according to the sampling identifier in each piece of state information, whether texture sampling needs to be performed on the corresponding object to be rendered;
wherein, when the state information indicates that texture sampling does not need to be performed, performing geometry processing and attribute interpolation calculation on the object to be rendered to determine the plurality of attribute information of each fragment comprises:
when the drawing command corresponds to state information indicating that texture sampling does not need to be performed, performing geometry processing and attribute interpolation calculation on the object to be rendered corresponding to that state information, to determine the plurality of attribute information of each fragment.

6. The method according to claim 5, wherein, when the state information indicates that texture sampling needs to be performed, reading the texture data and the plurality of attribute information of each fragment from the video memory comprises:
when the drawing command corresponds to state information indicating that texture sampling needs to be performed, reading from the video memory, for the object to be rendered corresponding to that state information, the texture data corresponding to the object to be rendered and the plurality of attribute information of each fragment.

7. The method according to claim 1, wherein, when the state information indicates that texture sampling does not need to be performed, performing geometry processing and attribute interpolation calculation on the object to be rendered to determine the plurality of attribute information of each fragment, and storing the plurality of attribute information of each fragment in the video memory, comprises:
when the state information indicates that texture sampling does not need to be performed, performing geometry processing and attribute interpolation calculation on the object to be rendered to determine the plurality of attribute information of each fragment;
performing depth testing on a plurality of fragments to retain the fragments that are not occluded;
storing the plurality of attribute information of the fragments that are not occluded in the video memory.

8. A graphics processor, comprising:
a command receiving module, configured to receive a drawing command submitted by a CPU;
a state information acquisition module, configured to obtain state information corresponding to the drawing command;
a geometry processing module, configured to perform geometry processing on an object to be rendered when the state information indicates that texture sampling does not need to be performed;
an interpolation calculation module, configured to, when the state information indicates that texture sampling does not need to be performed, perform attribute interpolation calculation on the object to be rendered to determine a plurality of attribute information of each fragment, store the plurality of attribute information of each fragment in a video memory, and send feedback information to the CPU so that the CPU loads corresponding texture data into the video memory according to texture index information; wherein the plurality of attribute information includes the texture index information;
a texture sampler, configured to, when the state information indicates that texture sampling needs to be performed, read texture data and the plurality of attribute information of each fragment from the video memory, and input the read texture data and the plurality of attribute information of each fragment into a fragment shader for processing.

9. The graphics processor according to claim 8, wherein the state information includes a sampling identifier, the sampling identifier is used to indicate whether texture sampling needs to be performed, and the texture sampler is further configured to determine, according to the sampling identifier in the state information, whether texture sampling needs to be performed.

10. The graphics processor according to claim 8 or 9, wherein, before sending the feedback information to the CPU, the interpolation calculation module is further configured to write the storage address, in the video memory, of the plurality of attribute information of each fragment into the state information, and to store the state information, with the storage address written into it, in the video memory according to a target address carried by the drawing command.

11. The graphics processor according to claim 10, wherein, when reading the texture data and the plurality of attribute information of each fragment from the video memory, the texture sampler is specifically configured to:
read the texture data from the video memory according to a texture video memory address in the state information; wherein the texture video memory address is written into the state information by the CPU according to the address of the texture data in the video memory;
read the plurality of attribute information of each fragment from the video memory according to an attribute information storage address in the state information.

12. The graphics processor according to claim 8, wherein the drawing command corresponds to a plurality of pieces of state information, each piece of state information corresponds to one object to be rendered, each piece of state information includes a sampling identifier, and the sampling identifier included in each piece of state information is used to indicate whether texture sampling needs to be performed on the corresponding object to be rendered;
the texture sampler is further configured to determine, according to the sampling identifier in each piece of state information, whether texture sampling needs to be performed on the corresponding object to be rendered;
when performing geometry processing on the object to be rendered, the geometry processing module is specifically configured to, when the drawing command corresponds to state information indicating that texture sampling does not need to be performed, perform geometry processing on the object to be rendered corresponding to that state information;
when performing attribute interpolation calculation on the object to be rendered, the interpolation calculation module is specifically configured to, when the drawing command corresponds to state information indicating that texture sampling does not need to be performed, perform attribute interpolation calculation on the object to be rendered corresponding to that state information.

13. The graphics processor according to claim 12, wherein, when reading the texture data and the plurality of attribute information of each fragment from the video memory, the texture sampler is specifically configured to, when the drawing command corresponds to state information indicating that texture sampling needs to be performed, read from the video memory, for the object to be rendered corresponding to that state information, the texture data corresponding to the object to be rendered and the plurality of attribute information of each fragment.

14. The graphics processor according to claim 8, further comprising:
a depth testing module, configured to perform depth testing on a plurality of fragments to retain the fragments that are not occluded;
wherein, when storing the plurality of attribute information of each fragment in the video memory, the interpolation calculation module is specifically configured to store the plurality of attribute information of the fragments that are not occluded in the video memory.

15. A graphics processing system, comprising the graphics processor according to any one of claims 8 to 14.

16. An electronic device, comprising the graphics processing system according to claim 15.

17. Electronic equipment, comprising the electronic device according to claim 16.

18. An image rendering method, the method comprising:
submitting a drawing command to a GPU, the drawing command corresponding to first state information indicating that texture sampling does not need to be performed, so that the GPU performs geometry processing and attribute interpolation calculation on an object to be rendered to determine a plurality of attribute information of each fragment, and stores the plurality of attribute information of each fragment in a video memory; wherein the plurality of attribute information includes texture index information;
in response to feedback information from the GPU, loading corresponding texture data into the video memory according to the texture index information;
submitting a drawing command to the GPU again, the drawing command corresponding to second state information indicating that texture sampling needs to be performed, so that the GPU reads the texture data and the plurality of attribute information of each fragment from the video memory, and inputs the read texture data and the plurality of attribute information of each fragment into a fragment shader for processing.

19. The method according to claim 18, wherein before submitting the drawing command to the GPU, the method further comprises:
configuring the first state information, wherein the first state information includes a first sampling identifier, and the first sampling identifier is used to indicate that texture sampling does not need to be performed.

20. The method according to claim 19, wherein before submitting the drawing command to the GPU again, the method further comprises:
replacing the first sampling identifier in the first state information with a second sampling identifier to obtain the second state information, wherein the second sampling identifier is used to indicate that texture sampling needs to be performed; or configuring the second state information, wherein the second state information includes a second sampling identifier, and the second sampling identifier is used to indicate that texture sampling needs to be performed.

21. The method according to claim 18, wherein, in response to the feedback information from the GPU, loading the corresponding texture data into the video memory according to the texture index information comprises:
in response to the feedback information from the GPU, reading the plurality of attribute information of each fragment from the video memory according to an attribute information storage address recorded in the first state information, and loading the corresponding texture data into the video memory according to the texture index information in the plurality of attribute information.

22. The method according to claim 21, wherein before submitting the drawing command to the GPU again, the method further comprises:
writing the address of the texture data in the video memory into the second state information.

23. The method according to claim 18, wherein each drawing command submitted to the GPU corresponds to a plurality of state information, the plurality of state information includes first state information and second state information, the first state information includes a first sampling identifier for indicating that texture sampling does not need to be performed, the second state information includes a second sampling identifier for indicating that texture sampling needs to be performed, and each state information corresponds to one object to be rendered;
before each submission of a drawing command to the GPU, the method further comprises:
replacing the first sampling identifier in the first state information corresponding to the previous drawing command with the second sampling identifier to obtain second state information;
configuring first state information for a new object to be rendered, wherein the sampling identifier in the first state information is the first sampling identifier;
using the obtained second state information and the configured first state information as the plurality of state information corresponding to the current drawing command.
CN202211088150.3A 2022-09-07 2022-09-07 Image rendering method, graphics processor, graphics processing system, device and equipment Active CN117670642B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211088150.3A CN117670642B (en) 2022-09-07 2022-09-07 Image rendering method, graphics processor, graphics processing system, device and equipment

Publications (2)

Publication Number Publication Date
CN117670642A CN117670642A (en) 2024-03-08
CN117670642B true CN117670642B (en) 2024-11-19

Family

ID=90079543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211088150.3A Active CN117670642B (en) 2022-09-07 2022-09-07 Image rendering method, graphics processor, graphics processing system, device and equipment

Country Status (1)

Country Link
CN (1) CN117670642B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106296782A (en) * 2015-05-29 2017-01-04 Tcl集团股份有限公司 A kind of word rendering intent and word rendering device
CN107657648A (en) * 2017-09-30 2018-02-02 广州悦世界信息科技有限公司 The colouring method and system of real-time high-efficiency in a kind of moving game

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6333743B1 (en) * 1997-10-23 2001-12-25 Silicon Graphics, Inc. Method and apparatus for providing image and graphics processing using a graphics rendering engine
JP4266233B2 (en) * 2007-03-28 2009-05-20 株式会社東芝 Texture processing device
CN105094920B (en) * 2015-08-14 2018-07-03 网易(杭州)网络有限公司 A kind of game rendering intent and device
KR102651126B1 (en) * 2016-11-28 2024-03-26 삼성전자주식회사 Graphic processing apparatus and method for processing texture in graphics pipeline
CN110415161B (en) * 2019-07-19 2023-06-27 龙芯中科(合肥)技术有限公司 Graphics processing method, device, equipment and storage medium
GB2590748B (en) * 2020-06-30 2022-02-02 Imagination Tech Ltd Method and graphics processing system for rendering one or more fragments having shader-dependent properties
CN112785676B (en) * 2021-02-08 2024-04-12 腾讯科技(深圳)有限公司 Image rendering method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN117670642A (en) 2024-03-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant