CN104240285A

CN104240285A - Method for transmitting mass data of CPU to GPU through VTF technology

Info

Publication number: CN104240285A
Application number: CN201410470451.1A
Authority: CN
Inventors: 张翼
Original assignee: Of Ancient India Day Infotech Share Co Ltd In Wuxi
Current assignee: Of Ancient India Day Infotech Share Co Ltd In Wuxi
Priority date: 2014-09-16
Filing date: 2014-09-16
Publication date: 2014-12-24

Abstract

The invention discloses a method for transmitting mass data from a CPU to a GPU using VTF (Vertex Texture Fetch) technology. The method comprises the following steps: the CPU creates a texture and writes data into it; the GPU reads the texture containing the written data from the CPU and performs point sampling to obtain the data written in the texture. The mass data are stored in a texture, and the texture is then transmitted to the GPU so that the mass data can enter the GPU for computation. The data can be accessed in a vertex shader program, which solves the problem that mass data cannot be read.

Description

Method for transmitting mass data from the CPU to the GPU using VTF technology

Technical field

The present invention relates to the field of computer image processing, and in particular to a method for transmitting mass data from the CPU to the GPU using VTF technology.

Background technology

With the development of computer technology, more and more data sets have reached the TB or even PB scale, which requires computers to have powerful floating-point computing capability. Driven by the game market, HD video, consumer electronics and military simulation, GPUs now support increasingly complex computation in order to achieve realistic graphics. As graphics and data computation grow ever more complex, they pose a severe challenge to the speed at which computers process data.

At present, a common texture image (an RGB texture) is made up of R, G and B channels. Each channel value ranges from 0 to 255, or 0x00 to 0xFF in hexadecimal, and occupies one byte of 8 bits. A color comprising R, G and B therefore occupies 24 bits, so 2^24 colors can be expressed.

Computer processing is mainly carried out by the CPU (central processing unit) and the GPU (graphics processing unit). The CPU is responsible for computation with stronger logic, while the GPU is responsible for computationally dense image rendering. The GPU rendering pipeline is divided into stages: the application stage, the geometry stage and the rasterization stage. The GPU is programmable through shaders, commonly called shader programs, which fall into two classes: vertex shader programs and pixel shader programs.

When an image is processed, data must be passed from the CPU to the GPU. When a large amount of data, or even more, is passed from the CPU into a GPU vertex shader program, the problem arises that these mass data cannot be read.
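The channel arithmetic described above can be sketched in a few lines. The function names `packRgb` and `colorCount` and the 0xRRGGBB layout are assumptions made for illustration; the text only states that each of the three channels occupies one byte.

```cpp
#include <cassert>
#include <cstdint>

// Pack three 8-bit channels (each 0x00..0xFF) into one 24-bit value.
// The 0xRRGGBB byte order is an assumption for illustration.
std::uint32_t packRgb(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return (static_cast<std::uint32_t>(r) << 16) |
           (static_cast<std::uint32_t>(g) << 8) |
           static_cast<std::uint32_t>(b);
}

// Number of distinct colors for three channels of the given bit width:
// 2^(3 * bitsPerChannel).
std::uint64_t colorCount(unsigned bitsPerChannel) {
    assert(bitsPerChannel > 0 && bitsPerChannel <= 16);
    return std::uint64_t{1} << (3 * bitsPerChannel);
}
```

With 8 bits per channel, `colorCount(8)` evaluates to 16,777,216, which is the number of colors a 24-bit RGB texel can express.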

Summary of the invention

The object of the present invention is, in view of the above problem, to propose a method for transmitting mass data from the CPU to the GPU using VTF technology, so as to solve the problem that mass data cannot be read.

To achieve the above object, the technical solution adopted by the present invention is as follows:

A method for transmitting mass data from the CPU to the GPU using VTF technology, comprising a step in which the CPU creates a texture and writes data into the texture; and

a step in which the GPU reads the texture containing the written data from the CPU and performs point sampling to obtain the data written in the texture.

Preferably, the CPU creating a texture and writing data into the texture specifically comprises:

setting the texture file type to texture image;

creating a dynamic texture, and then writing floating-point data into the texture.

Preferably, the dynamic texture is a 2D texture.

Preferably, the point sampling specifically comprises:

a step in which, so that the GPU can access the associated texture, the corresponding computer hardware maps the created textures one to one onto the sequence numbers set for them;

a step of setting the sampling mode of the GPU to the Clamp sampling mode; and

a step of passing the texture associated with the above sequence number into a sampling function to obtain the mass data.

The technical solution of the present invention has the following beneficial effects:

In the technical solution of the present invention, large batches of data are saved in a texture and the texture is then passed to the GPU, so that a large amount of data enters the GPU for computation and can be accessed in the vertex shader program, which solves the problem that mass data cannot be read.

The technical solution of the present invention is described in further detail below with reference to the drawings and embodiments.

Brief description of the drawings

Fig. 1 is a flowchart of the method for transmitting mass data from the CPU to the GPU using VTF technology according to an embodiment of the present invention.

Detailed description of the embodiments

The preferred embodiments of the present invention are described below with reference to the drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the present invention and are not intended to limit it.

As shown in Fig. 1, a method for transmitting mass data from the CPU to the GPU using VTF technology comprises a step in which the CPU creates a texture and writes data into the texture.

Preferably, the CPU creating a texture and writing data into the texture specifically comprises:

setting the texture file type to texture image;

creating a dynamic texture, and then writing floating-point data into the texture. The dynamic texture is a 2D texture.

Games need to load a large number of texture images for the interface; once loaded into memory, a texture image is a map object that the CPU and GPU can operate on. Depending on the data to be stored, different textures can be defined to store data for different purposes, but the existing DirectX 9.0 version supports at most four textures for this kind of storage.

In practical applications, four texture images are created for data storage, respectively:

D3DVERTEXTEXTURESAMPLER0, D3DVERTEXTEXTURESAMPLER1, D3DVERTEXTEXTURESAMPLER2, D3DVERTEXTEXTURESAMPLER3

So that the created texture can be accessed in the GPU, the texture must be set to a texture type that the GPU can recognize, as follows:

IDirect3DDevice::SetTexture(D3DVERTEXTEXTURESAMPLER0, IDirect3DTexture*); // sets the texture (map) file type for D3DVERTEXTEXTURESAMPLER0 to texture image

In general, a dynamic texture is created, the texture is locked (Lock), floating-point data are written into it, the texture is unlocked (Unlock), and finally the texture is passed to the GPU:

Texture->Lock();
for (int i = 0; i < w; i++)
    for (int j = 0; j < h; j++)
        Pixel[i][j] = pdata; // write data into the 2D texture
Texture->Unlock();

If there are other textures, they are set up in the same way, and the required data are then written into the created textures.
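The lock, write, unlock sequence above can be illustrated with a CPU-only sketch. It is a minimal stand-in under stated assumptions: `FloatTexture2D` and `writeData` are hypothetical names, a `std::vector` takes the place of the memory a locked dynamic texture would expose, and a row-major single-channel float layout is assumed. No Direct3D calls are made.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU-side stand-in for a single-channel floating-point 2D texture.
// In Direct3D 9 the memory would come from locking a dynamic texture;
// here a std::vector plays that role (an assumption for illustration).
struct FloatTexture2D {
    std::size_t width;
    std::size_t height;
    std::vector<float> texels;  // row-major, width * height entries

    FloatTexture2D(std::size_t w, std::size_t h)
        : width(w), height(h), texels(w * h, 0.0f) {}

    // Mutable access to the texel at column x, row y.
    float& at(std::size_t x, std::size_t y) {
        assert(x < width && y < height);
        return texels[y * width + x];
    }
};

// Copy a flat data array into the texture row by row. Texels beyond the
// end of the data keep their initial value of 0.
void writeData(FloatTexture2D& tex, const std::vector<float>& data) {
    assert(data.size() <= tex.width * tex.height);
    for (std::size_t i = 0; i < data.size(); ++i) {
        tex.at(i % tex.width, i / tex.width) = data[i];
    }
}
```

Writing five values into a 4 x 2 texture this way fills the first row and the first texel of the second row; the reading side must use the same row-major convention to find each value again.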

Preferably, the point sampling specifically comprises:

a step in which, so that the GPU can access the associated texture, the corresponding computer hardware maps the created textures one to one onto the sequence numbers set for them;

a step of setting the sampling mode of the GPU to the Clamp sampling mode.

Vertex Texture Fetch, VTF for short, is a feature of the shader model by which textures are fetched in the vertex shader of the GPU pipeline, taking advantage of the GPU's high-speed parallel processing and floating-point capability. Normally only concrete values are passed into a shader, with data types such as int (integer), float (floating point), bool (Boolean) or matrix, of which the largest, matrix4, holds 16 values. When the data to be passed into the vertex shader are very large, the VTF feature is used instead: the data are stored in a texture, whose contents are then no longer pixel values. Through this medium, a large amount of data enters the GPU for computation and can be accessed in the vertex shader.

So that the textures from the previous step can be accessed in the GPU, the corresponding computer hardware maps the created textures one to one onto the sequence numbers set for them, as follows:

D3DVERTEXTEXTURESAMPLER0 is associated with sequence number s0
D3DVERTEXTEXTURESAMPLER1 is associated with sequence number s1
D3DVERTEXTEXTURESAMPLER2 is associated with sequence number s2
D3DVERTEXTEXTURESAMPLER3 is associated with sequence number s3

The associated texture is then sampled. There are many kinds of sampling; point sampling and linear sampling are the ones commonly used. Sampling here means sampling at texture coordinates: the coordinates are passed to a sampling function, which returns a pixel.

In general, a texture image is square, with height H and width W (H and W are each of size 2^n, where n is an integer). When a texture is mapped onto a polygon or curved surface and transformed to screen coordinates, a single texture pixel rarely corresponds to a single pixel of the screen image. Depending on the transformation and the texture used, a single screen pixel may correspond to a small part of one texture pixel (magnification) or to a large group of texture pixels (minification). In a texture image, the coordinates U, V of each pixel lie in the range [0, 1]. For example, for a texture image of size 512*512, the U, V coordinates of the pixel at position 48*48 are [48/512, 48/512].
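The coordinate calculation in the 512*512 example can be written out as a short sketch. The names `UV`, `texelToUV` and `uvToTexel` are hypothetical; the forward mapping follows the formula in the text (position divided by size), and the inverse mapping is an assumption added for illustration.

```cpp
#include <cassert>

struct UV {
    double u;
    double v;
};

// Normalized coordinates of texel (x, y) in a width x height texture,
// following the formula in the text: u = x / W, v = y / H.
UV texelToUV(int x, int y, int width, int height) {
    assert(width > 0 && height > 0);
    assert(x >= 0 && x < width && y >= 0 && y < height);
    return UV{static_cast<double>(x) / width, static_cast<double>(y) / height};
}

// Inverse mapping (an assumption for illustration): the texel index that
// a coordinate in [0, 1) falls into along one axis.
int uvToTexel(double coord, int size) {
    assert(size > 0 && coord >= 0.0 && coord < 1.0);
    return static_cast<int>(coord * size);
}
```

For texel (48, 48) in a 512 x 512 texture this yields u = v = 0.09375. Some pipelines address the texel center, (x + 0.5) / W, instead; which convention applies depends on the API and is not stated in the text.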

Linear sampling takes the four pixels surrounding this pixel and averages them to obtain the value at the pixel's U, V coordinates.
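Why the four-pixel averaging described above is unsuitable for reading stored data can be seen in a small sketch. `Grid`, `pointSample` and `averageOfFour` are hypothetical names, and the equal-weight average over a 2 x 2 block is a simplification: real hardware weights the four texels by distance.

```cpp
#include <cassert>
#include <vector>

// Row-major grid of data values standing in for a texture.
struct Grid {
    int width;
    int height;
    std::vector<float> values;

    float at(int x, int y) const {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        return values[y * width + x];
    }
};

// Point sampling: return exactly the value stored at (x, y).
float pointSample(const Grid& g, int x, int y) {
    return g.at(x, y);
}

// Simplified averaging sample: the plain mean of the 2 x 2 block whose
// top-left texel is (x, y). Equal weights are an assumption here.
float averageOfFour(const Grid& g, int x, int y) {
    assert(x + 1 < g.width && y + 1 < g.height);
    return (g.at(x, y) + g.at(x + 1, y) +
            g.at(x, y + 1) + g.at(x + 1, y + 1)) / 4.0f;
}
```

For a grid holding 10, 20, 30 and 40, point sampling at (0, 0) returns the stored 10, while the average returns 25, a value that was never written. When texels carry data rather than colors, only the first behavior preserves the data.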

The technical solution of the present invention uses point sampling, which returns the actual value stored at the pixel's U, V coordinates; specifically, the Clamp sampling mode is used, as follows:

IDirect3DDevice::SetSamplerState(D3DVERTEXTEXTURESAMPLER0, D3DSAMP_ADDRESSU, D3DTADDRESS_CLAMP); // sets the texture U coordinate to Clamp sampling

IDirect3DDevice::SetSamplerState(D3DVERTEXTEXTURESAMPLER0, D3DSAMP_ADDRESSV, D3DTADDRESS_CLAMP); // sets the texture V coordinate to Clamp sampling

IDirect3DDevice::SetSamplerState(D3DVERTEXTEXTURESAMPLER0, D3DSAMP_MINFILTER, D3DTEXF_POINT); // D3DTEXF_POINT selects point sampling; D3DSAMP_MINFILTER specifies the minification filter

IDirect3DDevice::SetSamplerState(D3DVERTEXTEXTURESAMPLER0, D3DSAMP_MAGFILTER, D3DTEXF_POINT); // D3DTEXF_POINT selects point sampling; D3DSAMP_MAGFILTER specifies the magnification filter
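The combined effect of Clamp addressing and point filtering can be emulated on the CPU. The following is a minimal sketch, not the GPU's exact rule: `clampCoord` and `samplePointClamp` are hypothetical names, the texture is a row-major float array, and texel selection by `floor(u * width)` is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Clamp addressing: coordinates outside [0, 1] are pinned to the edge.
double clampCoord(double c) {
    return std::min(1.0, std::max(0.0, c));
}

// Point filtering with clamp addressing over a row-major float grid.
// The texel is chosen as floor(coord * size); the last texel absorbs
// coord == 1. This is a CPU emulation for illustration only.
float samplePointClamp(const std::vector<float>& texels,
                       int width, int height, double u, double v) {
    assert(width > 0 && height > 0);
    assert(texels.size() == static_cast<std::size_t>(width) * height);
    const int x = std::min(
        width - 1, static_cast<int>(std::floor(clampCoord(u) * width)));
    const int y = std::min(
        height - 1, static_cast<int>(std::floor(clampCoord(v) * height)));
    return texels[static_cast<std::size_t>(y) * width + x];
}
```

Out-of-range coordinates such as (-3, 5) resolve to the nearest edge texel instead of wrapping around, so a slightly inaccurate coordinate cannot return data from the opposite side of the texture.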

In the GPU vertex shader, the associated texture image is passed into the sampling function tex2Dlod(g_texture, uv) to obtain the mass data. The resulting data are then rasterized and passed to the pixel shader, where the shader program computes the final color values, which are output to the frame buffer and displayed on the screen.

In summary, the technical solution of the present invention solves the problem that mass data passed from the CPU cannot be read by GPU shader programs, and has advantages such as a wide range of use and a simple implementation. It is now being applied more and more widely in various fields. In skeletal animation for games, character animation is driven by bone matrices: the more bones there are, the more matrix transformations are needed, and the data become very large. These data are saved in a floating-point texture and then passed to the GPU for processing, so that the new position of the character after the matrix transformations can be computed, which effectively solves the problem that mass data cannot be processed.
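How bone matrices might be laid out in a floating-point texture can be sketched as follows. The layout is an assumption, since the text does not specify one: each texel is taken to hold four floats, so one 4 x 4 matrix occupies four consecutive texels. `Matrix4`, `packMatrices`, `texelsNeeded` and `fetchRow` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix4 = std::array<float, 16>;  // one 4x4 matrix, row-major

// Flatten matrices into a float buffer laid out as 4-float texels:
// each matrix occupies 4 consecutive texels, one per row.
std::vector<float> packMatrices(const std::vector<Matrix4>& bones) {
    std::vector<float> buffer;
    buffer.reserve(bones.size() * 16);
    for (const Matrix4& m : bones) {
        buffer.insert(buffer.end(), m.begin(), m.end());
    }
    return buffer;
}

// Number of 4-float texels needed to store the given number of matrices.
std::size_t texelsNeeded(std::size_t matrixCount) {
    return matrixCount * 4;
}

// Read row `row` (0..3) of matrix `index` back from the packed buffer,
// mirroring a shader that performs one fetch per matrix row.
std::array<float, 4> fetchRow(const std::vector<float>& buffer,
                              std::size_t index, std::size_t row) {
    assert(row < 4);
    const std::size_t base = (index * 4 + row) * 4;
    assert(base + 4 <= buffer.size());
    return std::array<float, 4>{{buffer[base], buffer[base + 1],
                                 buffer[base + 2], buffer[base + 3]}};
}
```

Under this layout, 100 bones need 400 texels, which fits in a 32 x 16 texture, and a vertex can rebuild its bone matrix with four fetches at consecutive texel positions.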

When simulating water animation, the water is a model made up of many meshes; water is a liquid whose shape changes constantly, so the amount of data is large. Texture images are therefore created on the CPU for the position offsets and the normals of the water vertices respectively. The program holds an offset for each vertex, so N vertices have N offset values, and the normals also change continuously; the differing offset values of the vertices produce the water animation. Likewise, when a large number of entities are processed, each entity has a different position and orientation, which also requires a large amount of data. These data are written into a texture and then passed to the GPU for processing, which completes the access to the mass data.
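The per-vertex lookup in the water example can be emulated on the CPU. This is a sketch under stated assumptions: `Vec3`, `TexelPos`, `texelForVertex` and `displace` are hypothetical names, vertex offsets are stored one per texel in row-major order, and the loop stands in for what each vertex shader invocation would do with its own vertex index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 {
    float x, y, z;
};

struct TexelPos {
    std::size_t x, y;
};

// Texel position of the entry for vertex `id` in a texture `width` wide,
// using the same row-major layout on the writing and the reading side.
TexelPos texelForVertex(std::size_t id, std::size_t width) {
    assert(width > 0);
    return TexelPos{id % width, id / width};
}

// Emulates the per-vertex step: look up this vertex's offset in the
// texture data and add it to the rest position.
std::vector<Vec3> displace(const std::vector<Vec3>& rest,
                           const std::vector<Vec3>& offsetTexels,
                           std::size_t width) {
    std::vector<Vec3> out;
    out.reserve(rest.size());
    for (std::size_t id = 0; id < rest.size(); ++id) {
        const TexelPos p = texelForVertex(id, width);
        const std::size_t idx = p.y * width + p.x;
        assert(idx < offsetTexels.size());
        const Vec3& o = offsetTexels[idx];
        out.push_back(Vec3{rest[id].x + o.x, rest[id].y + o.y, rest[id].z + o.z});
    }
    return out;
}
```

Each frame, only the offset data need to be rewritten; the rest positions of the N vertices stay unchanged while the N offsets animate the surface.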

The technique can also be used in fields such as military simulation, satellite sounding and bioscience, which are not described further here.

Finally, it should be noted that the foregoing are only preferred embodiments of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, a person skilled in the art may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of their technical features. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method for transmitting mass data from the CPU to the GPU using VTF technology, characterized by comprising a step in which the CPU creates a texture and writes data into the texture;

2. The method for transmitting mass data from the CPU to the GPU using VTF technology according to claim 1, characterized in that the CPU creating a texture and writing data into the texture specifically comprises:

setting the texture file type to texture image;

creating a dynamic texture, and then writing floating-point data into the texture.

3. The method for transmitting mass data from the CPU to the GPU using VTF technology according to claim 2, characterized in that the dynamic texture is a 2D texture.

4. The method for transmitting mass data from the CPU to the GPU using VTF technology according to claim 2 or 3, characterized in that the point sampling specifically comprises:

a step of setting the sampling mode of the GPU to the Clamp sampling mode;