tems [31]. The ability to access gradient information is a unique capability
of the pixel shader, not shared by any of the other programmable shader
stages.
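In HLSL, this gradient information is exposed through the ddx() and ddy()
intrinsics, which return the rate of change of a value between horizontally
and vertically adjacent pixels. As a minimal sketch (the shader name and
the use made of the gradients here are our own, not from any particular
system):

float4 grad_PS(float2 uv : TEXCOORD0) : COLOR
{
    float2 du = ddx(uv);    // change in uv across one pixel horizontally
    float2 dv = ddy(uv);    // change in uv across one pixel vertically
    // A rough measure of how much texture space the pixel covers, akin
    // to the quantity used to select a mipmap level.
    float footprint = length(du) + length(dv);
    return float4(footprint.xxx, 1.0);
}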
Pixel shader programs typically set the fragment color for merging in
the final merging stage. The depth value generated in the rasterization
stage can also be modified by the pixel shader. The stencil buffer value is
not modifiable, but rather is passed through to the merge stage. In SM 2.0
and on, a pixel shader can also discard incoming fragment data, i.e., gen-
erate no output. Such operations can cost performance, as optimizations
normally performed by the GPU cannot then be used. See Section 18.3.7
for details. Operations such as fog computation and alpha testing have
moved from being merge operations to being pixel shader computations in
SM 4.0 [123].
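For example, alpha testing in a DirectX 9 HLSL pixel shader amounts to
discarding fragments whose alpha falls below a threshold, via the clip()
intrinsic. A sketch (the texture, sampler, and threshold value are invented
for illustration):

texture DiffuseTex;
sampler DiffuseSamp = sampler_state { Texture = <DiffuseTex>; };

float4 alphaTest_PS(float2 uv : TEXCOORD0) : COLOR
{
    float4 c = tex2D(DiffuseSamp, uv);
    clip(c.a - 0.5);    // discard this fragment if its alpha is below 0.5
    return c;
}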
Current pixel shaders are capable of doing a huge amount of processing.
The ability to compute any number of values in a single rendering pass gave
rise to the idea of multiple render targets (MRT). Instead of saving results
of a pixel shader’s program to a single color buffer, multiple vectors could
be generated for each fragment and saved to different buffers. These buffers
must be the same dimensions, and some architectures require them each to
have the same bit depth (though with different formats, as needed). The
number of PS output registers in Table 3.1 refers to the number of separate
buffers accessible, i.e., 4 or 8. Any additional targets have limitations
that the displayable color buffer does not; for example, typically no
antialiasing can be performed on them. Even with these limitations, MRT
functionality is a powerful aid in performing rendering algorithms more
efficiently. If a number of intermediate result images are to be computed
from the same set of data, only a single rendering pass is needed, instead
of one pass per
output buffer. The other key capability associated with MRTs is the ability
to read from these resulting images as textures.
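As a sketch of how MRT output looks in DirectX 9 HLSL (the struct, shader
name, and choice of what to store in each buffer are invented for
illustration), a pixel shader simply returns one value per target:

struct mrtOutput {
    float4 Color  : COLOR0;    // routed to the first render target
    float4 Normal : COLOR1;    // routed to the second render target
};

mrtOutput mrt_PS(float3 WorldNormal : TEXCOORD2)
{
    mrtOutput OUT;
    float3 Nn = normalize(WorldNormal);
    OUT.Color  = float4(Nn * 0.5 + 0.5, 1.0);  // normal remapped to a color
    OUT.Normal = float4(Nn, 0.0);              // raw normal for a later pass
    return OUT;
}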
3.7 The Merging Stage
As discussed in Section 2.4.4, the merging stage is where the depths and
colors of the individual fragments (generated in the pixel shader) are com-
bined with the frame buffer. This stage is where stencil-buffer and Z-buffer
operations occur. Another operation that takes place in this stage is color
blending, which is most commonly used for transparency and compositing
operations (see Section 5.7).
The merging stage occupies an interesting middle point between the
fixed-function stages, such as clipping, and the fully programmable shader
stages. Although it is not programmable, its operation is highly config-
urable. Color blending in particular can be set up to perform a large
number of different operations. The most common are combinations of
multiplication, addition, and subtraction involving the color and alpha val-
ues, but other operations are possible, such as minimum and maximum,
as well as bitwise logic operations. DirectX 10 added the capability to
blend two colors from the pixel shader with the frame buffer color—this
capability is called dual-color blending.
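As a concrete illustration of this configurability, conventional "over"
alpha blending can be requested with a handful of render states inside a
DirectX 9 effect pass (a sketch; effect files themselves are introduced in
the next section):

pass pBlend {
    AlphaBlendEnable = true;        // turn on color blending in the merger
    SrcBlend         = SrcAlpha;    // scale the incoming color by its alpha
    DestBlend        = InvSrcAlpha; // scale the frame buffer color by 1 - alpha
    BlendOp          = Add;         // add the two scaled colors together
}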
If MRT functionality is employed, then blending can be performed on
multiple buffers. DirectX 10.1 introduced the capability to perform dif-
ferent blend operations on each MRT buffer. In previous versions, the
same blending operation was always performed on all buffers (note that
dual-color blending is incompatible with MRT).
3.8 Effects
This tour of the pipeline has focused so far on the various programmable
stages. While vertex, geometry, and pixel shader programs are necessary
to control these stages, they do not exist in a vacuum. First, an individual
shader program is not particularly useful in isolation: A vertex shader pro-
gram feeds its results to a pixel shader. Both programs must be loaded for
any work to be done. The programmer must perform some matching of the
outputs of the vertex shader to the inputs of the pixel shader. A particu-
lar rendering effect may be produced by any number of shader programs
executed over a few passes. Beyond the shader programs themselves, state
variables must sometimes be set in a particular configuration for these pro-
grams to work properly. For example, the renderer’s state includes whether
and how the Z-buffer and stencil buffer are each used, and how a fragment
affects the existing pixel value (e.g., replace, add, or blend).
For these reasons, various groups have developed effects languages, such
as HLSL FX, CgFX, and COLLADA FX. An effect file attempts to encap-
sulate all the relevant information needed to execute a particular rendering
algorithm [261, 974]. It typically defines some global arguments that can
be assigned by the application. For example, a single effect file might define
the vertex and pixel shaders needed to render a convincing plastic material.
It would expose arguments such as the plastic color and roughness so that
these could be changed for each model rendered, but using the same effect
file.
To show the flavor of an effect file, we will walk through a trimmed-
down example taken from NVIDIA’s FX Composer 2 effects system. This
DirectX 9 HLSL effect file implements a very simplified form of Gooch
shading [423]. One part of Gooch shading is to use the surface normal and
compare it to the light’s location. If the normal points toward the light, a
warm tone is used to color the surface; if it points away, a cool tone is used.
Figure 3.8. Gooch shading, varying from a warm orange to a cool blue. (Image produced
by FX Composer 2, courtesy of NVIDIA Corporation.)
Angles in between interpolate between these two user-defined colors. This
shading technique is a form of non-photorealistic rendering, the subject of
Chapter 11. An example of this effect in action is shown in Figure 3.8.
Effect variables are defined at the beginning of the effect file. The
first few variables are “untweakables,” parameters related to the camera
position that are automatically tracked for the effect:
float4x4 WorldXf : World;
float4x4 WorldITXf : WorldInverseTranspose;
float4x4 WvpXf : WorldViewProjection;
The syntax is type id : semantic. The type float4x4 is used for matrices,
the name is user defined, and the semantic is a built-in name. As the se-
mantic names imply, the WorldXf is the model-to-world transform matrix,
the WorldITXf is the inverse transpose of this matrix, and the WvpXf is
the matrix that transforms from model space to the camera’s clip space.
Values with recognized semantics such as these are expected to be provided
by the application, and so are not shown in the user interface.
Next, the user-defined variables are specified:
float3 Lamp0Pos : Position <
string Object = "PointLight0";
string UIName = "Lamp 0 Position";
string Space = "World";
> = {-0.5f, 2.0f, 1.25f};
float3 WarmColor <
string UIName = "Gooch Warm Tone";
string UIWidget = "Color";
> = {1.3f, 0.9f, 0.15f};
float3 CoolColor <
string UIName = "Gooch Cool Tone";
string UIWidget = "Color";
> = {0.05f, 0.05f, 0.6f};
Here some additional annotations are provided inside the angle brackets
<> and then default values are assigned. The annotations are application-
specific and have no meaning to the effect or to the shader compiler. Such
annotations can be queried by the application. In this case the annotations
describe how to expose these variables within the user interface.
Data structures for shader input and output are defined next:
struct appdata {
float3 Position : POSITION;
float3 Normal : NORMAL;
};
struct vertexOutput {
float4 HPosition : POSITION;
float3 LightVec : TEXCOORD1;
float3 WorldNormal : TEXCOORD2;
};
The appdata defines what data is at each vertex in the model and so
defines the input data for the vertex shader program. The vertexOutput
is what the vertex shader produces and the pixel shader consumes. The
use of TEXCOORD* as the output names is an artifact of the evolution of the
pipeline. At first, multiple textures could be attached to a surface, so these
additional data fields are called texture coordinates. In practice, these fields
hold any data that is passed from the vertex to the pixel shader.
Next, the various shader program code elements are defined. We have
only one vertex shader program:
vertexOutput std_VS(appdata IN) {
vertexOutput OUT;
float4 No = float4(IN.Normal,0);
OUT.WorldNormal = mul(No,WorldITXf).xyz;
float4 Po = float4(IN.Position,1);
float4 Pw = mul(Po,WorldXf);
OUT.LightVec = (Lamp0Pos - Pw.xyz);
OUT.HPosition = mul(Po,WvpXf);
return OUT;
}
This program first computes the surface’s normal in world space by using
a matrix multiplication. Transforms are the subject of the next chapter,
so we will not explain why the inverse transpose is used here. The position
in world space is also computed by applying the world transform WorldXf. This
location is subtracted from the light’s position to obtain the direction vector
from the surface to the light. Finally, the object’s position is transformed
into clip space, for use by the rasterizer. This is the one required output
from any vertex shader program.
Given the light’s direction and the surface normal in world space, the
pixel shader program computes the surface color:
float4 gooch_PS(vertexOutput IN) : COLOR
{
float3 Ln = normalize(IN.LightVec);
float3 Nn = normalize(IN.WorldNormal);
float ldn = dot(Ln,Nn);
float mixer = 0.5 * (ldn + 1.0);
float4 result = float4(lerp(CoolColor, WarmColor, mixer), 1);
return result;
}
The vector Ln is the normalized light direction and Nn the normalized
surface normal. By normalizing, the dot product ldn of these two vectors
then represents the cosine of the angle between them. We want to linearly
interpolate between the cool and warm tones using this value. The function
lerp() expects a mixer value between 0 and 1, where 0 means to use the
CoolColor, 1 the WarmColor, and values in between to blend the two. Since
the cosine of an angle gives a value in [-1, 1], the mixer computation
transforms this range to [0, 1]. This value is then used to blend the tones
and produce
a fragment with the proper color. These shaders are functions. An effect
file can consist of any number of functions and can include commonly used
functions from other effects files.
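To make the interpolation explicit, lerp(a, b, t) returns a + t*(b - a),
so the blend in gooch_PS could equivalently be written as:

float3 blended = CoolColor + mixer * (WarmColor - CoolColor);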
A pass typically consists of a vertex and pixel (and geometry) shader,
along with any state settings needed for the pass. (In DirectX 9 and
earlier, a pass can also have no shaders and instead control the
fixed-function pipeline.) A technique is a set of one or more passes to
produce the desired effect. This simple file has one technique, which has
one pass:

technique Gooch < string Script = "Pass=p0;"; > {
    pass p0 < string Script = "Draw=geometry;"; > {
        VertexShader = compile vs_2_0 std_VS();
        PixelShader = compile ps_2_a gooch_PS();
    }
}