Combining can be done one time and the buffer reused each frame for
sets of objects that are static. For dynamic objects, a single buffer can be
filled with a number of meshes. The limitation of this basic approach is
that all objects in a mesh need to use the same set of shader programs, i.e.,
the same material. However, it is possible to merge objects with different
colors, for example, by tagging each object’s vertices with an identifier.
This identifier is used by a shader program to look up what color is used
to shade the object. This same idea can be extended to other surface
attributes. Similarly, textures attached to surfaces can also hold identifiers
as to which material to use. Light maps of separate objects need to be
combined into texture atlases or arrays [961].
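
The identifier idea can be sketched in a few lines. This is only an illustration under stated assumptions, not an API from any graphics library: the names Mesh, merge_meshes, and shade_vertex are hypothetical, the "palette" stands in for whatever table the shader program would read, and shade_vertex stands in for the shader-side lookup.

```python
from dataclasses import dataclass


@dataclass
class Mesh:
    positions: list  # list of (x, y, z) tuples
    indices: list    # triangle indices into positions
    color: tuple     # one (r, g, b) color for the whole object


def merge_meshes(meshes):
    """Merge meshes into one vertex/index buffer plus a color palette.

    Each merged vertex is stored as (x, y, z, material_id).
    """
    vertices, indices, palette = [], [], []
    for mesh in meshes:
        # Reuse a palette slot when two objects share a color.
        if mesh.color not in palette:
            palette.append(mesh.color)
        material_id = palette.index(mesh.color)
        base = len(vertices)  # index offset of this mesh in the merged buffer
        vertices.extend((x, y, z, material_id) for (x, y, z) in mesh.positions)
        indices.extend(base + i for i in mesh.indices)
    return vertices, indices, palette


def shade_vertex(vertex, palette):
    """Stand-in for the shader's lookup: the identifier selects the color."""
    return palette[vertex[3]]
```

Merging a red and a blue triangle this way yields six vertices, one index list, and a two-entry palette, so both objects could be submitted with a single buffer and a single shader program.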
However, such practices can be taken too far. Adding branches and
different shading models to a single pixel shader program can be costly.
Sets of fragments are processed in parallel. If the fragments in a set do not all take
the same branch, then both branches must be evaluated for all fragments in the set.
Care has to be taken to avoid making pixel shader programs that use an
excessive number of registers. The number of registers used influences the
number of fragments that a pixel shader can handle at the same time in
parallel. See Section 18.4.2.
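
Both costs can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not a description of any particular GPU: group_cost and fragments_in_flight are hypothetical helpers, the costs are arbitrary units, and real hardware schedules work in more complicated ways.

```python
def group_cost(branch_taken, cost_if, cost_else):
    """Cost of one group of fragments that execute in lockstep.

    branch_taken holds one boolean per fragment in the group.
    """
    cost = 0
    if any(branch_taken):      # at least one fragment needs the 'if' side
        cost += cost_if
    if not all(branch_taken):  # at least one fragment needs the 'else' side
        cost += cost_else
    return cost


def fragments_in_flight(register_file_size, registers_per_fragment):
    """Fragments that fit at once, assuming registers are the only limit."""
    return register_file_size // registers_per_fragment
```

In this model a group whose fragments all agree pays for one side of the branch, while a mixed group pays for both. Likewise, doubling the registers a shader needs halves the number of fragments that fit at the same time.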
The other approach to minimize API calls is to use some form of
instancing. Most APIs support the idea of having an object and drawing it
a number of times in a single call. So instead of making a separate API
call for each tree in a forest, you make one call that renders many copies
of the tree model. This is typically done by specifying a base model and
providing a separate data structure that holds information about each
specific instance desired. Beyond position and orientation, other attributes
could be specified per instance, such as leaf colors or curvature due to the
wind, or anything else that could be used by shader programs to affect the
model. Lush jungle scenes can be created by liberal use of instancing. See
Figure 15.3. Crowd scenes are a good fit for instancing, with each character
appearing unique by having different body parts from a set of choices.
Further variation can be added by random coloring and decals [430]. Instancing
can also be combined with level of detail techniques [158, 279, 810, 811].
See Figure 15.4 for an example.
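
The split between a base model and per-instance data can be sketched as follows. This is a CPU-side illustration only, under assumed names: instance_vertex and draw_instanced are hypothetical, the per-instance record (an offset, a rotation about the y-axis, and a color) is one possible layout among many, and draw_instanced merely stands in for a single instanced draw call rather than calling any real API.

```python
import math


def instance_vertex(position, instance):
    """Apply one instance's data to a base-model vertex.

    The vertex is rotated about the y-axis by 'yaw', then translated.
    """
    x, y, z = position
    c, s = math.cos(instance["yaw"]), math.sin(instance["yaw"])
    ox, oy, oz = instance["offset"]
    return (c * x + s * z + ox, y + oy, -s * x + c * z + oz)


def draw_instanced(base_vertices, instances):
    """Stand-in for a single instanced draw call.

    Returns one transformed vertex list and color per instance.
    """
    return [
        {"color": inst["color"],
         "vertices": [instance_vertex(v, inst) for v in base_vertices]}
        for inst in instances
    ]


# One base "tree" and a small forest described only by per-instance data.
tree = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
forest = [
    {"offset": (10.0, 0.0, 5.0), "yaw": 0.0, "color": (0.1, 0.6, 0.1)},
    {"offset": (-4.0, 0.0, 2.0), "yaw": math.pi / 2, "color": (0.3, 0.5, 0.1)},
]
drawn = draw_instanced(tree, forest)
```

The base model is stored once. Adding another tree means appending one small record to forest, not duplicating the geometry or issuing another call.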
In theory, the geometry shader could be used for instancing, as it can
create duplicate data of an incoming mesh. In practice, this method is often
slower than using instancing API commands. The intent of the geometry
shader is to perform local, small scale amplification of data [1311].
Another way for the application to improve performance is to minimize
state changes by grouping objects with a similar rendering state (vertex and
pixel shader, texture, material, lighting, transparency, etc.) and rendering
them sequentially. When changing the state, there is sometimes a need
to wholly or partially flush the pipeline. For this reason, changing shader