14.6. Occlusion Culling 675
for more on latency). Hence, this GPU-based occlusion culling method is
worthwhile when the bounding boxes contain a large number of objects
and a relatively large amount of occlusion is occurring.
On HP’s VISUALIZE fx hardware (circa 2000), bounding boxes are
created automatically and queried for sufficiently long display lists [213].
It has been shown that using tighter bounding volumes can speed up rendering.
Bartz et al. [69] achieved a 50 percent increase in frame rate using
k-DOPs (26-DOPs) for mechanical CAD models (where the interiors of the
objects often are complex).
The GPU’s occlusion query has been used as the basic building block
for a number of algorithms. Meißner et al. [851] use occlusion queries in a
hierarchical setting. The scene is represented in a hierarchical data structure,
such as a BVH or octree. First, view frustum culling is used to find
nodes not fully outside the view frustum. These are sorted, based on a
node’s center point (for example, the center of a bounding box), in
front-to-back order. The nearest leaf node is rendered to the frame buffer
without occlusion testing. Using the occlusion query, the BVs of subsequent
objects are tested. If a BV is visible, its contents are tested recursively,
or rendered. Klosowski and Silva have developed a constant-frame-rate
algorithm using an occlusion culling algorithm, called the prioritized-layered
projection algorithm [674]. However, the original version was not conservative,
i.e., it sacrificed image quality in order to keep a constant frame
rate. Later, they developed a conservative version of the algorithm, using
occlusion queries [675].
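
The hierarchical scheme of Meißner et al. described above can be sketched
with a toy software model. This is only an illustration under stated
assumptions, not their implementation: the "screen" is a one-dimensional
row of pixels, every leaf is a flat slab at a single depth, and the
occlusion query is simulated by counting the pixels of a bounding volume
that would pass the depth test. The names Node, Renderer, leaf, and group
are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy BVH node: a pixel span [x0, x1) and a depth range (assumed model)."""
    name: str
    x0: int
    x1: int
    z_near: float
    z_far: float
    children: List["Node"] = field(default_factory=list)


def leaf(name, x0, x1, z):
    """A leaf is a flat slab at depth z covering pixels x0..x1-1."""
    return Node(name, x0, x1, z, z)


def group(name, children):
    """An inner node whose bounding volume encloses all of its children."""
    return Node(name,
                min(c.x0 for c in children), max(c.x1 for c in children),
                min(c.z_near for c in children), max(c.z_far for c in children),
                children)


class Renderer:
    def __init__(self, width):
        self.width = width
        self.depth = [float("inf")] * width  # simulated depth buffer
        self.drawn = []                      # names of leaves actually rendered
        self.queries = 0                     # number of occlusion queries issued

    def _span(self, node):
        # Pixels of the node that fall on the screen.
        return range(max(node.x0, 0), min(node.x1, self.width))

    def in_frustum(self, node):
        # View frustum test: keep nodes that are not fully outside the screen.
        return node.x1 > 0 and node.x0 < self.width

    def occlusion_query(self, node):
        # Simulated query: count BV pixels that would pass the depth test.
        self.queries += 1
        return sum(1 for x in self._span(node) if node.z_near < self.depth[x])

    def draw(self, node):
        for x in self._span(node):
            self.depth[x] = min(self.depth[x], node.z_near)
        self.drawn.append(node.name)

    def traverse(self, node):
        if not self.in_frustum(node):
            return
        # Until something has been drawn nothing can occlude, so the nearest
        # leaf is rendered without issuing a query.
        if self.drawn and self.occlusion_query(node) == 0:
            return  # BV is hidden, so the whole subtree is skipped
        if not node.children:
            self.draw(node)
            return
        # Visit children front to back, sorted by the center of their BV.
        for child in sorted(node.children, key=lambda c: (c.z_near + c.z_far) / 2):
            self.traverse(child)


scene = group("root", [
    leaf("offscreen", 10, 12, 0.2),
    group("hidden_group", [leaf("crate", 1, 3, 5.0), leaf("barrel", 3, 5, 6.0)]),
    leaf("tower", 5, 8, 4.0),
    leaf("near_wall", 0, 6, 1.0),
])
renderer = Renderer(width=8)
renderer.traverse(scene)
print(renderer.drawn, renderer.queries)  # ['near_wall', 'tower'] 2
```

In this small scene a single query on the group's bounding volume rejects
both the crate and the barrel, which is where the hierarchy pays off.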
The research just described was done with graphics hardware that had a
serious limitation: When an occlusion query was made (with a result limited
to a boolean), the CPU’s execution was stalled until the query result was
returned. Modern GPUs have adopted a model in which the CPU can send
off any number of queries to the GPU, then periodically check to see if any
results are available. For its part, the GPU performs each query and puts
the result in a queue. The queue check by the CPU is extremely fast, and
the CPU can continue to send down queries or actual renderable objects
without having to stall.
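
The queue-of-queries model just described can be illustrated with a small
simulation. This is a sketch under assumptions, not a real graphics API:
SimulatedGPU, its fixed latency measured in "ticks," and the precomputed
visible-pixel counts are all hypothetical. In OpenGL, for instance, the
non-blocking check corresponds to asking glGetQueryObjectuiv for
GL_QUERY_RESULT_AVAILABLE before reading GL_QUERY_RESULT.

```python
from collections import deque


class SimulatedGPU:
    """Toy GPU that answers each query a fixed number of ticks after issue."""

    def __init__(self, latency):
        self.latency = latency
        self.clock = 0
        self.pending = deque()  # (ready_time, query_id, visible_pixels)
        self.results = {}       # finished queries waiting to be read

    def issue_query(self, query_id, visible_pixels):
        # Returns immediately; the CPU is never blocked here.
        self.pending.append((self.clock + self.latency, query_id, visible_pixels))

    def tick(self):
        # Advance time and move finished queries into the result queue.
        self.clock += 1
        while self.pending and self.pending[0][0] <= self.clock:
            _, query_id, pixels = self.pending.popleft()
            self.results[query_id] = pixels

    def result_available(self, query_id):
        # Cheap, non-blocking availability check.
        return query_id in self.results

    def get_result(self, query_id):
        return self.results.pop(query_id)


def run_frame(gpu, boxes):
    """Issue one query per bounding box, then poll without ever stalling.

    boxes maps a box name to the pixel count its query will report
    (supplied by the caller, since no real rasterization happens here).
    """
    for name, pixels in boxes.items():
        gpu.issue_query(name, pixels)

    outstanding = list(boxes)
    visible, other_work = [], 0
    while outstanding:
        still_waiting = []
        for name in outstanding:
            if gpu.result_available(name):
                if gpu.get_result(name) > 0:
                    visible.append(name)
            else:
                still_waiting.append(name)
        outstanding = still_waiting
        if outstanding:
            other_work += 1  # stand-in for submitting other objects meanwhile
            gpu.tick()
    return visible, other_work


gpu = SimulatedGPU(latency=3)
print(run_frame(gpu, {"a": 120, "b": 0, "c": 7}))  # (['a', 'c'], 3)
```

The three units of "other work" stand in for the queries or renderable
objects the CPU keeps sending while the results are in flight.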
How to effectively use this model of a queue of queries is an active area
of research. The problem is that a query must be done after some actual
geometry has been rendered to the screen, so that the bounding box tested
is more likely to be occluded, but not so late in the rendering process
that the CPU is left waiting for the results of the query. Among other
optimizations, Sekulic [1145] recommends taking advantage of temporal
coherence. Objects are tested for occlusion, but the results are not checked
until the next frame. If an object was found to be occluded in the previous
frame, it is not rendered this frame, but it is tested for occlusion again.
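
As a minimal sketch of this one-frame-delayed scheme (an illustration, not
Sekulic's code), the function below takes, for each frame, the result that
the occlusion query issued in that frame will eventually report, and
returns which objects are rendered in each frame. The function name and
the choice to render objects that have no result yet are assumptions.

```python
def render_with_temporal_coherence(query_results_per_frame):
    """Return the list of objects rendered in each frame.

    query_results_per_frame: one dict per frame, mapping an object name to
    True if the query issued in that frame finds the object visible.
    """
    previous = {}  # results of the queries issued during the last frame
    rendered_per_frame = []
    for current in query_results_per_frame:
        # Decide from last frame's results; objects without a result yet
        # (for example, in the first frame) are rendered (assumption).
        rendered = [name for name in current if previous.get(name, True)]
        rendered_per_frame.append(rendered)
        # Every object is queried again this frame, rendered or not; the
        # results are only read in the next frame.
        previous = dict(current)
    return rendered_per_frame


frames = [
    {"statue": True, "crate": False},
    {"statue": True, "crate": True},
    {"statue": False, "crate": True},
]
print(render_with_temporal_coherence(frames))
# [['statue', 'crate'], ['statue'], ['statue', 'crate']]
```
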
This gives an approximate occlusion test, since an object could be visible