i
i
i
i
i
i
i
i
Chapter 14
Acceleration Algorithms
“Now here, you see, it takes all the running you can do to keep
in the same place. If you want to get somewhere else, you
must run at least twice as fast as that!”
—Lewis Carroll
One of the great myths concerning computers is that one day we will have
enough processing power. Even in a relatively simple application like word
processing, we find that additional power can be applied to all sorts of
features, such as on-the-fly spell and grammar checking, antialiased text
display, voice recognition and dictation, etc.
In real-time rendering, we have at least four performance goals: more
frames per second, higher resolution and sampling rates, more realistic ma-
terials and lighting, and increased complexity. A speed of 60–85 frames per
second is generally considered a fast enough frame rate; see the Introduc-
tion for more on this topic. Even with motion blurring (Section 10.14),
which can lower the frame rate needed for image quality, a fast rate is still
needed to minimize latency when interacting with a scene [1329].
IBM’s T221 LCD display offers 3840 × 2400 resolution, with 200 dots
per inch (dpi) [1329]. A resolution of 1200 dpi, 36 times the T221’s den-
sity, is offered by many printer companies today. A problem with the
T221 is that it is difficult to update this many pixels at interactive rates.
Perhaps 1600 × 1200 is enough resolution for a while. Even with a limit
on screen resolution, antialiasing and motion blur increase the number of
samples needed for generating high-quality images. As discussed in Sec-
tion 18.1.1, the number of bits per color channel can also be increased,
which drives the need for higher precision (and therefore more costly)
computations.
As previous chapters have shown, describing and evaluating an object’s
material can be computationally complex. Modeling the interplay of light
645
i
i
i
i
i
i
i
i
646 14. Acceleration Algorithms
Figure 14.1. A “reduced” Boeing model with a mere 350 million triangles rendered with
ray tracing. Sectioning is performed by using a user-defined clipping plane. This model
can be rendered at 15 fps without shadows on 16 Opteron cores, using more flexible but
untuned OpenRT ray tracing code. It can also be rendered at about 4 fps at 640 × 480
with shadows, on heavily tuned code on a dual processor. Once camera movement stops,
the image can be progressively enhanced with antialiasing and soft shadowing. (Image
courtesy of Computer Graphics Group, Saarland University. Source 3D data provided
by and used with permission of the Boeing Company.)
and surface can absorb an arbitrarily high amount of computing power.
This is true because an image should ultimately be formed by the contri-
butions of light traveling down a limitless number of paths from an illumi-
nation source to the eye.
Frame rate, resolution, and shading can always be made more com-
plex, but there is some sense of diminishing returns to increasing any of
these. There is no real upper limit on scene complexity. The render-
ing of a Boeing 777 includes 132,500 unique parts and over 3,000,000 fas-
teners, which would yield a polygonal model with over 500,000,000 poly-
gons [206]. See Figure 14.1. Even if most of those objects are not seen
due to size or position, some work must be done to determine that this
is the case. Neither Z-buffering nor ray tracing can handle such mod-
els without the use of techniques to reduce the sheer number of compu-
tations needed. Our conclusion: Acceleration algorithms will always be
needed.
In this chapter, a sm¨org˚asbord of algorithms for accelerating computer
graphics rendering, in particular the rendering of large amounts of geome-
try, will be presented and explained. The core of many such algorithms is
based on spatial data structures, which are described in the next section.
Based on that knowledge, we then continue with culling techniques.These
are algorithms that try to find out which objects are at all visible and need
to be treated further. Level of detail techniques reduce the complexity of
rendering the remaining objects. Finally, systems for rendering very large
models, and point rendering algorithms, are briefly discussed.
i
i
i
i
i
i
i
i
14.1. Spatial Data Structures 647
14.1 Spatial Data Structures
A spatial data structure is one that organizes geometry in some n-
dimensional space. Only two- and three-dimensional structures are used
in this book, but the concepts can often easily be extended to higher di-
mensions. These data structures can be used to accelerate queries about
whether geometric entities overlap. Such queries are used in a wide variety
of operations, such as culling algorithms, during intersection testing and
ray tracing, and for collision detection.
The organization of spatial data structures is usually hierarchical. This
means, loosely speaking, that the topmost level encloses the level below it,
which encloses the level below that level, and so on. Thus, the structure
is nested and of recursive nature. The main reason for using a hierar-
chy is that different types of queries get significantly faster, typically an
improvement from O(n)toO(log n). It should also be noted that the con-
struction of most spatial data structures is expensive and is often done as
a preprocess. Lazy evaluation and incremental updates are possible in real
time.
Some common types of spatial data structures are bounding volume hier-
archies (BVHs), variants of binary space partitioning (BSP) trees,
quadtrees, and octrees. BSP trees and octrees are data structures based
on space subdivision. This means that the entire space of the scene is sub-
divided and encoded in the data structure. For example, the union of the
space of all the leaf nodes is equal to the entire space of the scene. Nor-
mally the leaf nodes’ volumes do not overlap, with the exception of less
common structures such as loose octrees. Most variants of BSP trees are
irregular, which loosely means that the space can be subdivided more ar-
bitrarily. The octree is regular, meaning that space is split in a uniform
fashion. Though more restrictive, this uniformity can often be a source
of efficiency. A bounding volume hierarchy, on the other hand, is not a
space subdivision structure. Rather, it encloses the regions of the space
surrounding geometrical objects, and thus the BVH need not enclose all
space.
BVHs, BSP trees, and octrees are all described in the following sections,
along with the scene graph, which is a data structure more concerned with
model relationships than with efficient rendering.
14.1.1 Bounding Volume Hierarchies
A bounding volume (BV) is a volume that encloses a set of objects. The
point of a BV is that it should be a much simpler geometrical shape than
the contained objects, so that doingtestsusingaBVcanbedonemuch
faster than using the objects themselves. Examples of BVs are spheres,
axis-aligned bounding boxes (AABBs), oriented bounding boxes (OBBs),
i
i
i
i
i
i
i
i
648 14. Acceleration Algorithms
Figure 14.2. The left part shows a simple scene with six objects, each enclosed by a
bounding sphere. The bounding spheres are then grouped together into larger bounding
spheres, until all objects are enclosed by the largest sphere. The right part shows the
bounding volume hierarchy (tree) that is used to represent the object hierarchy on the
left. The BV of the root encloses all objects in the scene.
and k-DOPs (see Section 16.2 for definitions). A BV does not contribute
visually to the rendered image. Instead, it is used as a proxy in place of the
bounded objects, to speed up rendering, selection, and other computations
and queries.
For real-time rendering of three-dimensional scenes, the bounding vol-
ume hierarchy (BVH) is often used for hierarchical view frustum culling
(see Section 14.3). The scene is organized in a hierarchical tree structure,
consisting of a root, internal nodes, and leaves. The topmost node is the
root, which has no parents. A leaf node holds the actual geometry to be
rendered, and it does not have any children. In contrast, an internal node
has pointers to its children. The root is thus an internal node, unless it
is the only node in the tree. Each node, including leaf nodes, in the tree
has a bounding volume that encloses the geometry in its entire subtree—
thus the name bounding volume hierarchy. This means that the root has
a BV that contains the entire scene. An example of a BVH is shown in
Figure 14.2.
The underlying structure of a BVH is a tree, and in the field of com-
puter science the literature on tree data structures is vast. Here, only a
few important results will be mentioned. For more information, see, for
example, the book Introduction to Algorithms by Cormen et al. [201].
Consider a k-ary tree, that is, a tree where each internal node has k
children. A tree with only one node (the root) is said to be of height 0.
A leaf node of the root is at height 1, and so on. A balanced tree is a
tree in which all leaf nodes either are at height h or h 1. In general, the
i
i
i
i
i
i
i
i
14.1. Spatial Data Structures 649
height, h, of a balanced tree is log
k
n,wheren is the total number of
nodes (internal and leaves) in the tree.
A full tree is one where all leaf nodes are at the same height, h.Some
properties of (only) full trees follow. The total number of nodes can be
computed as a geometric sum:
n = k
0
+ k
1
+ ...k
h1
+ k
h
=
k
h+1
1
k 1
. (14.1)
Thus, the number of leaf nodes, l,isl = k
h
, and the number of internal
nodes, i,isi = nl =
k
h
1
k1
. Assuming that only the number of leaf nodes,
l, is known, then the total number of nodes is n = i + l =
kl1
k1
.Fora
binary tree, where k =2,thisgivesn =2l 1. Note that a higher k gives a
tree with a lower height, which means that it takes fewer steps to traverse
the tree, but it also requires more work at each node. The binary tree is
often the simplest choice, and one that gives good performance. However,
there is evidence that a higher k (e.g., k =4ork = 8) gives slightly better
performance for some applications [731]. Using k =2,k =4,ork =8
makes it simple to construct trees; just subdivide along the longest axis for
k = 2, and for the two longest axes for k = 4, and for all axes for k =8. It
is much more difficult to form optimal trees for other values of k.
BVHs are also excellent for performing various queries. For example,
assume that a ray should be intersected with a scene, and the first inter-
section found should be returned. To use a BVH for this, testing starts at
the root. If the ray misses its BV, then the ray misses all geometry con-
tained in the BVH. Otherwise, testing continues recursively, that is, the
BVs of the children of the root are tested. As soon as a BV is missed by
the ray, testing can terminate on that subtree of the BVH. If the ray hits
a leaf node’s BV, the ray is tested against the geometry at this node. The
performance gains come partly from the fact that testing the ray with the
BV is fast. This is why simple objects such as spheres and boxes are used
as BVs. The other reason is the nesting of BVs, which allows us to avoid
testing large regions of space due to early termination in the tree.
Often the closest intersection, not the first found, is what is desired.
The only additional data needed are the distance and identity of the closest
object found while traversing the tree. The current closest distance is also
used to cull the tree during traversal. If a BV is intersected, but its distance
is beyond the closest distance found so far, then the BV can be discarded.
Normally we do not know which child of a BV is closest without testing,
so we must test all children in arbitrary order. As will be seen, a BSP
tree has an advantage over normal BVHs in that it can guarantee front-
to-back ordering. However, binary BVHs can also achieve this advantage
by keeping track of the axis used to split the set of objects into the two
children [132].
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset