834 18. Graphics Hardware
Figure 18.4. For single buffering, the front buffer is always shown. For double buffering,
buffer 0 is first in front and buffer 1 is in the back. Then they swap from front-to-back
and vice versa for each frame. Triple buffering works by having a pending buffer as
well. Here, a buffer is first cleared, and rendering to it is begun (pending). Second,
the system continues to use the buffer for rendering until the image has been completed
(back). Finally, the buffer is shown (front).
to the color buffer as the beam of the monitor passes those areas that are
being drawn. Sometimes called tearing, because the image displayed looks
as if it were briefly ripped in two, this is not a desirable feature for real-time
graphics.⁴
To avoid the visibility problem, double buffering is commonly used. In
this scheme, a finished image is shown in the front buffer, while an off-
screen back buffer contains the scene that is currently being drawn. The
back buffer and the front buffer are then swapped by the graphics driver,
typically during vertical retrace to avoid tearing. The swap does not have
to occur during retrace; instantly swapping is useful for benchmarking a
rendering system, but is also used in many applications because it max-
imizes frame rate. Immediately after the swap, the (new) back buffer is
then the recipient of graphics commands, and the new front buffer is shown
to the user. This process is shown in Figure 18.4. For applications that
control the whole screen, the swap is normally implemented using a color
buffer flipping technique [204], also known as page flipping. This means
that the front buffer is associated with the address of a special register.
This address points to the pixel at the position (0, 0), which may be at the
⁴On some ancient systems, like the old Amiga, you could actually test where the
beam was and so avoid drawing there, thus allowing single buffering to work.
18.1. Buffers and Buffering 835
lower or upper left corner. When buffers are swapped, the address of the
register is changed to the address of the back buffer.
For windowed applications, the common way to implement swapping is
to use a technique called BLT swapping [204] or, simply, blitting.
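The difference between the two swap strategies can be sketched in a few lines. The toy "display register" below is just an index, and all the names are invented for illustration; real drivers work with physical addresses:

```python
# Toy model of page flipping: the "display register" holds the address
# (here, simply an index) of the buffer currently scanned out to the screen.
class PageFlipDisplay:
    def __init__(self):
        self.buffers = [bytearray(4), bytearray(4)]  # two tiny color buffers
        self.front = 0  # contents of the display register

    def back(self):
        return 1 - self.front

    def flip(self):
        # Page flipping: only the register changes; no pixels are copied.
        self.front = 1 - self.front

def blt_swap(front, back):
    # BLT swapping: the back buffer's pixels are copied into the front buffer.
    front[:] = back

d = PageFlipDisplay()
d.buffers[d.back()][:] = b"\x01\x02\x03\x04"  # "render" into the back buffer
d.flip()
print(d.buffers[d.front])  # the rendered frame is shown without any copy
```

The sketch shows why page flipping is preferred for full-screen applications: the swap costs a register write instead of a full-screen copy.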
The double buffer can be augmented with a second back buffer, which
we call the pending buffer. This is called triple buffering [832]. The pending
buffer is similar to the back buffer in that it is also offscreen, and in that
it can be modified while the front buffer is being displayed. The pending
buffer becomes part of a three-buffer cycle. During one frame, the pending
buffer can be accessed. At the next swap, it becomes the back buffer, where
the rendering is completed. Then it becomes the front buffer and is shown
to the viewer. At the next swap, the buffer again turns into a pending
buffer. This course of events is visualized at the bottom of Figure 18.4.
Triple buffering has one major advantage over double buffering. Using
it, the system can access the pending buffer while waiting for the vertical
retrace. With double buffering, a swap can stall the graphics pipeline.
While waiting for the vertical retrace so a swap can take place, a double-
buffered construction must simply be kept waiting. This is so because the
front buffer must be shown to the viewer, and the back buffer must remain
unchanged because it has a finished image in it, waiting to be shown. The
drawback of triple buffering is that the latency increases up to one entire
frame. This increase delays the reaction to user inputs, such as keystrokes
and mouse or joystick moves. Control can become sluggish because these
user events are deferred after the rendering begins in the pending buffer.
Some hardcore game players will even turn off vertical sync and accept
tearing in order to minimize latency [1329].
In theory, more than three buffers could be used. If the amount of time
to compute a frame varies considerably, more buffers give more balance
and an overall higher display rate, at the cost of more potential latency.
To generalize, multibuffering can be thought of as a ring structure. There
is a rendering pointer and a display pointer, each pointing at a different
buffer. The rendering pointer leads the display pointer, moving to the next
buffer when the current rendering buffer is done being computed. The only
rule is that the display pointer should never be the same as the rendering
pointer.
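The ring structure can be sketched as follows. The class and method names are our own, and the stall behavior is simplified to a single check; real drivers queue and wait rather than raise:

```python
class MultiBufferRing:
    """N-buffer ring with a rendering pointer that leads a display pointer."""
    def __init__(self, n):
        assert n >= 2
        self.n = n
        self.render = 1   # buffer currently being rendered into
        self.display = 0  # buffer currently shown

    def finish_frame(self):
        # Advance the rendering pointer, but never onto the displayed buffer,
        # which is the one rule of multibuffering.
        nxt = (self.render + 1) % self.n
        if nxt == self.display:
            raise RuntimeError("would overwrite the displayed buffer; must wait")
        self.render = nxt

    def vsync(self):
        # At retrace, the display pointer chases the rendering pointer.
        nxt = (self.display + 1) % self.n
        if nxt != self.render:
            self.display = nxt

ring = MultiBufferRing(3)  # triple buffering
ring.finish_frame()        # the pending buffer's image is completed
ring.vsync()               # the display moves on at the next retrace
print(ring.display, ring.render)
```

With n = 2 this reduces to double buffering, where `finish_frame` immediately stalls until `vsync` has run; with n = 3 or more, rendering can proceed into the next buffer while waiting for retrace.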
A related method of achieving additional acceleration for PC graphics
accelerators is to use SLI mode. Back in 1998 3dfx used SLI as an acronym
for scanline interleave, where two graphics chipsets run in parallel, one
handling the odd scanlines, the other the even. NVIDIA (who bought 3dfx’s
assets) uses this abbreviation for an entirely different way of connecting two
(or more) graphics cards, called scalable link interface. ATI/AMD calls it
CrossFire X. This form of parallelism divides the work by either splitting
the screen into two (or more) horizontal sections, one per card, or by having
each card fully render its own frame, alternating output. There is also a
mode that allows the cards to accelerate antialiasing of the same frame.
The most common use is having each GPU render a separate frame, called
alternate frame rendering (AFR). While this scheme sounds as if it should
increase latency, it can often have little or no effect. Say a single-GPU
system renders at 10 frames per second (fps). If the GPU is the bottleneck,
two GPUs using AFR could render at 20 fps, or even four at 40 fps. Each
GPU takes the same amount of time to render its frame, so latency does
not necessarily change.
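The timing argument above can be made concrete with a little arithmetic. This sketch assumes the GPU is the only bottleneck and that the GPUs start their frames staggered; the function name is ours:

```python
def afr_timing(gpu_frame_ms, num_gpus):
    """Frame interval and per-frame latency under alternate frame rendering.

    Assumes the GPU is the sole bottleneck and frames start staggered.
    """
    interval = gpu_frame_ms / num_gpus  # a finished frame appears this often
    latency = gpu_frame_ms              # each frame still takes this long
    return interval, latency

# One GPU at 100 ms/frame is 10 fps; two GPUs under AFR deliver a frame
# every 50 ms (20 fps), yet each frame still took 100 ms to produce.
print(afr_timing(100.0, 2))
```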
18.1.4 Stereo and Multi-View Graphics
In stereo rendering, two images are used in order to make objects look more
three dimensional. With two eyes, the visual system takes two views, and
in combining them, retrieves depth information. This ability is called stere-
opsis, stereo vision [349, 408], or binocular parallax. The idea behind stereo
vision is to render two images, one for the left eye and one for the right eye
(as shown in Figure 18.5), and then use some technique that ensures that
the human viewer experiences a depth in the rendered image. These two
images are called the stereo pair. One common method for creating stereo
vision is to generate two images, one in red and one in green (or cyan, by
rendering to both the blue and green channels), composite these images,
then view the result with red-green glasses. In this case, only a normal sin-
gle display buffer is needed, but display of color is problematic. For color
images, the solution can be as simple as having two small screens, one in
Figure 18.5. Stereo rendering. Note that the image plane is shared between the two
frustums, and that the frustums are asymmetric. Such rendering would be used on, for
example, a stereo monitor. Shutter glasses could use separate image planes for each eye.
front of each (human) eye, in a head-mounted display. Another hardware
solution is the use of shutter glasses, in which only one eye is allowed to
view the screen at a time. Two different views can be displayed by rapidly
alternating between eyes and synchronizing with the monitor [210].
For these latter forms of stereography, two separate display buffers are
needed. When viewing is to take place in real time, double buffering is
used in conjunction with stereo rendering. In this case, there have to be
two front and two back buffers (one set for each eye), so the color buffer
memory requirement doubles.
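The asymmetric frustums of Figure 18.5 can be computed by sliding a symmetric frustum sideways on the near plane so that both frustums share the image plane at the convergence distance. This sketch follows the common off-axis approach; the function name and the parameter values in the example are illustrative:

```python
import math

def stereo_frustum(fov_y_deg, aspect, near, convergence, eye_sep, eye):
    """Left/right/bottom/top of the near plane for one eye.

    eye = -1 for the left eye, +1 for the right eye. The image plane is
    placed at the convergence distance and is shared by the two frustums.
    """
    top = near * math.tan(math.radians(fov_y_deg) / 2.0)
    bottom = -top
    half_w = top * aspect
    # Shift the frustum sideways by half the eye separation, scaled down
    # to the near plane; this is what makes the frustums asymmetric.
    offset = -eye * (eye_sep / 2.0) * near / convergence
    return -half_w + offset, half_w + offset, bottom, top

l_frustum = stereo_frustum(60.0, 16 / 9, 0.1, 5.0, 0.065, eye=-1)
r_frustum = stereo_frustum(60.0, 16 / 9, 0.1, 5.0, 0.065, eye=+1)
print(l_frustum[0], r_frustum[0])  # the left clip planes differ per eye
```

The two results are mirror images of each other, which is why a single symmetric frustum shifted in opposite directions suffices for the stereo pair.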
Other technologies that provide full color without the need for glasses
are possible. Such displays are therefore often called autostereoscopic [267].
The basic idea is to modify the display surface in some way. One technique
is covering the LCD with a plastic sheet of vertical (or near vertical) lenses
(half cylinders) or prisms that refract the light so that alternating pixel
columns are directed toward the left and right eyes. These displays are
called lenticular displays. Another mechanism is to place black vertical
stripes (which may be implemented as another LCD) a small distance in
front of the LCD to mask out alternating columns from each eye. These
are called parallax barrier displays.
So far, only stereo rendering has been considered, where two views are
displayed. However, it is possible to build displays with many more views as
well. There are commercial displays with nine views, and research displays
with more than 80 views. In general, these displays are often called multi-
view displays. The idea is that such displays also can provide another type
of depth cue called motion parallax. This means that the human viewer
can move the head to, say, the left, in order to look “around a corner.” In
addition, it makes it easier for more than one person to look at the display.
Systems for three-dimensional TV and video have been built, and stan-
dardization work is proceeding for defining how TV and video signals will
be sent for a wide variety of displays. The transition from black-and-white
TVs to color was a large step forward for TV, and it may be that the
transition to three-dimensional TV could be as large. When rendering to
these displays, it is simple to use brute force techniques: Either just ren-
der each image in sequence, or put a number of graphics cards into the
computer, and let each render one image. However, there is clearly a lot
of coherency among these images, and by rendering a triangle to all views
simultaneously and using a sorted traversal order, texture cache content
can be exploited to a great extent, which speeds up rendering [512].
18.1.5 Buffer Memory
Here, a simple example will be given on how much memory is needed
for the different buffers in a graphics system. Assume that we have a
color buffer of 1280 × 1024 pixels with true colors, i.e., 8 bits per color
channel. With 32 bits per color, this would require 1280 × 1024 × 4 bytes = 5
megabytes (MB). Using double buffering doubles this value to 10 MB. Also,
let us say that the Z-buffer has 24 bits per pixel and the stencil buffer 8
bits per pixel (these are usually paired to form a 32-bit word). The Z-buffer
and stencil buffer would then need 5 MB of memory. This system would
therefore require 10 + 5 = 15 MB of memory for this fairly minimal set
of buffers. Stereo buffers would double the color buffer size. Note that
under all circumstances, only one Z-buffer and stencil buffer are needed,
since at any moment they are always paired with one color buffer active
for rendering. When using supersampling or multisampling techniques to
improve quality, the amount of buffer memory increases further. Using,
say, four samples per pixel increases most buffers by a factor of four.
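The arithmetic above generalizes easily. A small helper, with names of our own choosing:

```python
def buffer_memory_mb(width, height, color_bytes=4, depth_stencil_bytes=4,
                     color_buffers=2, samples=1):
    """Total buffer memory in MB (1 MB = 2**20 bytes).

    color_buffers counts front + back buffers (add one for a pending
    buffer, or double for stereo); only one paired depth/stencil buffer
    is ever needed, regardless of the number of color buffers.
    """
    pixels = width * height * samples
    color = pixels * color_bytes * color_buffers
    depth = pixels * depth_stencil_bytes
    return (color + depth) / 2**20

print(buffer_memory_mb(1280, 1024))             # the example above: 15.0 MB
print(buffer_memory_mb(1280, 1024, samples=4))  # with 4x multisampling: 60.0 MB
```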
18.2 Perspective-Correct Interpolation
The fundamentals of how perspective-correct interpolation is done in a
rasterizer will briefly be described here. This is important, as it forms
the basis of how rasterization is done so that textures look and behave
correctly on primitives. As we have seen, each primitive vertex, v, is per-
spectively projected using any of Equations 4.68–4.70. A projected vertex,
p = (p_x w, p_y w, p_z w, w), is obtained. We use w = p_w here to simplify the
presentation. After division by w we obtain (p_x, p_y, p_z, 1). Recall that
−1 ≤ p_z ≤ 1 for the OpenGL perspective transform. However, the stored
z-value in the Z-buffer is in [0, 2^b − 1], where b is the number of bits in
the Z-buffer. This is achieved with a simple translation and scale of p_z.
Also, each vertex may have a set of other parameters associated with it,
e.g., texture coordinates (u, v), fog, and color, c.
The screen position (p_x, p_y, p_z) can be correctly interpolated linearly
over the triangle, with no need for adjustment. In practice, this is often
done by computing the delta slopes Δz/Δx and Δz/Δy. These slopes represent
how much the p_z value differs between two adjacent pixels in the x- and
y-directions, respectively. Only a simple addition is then needed to update
the p_z value when moving from one pixel to its neighbor. However, it is
important to realize that colors and especially texture coordinates cannot
normally be interpolated linearly; doing so gives improper foreshortening
of the perspective effect. See Figure 18.6 for
a comparison. To solve this, Heckbert and Moreton [521] and Blinn [103]
show that 1/w and (u/w, v/w) can be linearly interpolated. Then the in-
terpolated texture coordinates are divided by the interpolated 1/w to get
the correct texture location. That is, (u/w, v/w)/(1/w)=(u, v). This type
of interpolation is called hyperbolic interpolation, because a graph of the
interpolated function is a hyperbola.
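The perspective-correct scheme described above, linearly interpolating u/w, v/w, and 1/w in screen space and then dividing, can be sketched for a point between two projected vertices. The vertex data in the example is made up for illustration:

```python
def perspective_correct_uv(uv0, w0, uv1, w1, t):
    """Interpolate texture coordinates with hyperbolic interpolation.

    t is the linear screen-space fraction between two projected vertices
    with texture coordinates uv0, uv1 and clip-space w values w0, w1.
    """
    # Linearly interpolate u/w, v/w, and 1/w in screen space...
    inv_w = (1 - t) / w0 + t / w1
    u_over_w = (1 - t) * uv0[0] / w0 + t * uv1[0] / w1
    v_over_w = (1 - t) * uv0[1] / w0 + t * uv1[1] / w1
    # ...then divide by the interpolated 1/w to recover (u, v).
    return u_over_w / inv_w, v_over_w / inv_w

# Halfway across the screen is NOT halfway across the texture when one
# endpoint is much farther away (w1 > w0): the far half is foreshortened.
u, v = perspective_correct_uv((0.0, 0.0), 1.0, (1.0, 1.0), 4.0, 0.5)
print(u, v)  # ≈ (0.2, 0.2), not (0.5, 0.5)
```

Naive linear interpolation would return (0.5, 0.5) here, which is exactly the foreshortening error that Figure 18.6 illustrates.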