Figure 5.30. MSAA sampling patterns for ATI and NVIDIA graphics accelerators. The
green square is the location of the shading sample, and the red squares are the positional
samples computed and saved. From left to right: 2x, 4x, 6x (ATI), and 8x (NVIDIA)
sampling. (Generated by the D3D FSAA Viewer.)
MSAA is faster than a pure supersampling scheme because the fragment
is shaded only once. It focuses effort on sampling the fragment’s pixel cov-
erage at a higher rate and sharing the computed shade. However, actually
storing a separate color and z-depth for each sample is usually unnecessary.
CSAA takes advantage of this observation by storing just the coverage for
the fragment at a higher sampling rate. Each subpixel stores an index to
the fragment with which it is associated. A table with a limited number
of entries (four or eight) holds the color and z-depth associated with each
fragment. For example, for a table holding four colors and z-depths, each
subpixel needs only two bits of storage to index into this table. In theory
a pixel with 16 samples could hold 16 different fragments, in which case
CSAA would be unable to properly store the information needed, possibly
resulting in artifacts. For most data types it is relatively rare to have more
than four fragments that are radically different in shade visible in a pixel,
so this scheme performs well in practice.
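As a rough sketch of this storage layout, assume 16 coverage samples per pixel and a four-entry fragment table; with two bits per sample, all 16 indices fit in a single 32-bit word (the structure and field names below are illustrative, not an actual hardware format):

    // Sketch of CSAA-style pixel storage: 16 coverage samples, each holding a
    // 2-bit index into a small table of shaded fragments.
    #include <cstdint>

    struct FragmentEntry {
        float color[4]; // RGBA shade, computed once per fragment
        float depth;    // z-depth stored with the fragment
    };

    struct CoveragePixel {
        FragmentEntry table[4]; // limited table of four fragments
        uint32_t indices;       // 16 samples x 2 bits = 32 bits

        // Which table entry does coverage sample i (0..15) refer to?
        int fragmentIndex(int i) const { return (indices >> (2 * i)) & 0x3; }

        // Associate coverage sample i with table entry f (0..3).
        void setFragmentIndex(int i, int f) {
            indices = (indices & ~(0x3u << (2 * i))) |
                      (uint32_t(f) << (2 * i));
        }
    };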
This idea of separating coverage from shading and depth storage is
similar to Carpenter’s A-buffer [157], another form of multisampling. This
algorithm is commonly used in software for generating high-quality render-
ings, but at noninteractive speeds. In the A-buffer, each polygon rendered
creates a coverage mask for each screen grid cell it fully or partially covers.
See Figure 5.31 for an example of a coverage mask.
Similar to MSAA, the shade for the polygon associated with this cov-
erage mask is typically computed once at the centroid location on the
fragment and shared by all samples. The z-depth is also computed and
stored in some form. Some systems compute a pair of depths, a minimum
and maximum. Others also retain the slope of the polygon, so that the
exact z-depth can be derived at any sample location, which allows inter-
penetrating polygons to be rendered properly [614]. The coverage mask,
shade, z-depth, and other information make up a stored A-buffer fragment.
A critical way that the A-buffer differs from the Z-buffer is that a
screen grid cell can hold any number of fragments at one time.
Figure 5.31. The corner of a polygon partially covers a single screen grid cell associ-
ated with a pixel. The grid cell is shown subdivided into a 4 × 4 subgrid, and those
cells that are considered covered are marked by a 1. The 16-bit mask for this cell is
0000 0111 1111 0111.
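Such a mask can be built in software by testing one point per subcell against the polygon's edges. The sketch below does this for a triangle using edge functions; the function names and the choice of subcell centers as test points are assumptions made for the example:

    #include <cstdint>

    struct Vec2 { float x, y; };

    // Signed area test for the edge from a to b; positive when (x, y) lies to
    // the left of the edge (counterclockwise winding assumed).
    static float edgeFunction(const Vec2& a, const Vec2& b, float x, float y) {
        return (b.x - a.x) * (y - a.y) - (b.y - a.y) * (x - a.x);
    }

    static bool insideTriangle(const Vec2& v0, const Vec2& v1, const Vec2& v2,
                               float x, float y) {
        return edgeFunction(v0, v1, x, y) >= 0.0f &&
               edgeFunction(v1, v2, x, y) >= 0.0f &&
               edgeFunction(v2, v0, x, y) >= 0.0f;
    }

    // Build a 16-bit coverage mask for the pixel whose lower-left corner is
    // (px, py), testing the center of each cell of a 4 x 4 subgrid.
    uint16_t coverageMask4x4(const Vec2& v0, const Vec2& v1, const Vec2& v2,
                             float px, float py) {
        uint16_t mask = 0;
        for (int j = 0; j < 4; ++j)
            for (int i = 0; i < 4; ++i) {
                float x = px + (i + 0.5f) * 0.25f;
                float y = py + (j + 0.5f) * 0.25f;
                if (insideTriangle(v0, v1, v2, x, y))
                    mask |= uint16_t(1u << (j * 4 + i));
            }
        return mask;
    }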
As they collect, fragments can be discarded if they are hidden. For example, if an
opaque fragment A has a coverage mask that fully covers fragment B, and
fragment A has a maximum z-depth less than the minimum z-depth of
B, then fragment B can be safely discarded. Coverage masks can also be
merged and used together. For example, if one opaque fragment covers one
part of a pixel and another opaque fragment covers another, their masks can
be logically ORed together and the larger of their maximum z-depths used
to form a larger area of coverage. Because of this merging mechanism,
fragments are often sorted by z-depth. Depending on the design, such
merging can happen when a fragment buffer becomes filled, or as a final
step before shading and display.
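In code, the discard and merge tests described above might look roughly like the following; the fragment layout is a simplified assumption (one shade, a min/max depth range, and a 16-bit mask), and how merged shades are combined is left unspecified:

    #include <algorithm>
    #include <cstdint>

    // A simplified stored A-buffer fragment for this sketch.
    struct AFragment {
        uint16_t mask;    // coverage mask for a 4 x 4 subgrid
        float color[3];   // shade computed once for the fragment
        float zmin, zmax; // conservative depth range over the covered area
        bool opaque;
    };

    // Fragment b can be discarded if opaque fragment a covers every sample of
    // b and a's maximum depth is closer than b's minimum depth.
    bool canDiscard(const AFragment& a, const AFragment& b) {
        bool coversAll = (a.mask & b.mask) == b.mask;
        return a.opaque && coversAll && a.zmax < b.zmin;
    }

    // Merge two opaque fragments covering different parts of the pixel into a
    // larger area of coverage: the masks are ORed and the larger of the
    // maximum z-depths is kept. (Combining the shades is not specified here.)
    AFragment mergeCoverage(const AFragment& a, const AFragment& b) {
        AFragment m = a;
        m.mask |= b.mask;
        m.zmin = std::min(a.zmin, b.zmin);
        m.zmax = std::max(a.zmax, b.zmax);
        return m;
    }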
Once all polygons have been sent to the A-buffer, the color to be stored
in the pixel is computed. This is done by determining how much of the
mask of each fragment is visible, and then multiplying this percentage
by the fragment’s color and summing the results. See Figure 5.18 for an
example of multisampling hardware in use. Transparency effects, one of
the A-buffer’s strengths, are also folded in at this time.
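A sketch of this resolve step, assuming each fragment has already been reduced to the subset of its mask that is actually visible (front-to-back processing and transparency handling are omitted for brevity):

    #include <cstdint>
    #include <vector>

    struct VisibleFragment {
        uint16_t visibleMask; // samples where this fragment is frontmost
        float color[3];
    };

    // Weight each fragment's color by the fraction of the pixel it visibly
    // covers (number of set mask bits out of 16) and sum the results.
    void resolvePixel(const std::vector<VisibleFragment>& frags, float out[3]) {
        out[0] = out[1] = out[2] = 0.0f;
        for (const VisibleFragment& f : frags) {
            int visible = 0;
            for (int i = 0; i < 16; ++i)
                visible += (f.visibleMask >> i) & 1;
            float fraction = visible / 16.0f;
            for (int c = 0; c < 3; ++c)
                out[c] += fraction * f.color[c];
        }
    }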
Though this sounds like an involved procedure, many of the mecha-
nisms for rendering a triangle into a Z-buffer can be reused to implement
the A-buffer in hardware [1362]. Storage and computation requirements are
usually considerably lower than for simple FSAA, as the A-buffer works by
storing a variable number of fragments for each pixel. One problem with
this approach is that there is no upper limit on the number of semitrans-
parent fragments stored. Jouppi & Chang [614] present the Z³ algorithm,
which uses a fixed number of fragments per pixel.
Figure 5.32. A typical jitter pattern for three pixels, each divided into a 3 × 3 set of
subcells. One sample appears in each subcell, in a random location.
Merging is forced to occur when the storage limit is reached. Bavoil et al. [74, 77] generalize
this approach in their k-buffer architecture. Their earlier paper also has a
good overview of the extensive research that has been done in the field.
While all of these antialiasing methods result in better approximations
of how each polygon covers a grid cell, they have some limitations. As
discussed in the previous section, a scene can be made of objects that are
arbitrarily small on the screen, meaning that no sampling rate can ever
perfectly capture them. So, a regular sampling pattern will always exhibit
some form of aliasing. One approach to avoiding aliasing is to distribute
the samples randomly over the pixel, with a different sampling pattern
at each pixel. This is called stochastic sampling, and the reason it works
better is that the randomization tends to replace repetitive aliasing effects
with noise, to which the human visual system is much more forgiving [404].
The most common kind of stochastic sampling is jittering, a form of
stratified sampling, which works as follows. Assume that n samples are
to be used for a pixel. Divide the pixel area into n regions of equal area,
and place each sample at a random location in one of these regions. (This
is in contrast to accumulation buffer screen offsets, which can allow random
sampling of a sort, but each pixel has the same sampling pattern.) See
Figure 5.32. The final pixel color is computed by averaging the samples.
N-rooks sampling is another form of stratified sampling, in
which n samples are placed in an n × n grid, with one sample per row and
column [1169]. Incidentally, a form of N-rooks pattern is used in the Infinite-
Reality [614, 899], as it is particularly good for capturing nearly horizontal
and vertical edges. For this architecture, the same pattern is used per pixel
and is subpixel-centered (not randomized within the subpixel), so it is not
performing stochastic sampling.
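Both patterns are straightforward to generate. The sketch below produces a jittered n × n pattern and an N-rooks pattern for a single pixel, with sample positions in [0, 1)²; the function names are invented for the example:

    #include <algorithm>
    #include <numeric>
    #include <random>
    #include <vector>

    struct Sample { float x, y; };

    // Jittered (stratified) sampling: divide the pixel into n x n equal
    // subcells and place one sample at a random position inside each subcell.
    std::vector<Sample> jitteredPattern(int n, std::mt19937& rng) {
        std::uniform_real_distribution<float> uniform(0.0f, 1.0f);
        std::vector<Sample> samples;
        samples.reserve(n * n);
        for (int j = 0; j < n; ++j)
            for (int i = 0; i < n; ++i)
                samples.push_back({ (i + uniform(rng)) / n,
                                    (j + uniform(rng)) / n });
        return samples;
    }

    // N-rooks sampling: n samples on an n x n grid, exactly one per row and
    // one per column, obtained by shuffling the column assigned to each row.
    std::vector<Sample> nRooksPattern(int n, std::mt19937& rng) {
        std::vector<int> column(n);
        std::iota(column.begin(), column.end(), 0);
        std::shuffle(column.begin(), column.end(), rng);
        std::vector<Sample> samples;
        samples.reserve(n);
        for (int row = 0; row < n; ++row)
            samples.push_back({ (column[row] + 0.5f) / n, (row + 0.5f) / n });
        return samples;
    }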
AT&T Pixel Machines and Silicon Graphics’ VGX, and more recently
ATI’s SMOOTHVISION scheme, use a technique called interleaved sam-
pling. In ATI’s version, the antialiasing hardware allows up to 16 samples
per pixel, and up to 16 different user-defined sampling patterns that can be
intermingled in a repeating pattern (e.g., in a 4 × 4 pixel tile, with a
different pattern in each).
Figure 5.33. On the left, antialiasing with the accumulation buffer, with four samples
per pixel. The repeating pattern gives noticeable problems. On the right, the sampling
pattern is not the same at each pixel, but instead patterns are interleaved. (Images
courtesy of Alexander Keller and Wolfgang Heidrich.)
The sampling pattern is not different for every
pixel cell, as is done with a pure jittering scheme. However, Molnar [894],
as well as Keller and Heidrich [642], found that using interleaved stochas-
tic sampling minimizes the aliasing artifacts formed when using the same
pattern for every pixel. See Figure 5.33. This technique can be thought of
as a generalization of the accumulation buffer technique, as the sampling
pattern repeats, but spans several pixels instead of a single pixel.
There are still other methods of sampling and filtering. One of the
best sampling patterns known is Poisson disk sampling, in which nonuni-
formly distributed points are separated by a minimum distance [195, 260].
Molnar presents a scheme for real-time rendering in which unweighted
samples are arranged in a Poisson disk pattern with a Gaussian filtering
kernel [894].
One real-time antialiasing scheme that lets samples affect more than one
pixel is NVIDIA’s older Quincunx method [269], also called high resolution
antialiasing (HRAA). “Quincunx” means an arrangement of five objects,
four in a square and the fifth in the center, such as the pattern of five dots
on a six-sided die. In this multisampling antialiasing scheme, the sampling
pattern is a quincunx, with four samples at the pixel cell’s corners and
one sample in the center (see Figure 5.29). Each corner sample value is
distributed to its four neighboring pixels. So instead of weighting each
sample equally (as most other real-time schemes do), the center sample is
given a weight of 1/2, and each corner sample has a weight of 1/8. Because of
this sharing, an average of only two samples are needed per pixel for the
Quincunx scheme, and the results are considerably better than two-sample
FSAA methods. This pattern approximates a two-dimensional tent filter,
which, as discussed in the previous section, is superior to the box filter.
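A sketch of this reconstruction, assuming the center samples are stored one per pixel and the corner samples on a shared (width + 1) × (height + 1) grid; the 1/2 and 1/8 weights are those given above:

    #include <vector>

    struct Color { float r, g, b; };

    // Quincunx reconstruction for pixel (x, y): its own center sample gets a
    // weight of 1/2, and the four corner samples (shared with the neighboring
    // pixels) each get a weight of 1/8.
    Color quincunxFilter(const std::vector<Color>& centers, // width * height
                         const std::vector<Color>& corners, // (width+1) * (height+1)
                         int width, int x, int y) {
        const Color& c   = centers[y * width + x];
        const Color& c00 = corners[ y      * (width + 1) + x    ];
        const Color& c10 = corners[ y      * (width + 1) + x + 1];
        const Color& c01 = corners[(y + 1) * (width + 1) + x    ];
        const Color& c11 = corners[(y + 1) * (width + 1) + x + 1];
        Color out;
        out.r = 0.5f * c.r + 0.125f * (c00.r + c10.r + c01.r + c11.r);
        out.g = 0.5f * c.g + 0.125f * (c00.g + c10.g + c01.g + c11.g);
        out.b = 0.5f * c.b + 0.125f * (c00.b + c10.b + c01.b + c11.b);
        return out;
    }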
While the Quincunx method itself appears to have died out, some newer
GPUs are again using sharing of a single sample’s results among pixels
for higher quality signal reconstruction.
Figure 5.34. To the left, the RGSS sampling pattern is shown. This costs four samples
per pixel. By moving these out to the pixel edges, sample sharing can occur across
edges. However, for this to work out, every other pixel must have a reflected sample
pattern, as shown on the right. The resulting sample pattern is called FLIPQUAD and
costs two samples per pixel.
For example, the ATI Radeon
HD 2000 introduced custom filter antialiasing (CFAA) [1166], with the
capabilities of using narrow and wide tent filters that extend slightly into
other pixel cells. It should be noted that for results of the highest visual
quality, samples should be gathered from a much larger area, such as a 4 × 4
or even 5 × 5 pixel grid [872]; at the time of writing, Sun Microsystems’
defunct XVR-4000 is the only graphics hardware to perform this level of
filtering in real time [244].
It should be noted that sample sharing is also used for antialiasing in
the mobile graphics context [13]. The idea is to combine the clever idea
of sample sharing in Quincunx with RGSS (Figure 5.29). This results
in a pattern, called FLIPQUAD, which costs only two samples per pixel,
and with similar quality to RGSS (which costs four samples per pixel).
The sampling pattern is shown in Figure 5.34. Other inexpensive sampling
patterns that exploit sample sharing are explored by Hasselgren et al. [510].
Molnar presents a sampling pattern for performing adaptive refine-
ment [83]. This is a technique where the image improves over time by
increasing the number of samples taken. This scheme is useful in interac-
tive applications. For example, while the scene is changing, the sampling
rate is kept low; when the user stops interacting and the scene is static,
the image improves over time by having more and more samples taken and
the intermediate results displayed. This scheme can be implemented using
an accumulation buffer that is displayed after certain numbers of samples
have been taken.
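A minimal sketch of such a refinement loop, assuming the application re-renders the static scene with a new subpixel offset each pass and folds the result into a running average (the class and method names are illustrative):

    #include <vector>

    // Accumulation-buffer style adaptive refinement: while the scene is
    // static, each new jittered rendering is added to a running sum, and the
    // averaged result is what gets displayed.
    struct Accumulator {
        std::vector<float> sum; // running sum of samples, 3 floats per pixel
        int passes = 0;

        void reset(size_t pixelCount) {
            sum.assign(pixelCount * 3, 0.0f);
            passes = 0;
        }

        // newFrame holds the latest rendering (same size as sum).
        void accumulate(const std::vector<float>& newFrame) {
            for (size_t i = 0; i < sum.size(); ++i)
                sum[i] += newFrame[i];
            ++passes;
        }

        // Current estimate for display; valid once at least one pass is done.
        float resolve(size_t i) const {
            return passes > 0 ? sum[i] / passes : 0.0f;
        }
    };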
The rate of sampling can be varied depending on the scene information
being sampled. For example, in MSAA the focus is on increasing the num-
ber of samples to improve edge detection while computing shading samples
at lower rates. Persson [1007] discusses how the pixel shader can be used to
improve quality on a per-surface basis by gathering additional texture sam-