Chapter 5

Turning Vertices into Pixels

In the previous chapters we learned the basic algorithms for turning a specification of a three-dimensional scene (vertices, polygons, etc.) into a picture on the screen. In the following sections we show in detail how a geometric primitive becomes a region of colored pixels, along with several optimizations commonly used to make this an efficient operation.

5.1 Rasterization

Rasterization comes into play after the vertex positions are expressed in viewport space (see Section 4.6.4). Unlike all the previous reference systems (in order: object space, world space, view space, NDC) this one is discrete and we have to decide how many fragments belong to the primitive and which discrete coordinates have to be assigned to the corresponding fragments.

5.1.1 Lines

What is called a line in CG is actually a segment specified by its two end-points (x0, y0) and (x1, y1); for the sake of simplicity we assume that the coordinates are integers. The segment can be expressed as:

y = y0 + m (x − x0),    m = (y1 − y0) / (x1 − x0)          (5.1)

From Equation (5.1) we can see that if x is incremented to x + 1, y is incremented to y + m. Note that if −1 ≤ m ≤ 1 and we round the values of y to the closest integer for all the values of x between x0 and x1, we obtain a series of pixels such that there is exactly one pixel for each column and two consecutive pixels are always adjacent, as shown in Figure 5.1. If m > 1 or m < −1 we can invert the roles of x and y and write the equation as:

x = x0 + (1/m) (y − y0)          (5.2)

Figure 5.1

Digital differential analyzer algorithm examples.

and compute the values of x for y going from y0 to y1. This algorithm is called the digital differential analyzer (DDA) and is reported in Listing 5.1.

DDARasterizer (x0, y0, x1, y1) {
 float m = (y1 - y0) / (x1 - x0);
 if ((m >= -1) && (m <= 1)) {
  Δy = m;
  Δx = 1;
 }
 else {
  Δy = 1;
  Δx = 1 / m;
 }

 x = x0; y = y0;

 // assuming x0 <= x1 and y0 <= y1; the other cases are symmetric
 do {
  OutputPixel(round(x), round(y));
  x = x + Δx;
  y = y + Δy;
 } while ((x <= x1) && (y <= y1));
}

LISTING 5.1: Digital differential analyzer (DDA) rasterization algorithm.

The weakness of the DDA rasterization is that Δx and Δy are fractional numbers, which both slows down the computation and, since the floating-point representation has finite precision, causes error accumulation.

Bresenham’s algorithm solves both problems with a formulation that involves only integer numbers. Figure 5.2 shows four lines with different slopes, all passing through the same pixel.

Figure 5.2

Bresenham’s algorithm. Schematization.

Let us consider the case 0 ≤ m ≤ 1, assume that we are performing the DDA rasterization and that we have just output pixel (i, j), so we are now computing the y value for i + 1. We can see that:

if j + Δy ≥ j + 1/2 → OutputPixel(i+1, j+1)          (5.3)
if j + Δy < j + 1/2 → OutputPixel(i+1, j)

or, stated in other words, if the line passes above (i + 1, j + 1/2) the next pixel will be (i + 1, j + 1), otherwise it will be (i + 1, j). It is useful to write the line equation

y = (Δy/Δx) x + c          (5.4)

with the implicit formulation:

Δy x − Δx y + Δx c = 0.          (5.5)

For compactness we define the function F(x, y) = Δy x − Δx y + Δx c. Note that

F(x, y) = 0  if (x, y) belongs to the line
F(x, y) > 0  if (x, y) is below the line          (5.6)
F(x, y) < 0  if (x, y) is above the line

We define the variable Di = F(i+1, j+1/2), called the decision variable, and replace the condition in Equation (5.3) to get the following relations:

if Di ≥ 0 → OutputPixel(i+1, j+1)          (5.7)
if Di < 0 → OutputPixel(i+1, j)

Up to now we are still dealing with the DDA rasterization; we have just written the formulas in a slightly different way. The good news is that, as shown in the table in Figure 5.2, the decision variable for i + 1 depends only on the decision variable for i. Let us have a look at how the decision variable changes between two consecutive pixels:

ΔDi = Di+1 − Di = { Δy − Δx  if Di ≥ 0;  Δy  if Di < 0 }          (5.8)

This means that we do not have to recompute D for each value of x; we can just update its value at each step. Note that even though all the ΔDi’s are integers, the first value of F is still a fraction:

F(i+1, j+1/2) = Δy (i+1) − Δx (j+1/2) + Δx c          (5.9)

However, since we only use the sign of F, we can simply multiply both F and the ΔDi by 2 and obtain integer-only operations:

F(i, j) = 2 Δy i − 2 Δx j + 2 Δx c          (5.10)

So, if our line starts from pixel (i0, j0) and ends at pixel (i1, j1) we can compute the initial value of the decision variable:

F(i0+1, j0+1/2) = 2 Δy (i0+1) − 2 Δx (j0+1/2) + 2 Δx c
                = (2 Δy i0 − 2 Δx j0 + 2 Δx c) + 2 Δy − Δx
                = F(i0, j0) + 2 Δy − Δx = 0 + 2 Δy − Δx

Note that, according to Equation (5.6), F(i0, j0) = 0 because (i0, j0) belongs to the line. This allows us to write a rasterization algorithm that uses integer coordinates only and simple updates of the decision variable, as shown in Listing 5.2. Since Δy and Δx are related to the slope of the line we can simply set them to the height and the width of the segment, respectively.

BresenhamRasterizer (i0, j0, i1, j1) {
 int Δy = j1 - j0;
 int Δx = i1 - i0;
 int D = 2 * Δy - Δx;
 int i = i0;
 int j = j0;
 OutputPixel (i, j);
 while (i < i1)
 {
  if (D >= 0) {
   i = i + 1;
   j = j + 1;
   D = D + 2 * (Δy - Δx);
  }
  else {
   i = i + 1;
   D = D + 2 * Δy;
  }
  OutputPixel(i, j);
 }
}

LISTING 5.2: Bresenham rasterizer for the case of slope between 0 and 1. All the other cases can be written taking into account the symmetry of the problem.
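To see the numbers in action, here is a direct transcription of Listing 5.2 into JavaScript (the language used for the clients of this book), with dy and dx in place of Δy and Δx; the function name and the test segment are our own and not part of any client:

function bresenham(i0, j0, i1, j1, outputPixel) {
 var dy = j1 - j0;     // height of the segment
 var dx = i1 - i0;     // width of the segment
 var D = 2 * dy - dx;  // initial value of the decision variable
 var i = i0, j = j0;
 outputPixel(i, j);
 while (i < i1) {
  if (D >= 0) { j = j + 1; D = D + 2 * (dy - dx); }
  else { D = D + 2 * dy; }
  i = i + 1;
  outputPixel(i, j);
 }
}

For example, bresenham(0, 0, 4, 2, ...) outputs (0,0), (1,1), (2,1), (3,2), (4,2): exactly one pixel per column, with consecutive pixels always adjacent.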

5.1.2 Polygons (Triangles)

Rasterizing a polygon is usually referred to as polygon filling. We will not go very deep into technicalities about general polygons because usually polygons are tessellated into triangles before being rasterized.

5.1.2.1 General Polygons

The scanline algorithm fills a polygon by proceeding line by line, from the bottom to the top and from left to right within each line, outputting the pixels inside the polygon, just like an inkjet printer. This is done by finding the intersections between each scanline and the edges of the polygon and sorting them by their x values (ascending). Then the interior pixels are found by the parity test: at every odd intersection we are going from outside to inside the polygon, at every even intersection we are going from inside to outside.

To make this very simple algorithm actually work we must consider the following requirement for the filling: if two polygons share the same edge, no pixel should be output by both; that is, there must be no overlapping.

The basic problem is due to intersections at integer values. Referring to Figure 5.3, consider the scanline passing through vertex C and the scanline passing through E. Both vertices lie on a scanline, but while the pixel at C is considered part of the polygon, the pixel at E is not. This is because, in the case of an intersection with integer coordinates, the scanline algorithm chooses to consider the pixel inside if it is an odd intersection (entering) and outside otherwise. This guarantees that if a non-horizontal edge is shared by two polygons, all the pixels crossed by the edge will belong to only one of the two.

Figure 5.3

Scanline algorithm for polygon filling.

The same kind of choice is made for horizontal edges: note that the pixels on the horizontal edge AB are considered inside while those on FG are not. Finally, note that the intersections with the edge DH and the edge DF coincide with the vertex D but must be considered separately, the first as an even/exiting intersection, the second as an odd/entering intersection. Note that this strategy would cause a problem with vertex E, which instead should be counted only as an even/exiting intersection. The problem is solved by not considering the maximum y coordinate of an edge in the intersection test, which means that the edge IE does not intersect scanline 4.

The intersection tests of the scanlines with the edges are not computed from scratch for each scanline. Instead, the same considerations used for the DDA rasterizer apply: if a segment is expressed as y = m x + b, that is, x = (y − b) (1/m), the intersection point moves by 1/m at every scanline.
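The following is a minimal sketch of the filling procedure just described, in JavaScript; the polygon representation (an array of {x, y} vertices) and the function names are our own. For clarity it recomputes the intersections at every scanline instead of updating them incrementally by 1/m as a real implementation would:

function scanlineFill(poly, outputPixel) {
 var ys = poly.map(function (p) { return p.y; });
 var yStart = Math.ceil(Math.min.apply(null, ys));
 var yEnd = Math.floor(Math.max.apply(null, ys));
 for (var y = yStart; y <= yEnd; y++) {
  var xs = [];
  for (var i = 0; i < poly.length; i++) {
   var a = poly[i], b = poly[(i + 1) % poly.length];
   if (a.y === b.y) continue;              // horizontal edges are skipped
   var yMin = Math.min(a.y, b.y), yMax = Math.max(a.y, b.y);
   // the maximum y coordinate is excluded, as discussed for vertex E
   if (y < yMin || y >= yMax) continue;
   xs.push(a.x + (y - a.y) * (b.x - a.x) / (b.y - a.y));
  }
  xs.sort(function (u, v) { return u - v; });
  for (var k = 0; k + 1 < xs.length; k += 2)  // odd-even (parity) spans
   for (var x = Math.ceil(xs[k]); x < xs[k + 1]; x++)
    outputPixel(x, y);
 }
}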

5.1.2.2 Triangles

Modern graphics hardware is massively parallel and designed to process pixels in groups, but if we consider the rasterization algorithms seen so far we can notice that they are quite sequential, in that they process one pixel after the other and update quantities incrementally. The scanline algorithm for polygon filling works for every type of polygon (even non-convex or with holes) but proceeds line after line.

A more parallel approach is just to give up on incremental computation and test a group of pixels in parallel to see if they are inside or outside the polygon. As will be made clear in a moment, this test can be done efficiently for convex polygons without holes. The only thing we need is to be able to tell whether a point is inside a triangle or not.

A useful characteristic of convex polygons, and hence of triangles, is that they can be seen as the intersection of a finite number of halfspaces. If, for each edge of the triangle, we take the line passing through it, one of the two halfspaces will include the inside of the triangle, as shown in Figure 5.4.

Figure 5.4

Any convex polygon can be expressed as the intersection of the halfspaces built on the polygon edges.

Let us consider again the implicit formulation of the line used in Equation (5.6). In this case, we can think of the line as dividing the 2D space into two halfspaces, one where the function F(.) is positive and one where it is negative. In principle we could just find the implicit definition for each of the three lines, but then we are left with an ambiguity: how do we know, for a given line, if the points inside the triangle are those in the positive or in the negative halfspace? Just to make it clearer, note that if F is an implicit formulation of a line, −F is one as well.

What we need is a conventional way to define the implicit formulation that depends on the order in which we consider the two vertices defining the line. Consider the edge v0v1: the line passing through it can be written as:

[x, y]T = [x0, y0]T + t ([x1, y1]T − [x0, y0]T)          (5.11)

so when t increases we move along the line in the direction (v1 − v0). We can solve these two equations for t, for any [x, y]T on the line (horizontal and vertical lines are not considered since they are trivial cases to treat):

t = (x − x0) / (x1 − x0)          (5.12)
t = (y − y0) / (y1 − y0)

hence:

(x − x0) / (x1 − x0) = (y − y0) / (y1 − y0)          (5.13)

upon which, after a few simplification steps and factoring out x and y, we obtain the so-called edge equation for the edge v0v1:

E01(x, y) = Δy x − Δx y + (y0 x1 − y1 x0) = 0          (5.14)

Note that if we consider the vector made of the x and y coefficients, [Δy, −Δx]T, we will find that it is orthogonal to the line direction [Δx, Δy]T (their dot product is 0, see Appendix B) and it indicates the halfspace where the implicit function is positive. To verify that this is true, we can simply evaluate E01 at the point p = v0 + [Δy, −Δx]T (see Figure 5.5):

E01(v0 + [Δy, −Δx]T) = Δy² + Δx² > 0

Figure 5.5

Edge equation explained.

In other words, if we walk along the line in the direction v1 − v0, the positive halfspace is on our right. Note that if we write the edge equation swapping the order of the two points we will find that E10 = −E01 and, again, the positive halfspace will be on our right (this time walking from v1 to v0). So if we specify the triangle as three vertices v0, v1, v2 given counterclockwise, it means that if we walk from v0 to v1, then from v1 to v2 and from v2 to v0, the space outside the triangle is on our right all the time. According to this, we can test if a point p is inside the triangle in this way:

p is inside T ⇔ (E01(p) ≤ 0) ∧ (E12(p) ≤ 0) ∧ (E20(p) ≤ 0)          (5.15)

This means that an algorithm for the rasterization of a triangle could be just to test all the pixels of the viewport against the condition (5.15), but that would not be very efficient.

A straightforward optimization consists of reducing the space of the pixels to test to those inside the smallest rectangle that includes the triangle, named the bounding rectangle. Another optimization consists of testing a square group of pixels, often named a stamp, to see if it is entirely inside (or outside) the triangle, in which case all the pixels in the stamp are surely inside (outside) the triangle. Figure 5.6 shows an example of a triangle with its bounding rectangle and some stamps: A, C and D save us the cost of testing their pixels individually, while stamp B is neither entirely outside nor entirely inside. In this case, we may use a second level of smaller stamps or simply test all the pixels individually. The number of levels of stamps, the size of the stamps, etc., are a matter of implementation and vary with the graphics hardware.
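A JavaScript sketch of this scheme, without the stamp hierarchy, may look as follows (the vertex format and the function names are our own). The function edge implements Equation (5.14), and the test in the inner loop is condition (5.15) restricted to the bounding rectangle:

function edge(a, b, p) {
 // E_ab(p): zero on the line through a and b, negative on the inner
 // side when the triangle is specified counterclockwise
 return (b.y - a.y) * p.x - (b.x - a.x) * p.y + (a.y * b.x - b.y * a.x);
}

function rasterizeTriangle(v0, v1, v2, outputPixel) {
 var xMin = Math.floor(Math.min(v0.x, v1.x, v2.x));
 var xMax = Math.ceil(Math.max(v0.x, v1.x, v2.x));
 var yMin = Math.floor(Math.min(v0.y, v1.y, v2.y));
 var yMax = Math.ceil(Math.max(v0.y, v1.y, v2.y));
 for (var y = yMin; y <= yMax; y++)
  for (var x = xMin; x <= xMax; x++) {
   var p = { x: x, y: y };
   if (edge(v0, v1, p) <= 0 && edge(v1, v2, p) <= 0 && edge(v2, v0, p) <= 0)
    outputPixel(x, y);
  }
}

Note that the three edge tests for different pixels are independent of each other, which is exactly what makes this formulation amenable to parallel hardware.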

Figure 5.6

Optimization of inside/outside test for triangle filling. Pixels outside the bounding rectangle do not need to be tested, as well as pixels inside stamp A, which are outside the triangle, and pixels inside stamps C and D, which are all inside the triangle.

5.1.3 Attribute Interpolation: Barycentric Coordinates

We already know from Section 2.2 that to each vertex we may assign attributes other than its position. For example, we may assign a color attribute to each vertex. Now the question is: which color should we assign to the pixels rasterized by the primitive? For example, if we have a line segment going from v0 to v1 and we assign the color Red to v0 and the color Blue to v1, which color will be assigned to the pixels outputted by the rasterization? One intuitive answer is that the color should fade from red to blue as we follow the rasterization of the segment from v0 to v1 . This is obtained by a linear combination of the two colors:

c(i, j) = Red λ0(i, j) + Blue λ1(i, j)          (5.16)

where we indicate with c(i, j) the color of pixel (i, j) and with λ0(i, j) and λ1(i, j) the coefficients of the linear combination. Let us assume the position of pixel (i′, j′) is exactly midway between v0 and v1. Then we would have no doubt that both coefficients should be 1/2. This concept is extended to a generic point on the segment (or triangle) by means of barycentric coordinates.

Barycentric coordinates are a practical and elegant way to express points inside a simplex, as our segments and triangles are (see the definition of simplex in Section 3.2), as a function of the simplex vertex positions.

Let us see how they are found. Suppose we have a massless segment with two weights w0 and w1 hanging at the extremes p0 and p1, respectively. The barycenter is the point along the segment at which the segment stays in equilibrium, as shown in Figure 5.7. We know from very basic notions of physics that the barycenter is the point with respect to which the moments of the forces cancel out, so:

(p0 − b) w0 + (p1 − b) w1 = 0          (5.17)

Figure 5.7

Barycentric coordinates: (Top-Left) Barycenter on a segment with two weights at the extremes. (Top-Right) Barycentric coordinates of a point inside a triangle. (Bottom-Left) Lines obtained keeping λ0 constant are parallel to the opposite edge. (Bottom-Right) Barycentric coordinates as a non-orthogonal reference system.

which means that we can find the barycenter as:

b = p0 w0/(w0 + w1) + p1 w1/(w0 + w1)          (5.18)

which is in fact the general formula for the barycenter applied to the case of two points. Let us define

λ = w1/(w0 + w1)          (5.19)

Note that 0 ≤ λ ≤ 1 and also that w0/(w0 + w1) = 1 − λ, so that we can rewrite Equation (5.18) as:

b = p0 (1 − λ) + p1 λ          (5.20)

where λ and 1 − λ are called the barycentric coordinates of point b. We point out that we do not need to know the actual weights to find the barycentric coordinates, since they only depend on the proportion between them (if we multiply w0 and w1 by any non-zero factor, b does not change).

Computing the barycentric coordinates of any point x along the segment is straightforward, since according to Equation (5.20) we can write:

(p1 − p0) λ = (x − p0)

and hence, solving for λ by simply considering the fraction of the length of x − p0 over the whole length of the segment:

λ = ||x − p0|| / ||p1 − p0||          (5.21)

Barycentric Coordinates of a Triangle

We can extend the same reasoning to the triangle and reach a similar conclusion. Given a point x inside the triangle we can express it as a linear combination of the vertex positions: x = λ0 p0 + λ1 p1 + λ2 p2, where 0 ≤ λ0, λ1, λ2 ≤ 1 and λ0 + λ1 + λ2 = 1. The weight associated with the point pn is obtained as the ratio between the area of the triangle formed by the point x and the two vertices other than pn, and the area of the whole triangle. So, for example, for λ0:

λ0 = Area(x, p1, p2) / Area(p0, p1, p2) = ||(p1 − x) × (p2 − x)|| / ||(p1 − p0) × (p2 − p0)||          (5.22)

We can see that as x approaches p0, λ0 approaches 1, and λ1 and λ2 go to 0.

Also note that we can see the triangle itself as a non-orthogonal reference system, with origin in p0 and axes p1 − p0 and p2 − p0. In fact, if we consider that λ0 = 1 − λ1 − λ2 we can write

x = p0 + λ1 (p1 − p0) + λ2 (p2 − p0)

where x is inside the triangle if and only if 0 ≤ λ1, λ2 ≤ 1 and λ1 + λ2 ≤ 1. This also explains the constraint λ0 + λ1 + λ2 = 1 given before. Also, λ1 + λ2 is constant over all the lines parallel to the edge opposite to the vertex p0, as shown in Figure 5.7.
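A minimal JavaScript sketch of Equations (5.22) and (5.16), with 2D points as {x, y} objects and colors as arrays (conventions of our own choosing); the signed double areas cancel out in the ratios, so no absolute value is needed as long as all the triangles are taken with the same orientation:

function signedArea2(a, b, c) {
 // twice the signed area of triangle (a, b, c): a 2D cross product
 return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

function barycentric(p0, p1, p2, x) {
 var area = signedArea2(p0, p1, p2);
 var l0 = signedArea2(x, p1, p2) / area;
 var l1 = signedArea2(p0, x, p2) / area;
 return [l0, l1, 1 - l0 - l1];  // lambda2 follows from the constraint
}

// interpolating per-vertex colors c0, c1, c2 (e.g., [r, g, b]) at x:
function interpolateAttribute(p0, p1, p2, c0, c1, c2, x) {
 var l = barycentric(p0, p1, p2, x);
 return c0.map(function (_, k) {
  return l[0] * c0[k] + l[1] * c1[k] + l[2] * c2[k];
 });
}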

5.1.4 Concluding Remarks

In this section we have seen how to rasterize the primitives of which our scene is composed, such as segments and triangles. When a scene is made of many primitives it may happen that not all of them are entirely visible at the same time. This may happen for three reasons (see Figure 5.8):

Figure 5.8

Cases where primitives are not fully visible.

  • because a primitive is partially or entirely covered by another with respect to the viewer, and then we need a Hidden Surface Removal algorithm to avoid rasterizing it;
  • because the primitive is partially outside the view volume, in which case we have to clip it, which means to determine which part of the primitive is inside the view volume;
  • because the primitive is entirely outside the view volume, in which case we should simply avoid rasterizing it.

In the next two sections we will deal with exactly these problems, how to avoid rasterizing hidden primitives and how to clip them.

5.2 Hidden Surface Removal

Many computer graphics algorithms are classified as object space or image space, depending on whether they work in the 3D space where vertices are defined or in the raster domain after the primitives are rasterized. In the case of hidden surface removal, the most well-known object-space algorithm is called depth sorting, which is an improvement on the painter’s algorithm. The idea is that the primitives are sorted on the basis of their distance from the viewer and then drawn from the farthest to the nearest (which is referred to as back-to-front drawing), as is done in some painting techniques, where the painter paints the background and then draws the objects on it, starting from the far ones and proceeding with the ones closer to the viewer. In this manner, if several primitives overlap on a pixel, the nearest one is rasterized last and determines the pixel value.

5.2.1 Depth Sort

When the painter’s algorithm was first proposed, the distance of a primitive from the viewer was taken as the distance of its barycenter from the viewer. The problem is that if the barycenter of a primitive A is closer to the viewer than the barycenter of another polygon B, this does not imply that the whole of A is closer than the whole of B. The depth sort algorithm attempts to find a correct back-to-front order by looking for a separating plane between pairs of primitives. As the name suggests, a separating plane is a plane that separates entities (primitives in this case) into two groups: one in the positive halfspace and one in the negative halfspace. Consider a separating plane with a normal having positive z (in view space, that is, pointing towards the viewer) as in the example in Figure 5.9. We can safely say that the primitive in the negative halfspace is behind the primitive in the positive halfspace and hence must be drawn first.

Figure 5.9

(a) Depth sort example on four segments and a few examples of planes separating them. Note that C and D cannot be separated by a plane aligned with the axes, but they are separated by the plane containing C. D and E intersect and cannot be ordered without splitting them. (b) A case where, although no intersections exist, the primitives cannot be ordered.

The depth sort algorithm simply tries to sort the primitives using the separating plane to establish the order between each pair. For example, if the nearest vertex of B is farther away than the farthest vertex of A, then any plane parallel to XY with z between the two values is a separating plane. If this is not the case, the same test can be done for the x (y) values to find a separating plane parallel to YZ (XZ). In essence, we are doing a series of tests to look for a separating plane, and we start with planes parallel to the principal planes because those tests are simply comparisons of one coordinate. Note that for a pair of non-intersecting convex polygons, one of the planes containing one of the primitives is a separating plane.

Unfortunately, if the primitives intersect they cannot be ordered and, in general, it is not possible to draw first one and then the other and get the correct result. Even if no intersections exist, it may not be possible to find a correct back-to-front order. Figure 5.9(b) shows a case where this happens: a part of A is behind B, a part of B is behind C, and a part of C is behind A.

5.2.2 Scanline

Note that the cyclic ordering of Figure 5.9(b) would never happen in the bidimensional case; that is, for segments in the plane. The scanline algorithm rasterizes the scene line-by-line along the y axis, therefore each iteration concerns the intersection of a plane y = k with the scene. The result of a scanline pass produces a set of segments whose vertices are at the intersection of the plane with the edges of the primitives. In this manner the problem is scaled down by one dimension and it becomes a matter of correctly rendering a set of segments.

If we project the vertices on the current scanline, we obtain a series of spans with the following obvious property: within each span, the back-to-front order does not change (see Figure 5.10). So, for each span, we find the nearest portion of segment and rasterize it until the end of the span.

Figure 5.10

(a) Step of the scanline algorithm for a given plane. (b) The corresponding spans created.

The scanline approach may also be used for intersecting polygons, although a more complicated definition of the spans is needed (and it is beyond our interest for this technique).

Although both depth sort and scanline approaches are widely known in the CG community, they are not part of the rasterization-based pipeline, for the simple reason that they do not fit the pipeline architecture well, where primitives are processed one-by-one through the stages of the pipeline. Instead, these algorithms need to store all the primitives to be rendered and order them before drawing. Furthermore, they are not well suited for implementation on parallel machines, like current GPUs are.

5.2.3 z-Buffer

The z-buffer is the de facto standard solution for hidden surface removal. We explained in Section 5.1 that when a primitive is rasterized, a number of interpolated attributes are written in the output buffers. One of these attributes is the z value of the fragment, interpolated from the vertex coordinates, which is written in the buffer called the z-buffer (or depth buffer). The algorithm is very simple and is shown in Listing 5.3. At the beginning the z-buffer is initialized with the maximum possible value (line 2), which is the value of the far plane of the view frustum (note that the z-buffer operates in NDC and hence this value is 1). Then each primitive is rasterized and for each pixel covered (line 4) its z value is calculated. If this value is smaller than the value currently written in the z-buffer, the latter is replaced with the new value (line 6).

ZBufferAlgorithm() {
 forall i, j ZBuffer[i, j] = 1.0;
 for each primitive pr
 for each pixel (i, j) covered by pr
  if(Z[i, j] < ZBuffer[i, j])
   ZBuffer[i, j] = Z[i, j];
}

LISTING 5.3: The z-buffer algorithm.

The advantages of the z-buffer in the rasterization-based pipeline are many. The primitives are processed as they are sent to the pipeline and no previous ordering operations are needed. The computation is per-pixel and adjacent pixels can be processed independently and in parallel.

5.2.4 z-Buffer Precision and z-Fighting

One issue that must be considered carefully is the depth buffer granularity, that is, how well close values of z are distinguished. Simply put, the problem is that the projection and the rasterization may not preserve the original relation among depth values: it may happen that two values za and zb where za < zb are stored as the same value in the depth buffer. This is simply because of finite numerical precision. Let us see how this may happen.

Vertex coordinates are transformed from view space to NDC and then the z coordinates are remapped from [−1, 1] to [0, 1] to be compared with the value in the depth buffer and possibly written. Let us call this mapping

f(z) : view space → depth buffer space

Figure 5.11

State of the depth buffer during the rasterization of three triangles (the ones shown in Figure 5.9(b)). On each pixel is indicated the value of the depth buffer in [0,1]. The numbers in cyan indicate depth values that have been updated after the last triangle was drawn.

If the projection is orthogonal (see Section 4.6.3.1) we have:

fo(z) = 1/2 ( z 2/(f − n) − (f + n)/(f − n) + 1 )          (5.23)

So the mapping is of the form fo(z) = a z + b, that is, just a scaling and a translation. Note that typically the depth buffer stores values in fixed-point representation. In a fixed-point representation the number is simply an integer with a common denominator. For example, the fixed-point 16-bit numbers between 0 and 1 are i/(2^16 − 1) : i = 0…2^16 − 1, which means that all the consecutive numbers represented are at the same distance, (2^16 − 1)^−1. Conversely, coordinates are expressed in floating-point numbers, which are distributed more densely around 0. This means that if we have a 32-bit depth buffer, there will be multiple depth values in view space that, when transformed, will map to the same value in depth buffer space. Figure 5.12 shows an example of the resulting artifacts. The cyan truncated cone is closer, but because of this approximation some of its fragments lose the depth contest. This problem is called z-fighting.

Figure 5.12

Two truncated cones, one white and one cyan, superimposed with a small translation so that the cyan one is closer to the observer. However, because of z-buffer numerical approximation, part of the fragments of the cyan cone are not drawn due to the depth test against those of the white one.

Things get worse for perspective projection, for which we have:

fp(z) = 1/2 ( (f + n)/(f − n) − (1/z) 2fn/(f − n) + 1 ) = ( 1/2 (f + n)/(f − n) + 1/2 ) − fn / (z (f − n))          (5.24)

Unlike for orthogonal projection, this time the depth value is proportional to the reciprocal of the z value in view space. Figure 5.13 shows a plot where the abscissa corresponds to the value of z from n to f and the ordinate to the value in depth buffer space for fp(z) and fo(z). We may see how, with perspective projection, the first 30% of the interval maps onto 80% of the interval in depth space; that is, values close to the near plane n take up most of the depth buffer range. Therefore, the farther away from the near plane, the less the precision.

Figure 5.13

A plot showing the mapping between z-values in view space and depth buffer space.
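A quick numerical check of Equation (5.24), with example values of our own choosing (n = 1, f = 100), makes the non-uniformity evident:

// f_p from Equation (5.24)
function fp(z, n, f) {
 return 0.5 * ((f + n) / (f - n) + 1) - (f * n) / (z * (f - n));
}
var n = 1, f = 100;
[0.1, 0.3, 0.5, 1.0].forEach(function (t) {
 var z = n + t * (f - n);                 // t: fraction of the z range
 console.log(t, fp(z, n, f).toFixed(3));  // 0.917, 0.977, 0.990, 1.000
});

With these values, the first 10% of the view-space range already consumes about 92% of the depth buffer range, leaving very little resolution for everything farther away.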

5.3 From Fragments to Pixels

After rasterization we are in the domain of fragments, which have integer coordinates and carry a color and a z-value. Before they become pixels on the screen, some per-fragment operations, included in any graphics API and implemented in many rasterization-based pipelines, are applied.

5.3.1 Discard Tests

These tests do not modify the fragment values; they only decide whether to discard the fragment or let it pass to the next stage. We will introduce them in the order they are found along the pipeline.

The scissor test discards any fragment whose coordinates fall outside a specified scissor rectangle. Suppose that we want to show some scrolling text on a bar at the bottom of the screen, while rendering our complex and time-consuming 3D scene on the rest. Moving text is very simple to render, but in order to animate it we would need to redraw the whole scene even when nothing else changes. With the scissor test we can select the one rectangle of the screen that will be affected by the rendering, treating it like a screen inside the screen.

The scissor test is somewhat limited because we can only specify a rectangular region. Suppose we want a point of view fixed inside the car as shown in Figure 5.14, an option present in many racing games. In this case we would like to draw in a region whose shape is not simply a rectangle. The stencil buffer is a buffer of integer values that can be written during rasterization and used to mask out fragments. In the example above, which is a classic usage pattern, we would first render the interior of the car and set to 1, in the stencil buffer, all the pixels involved in the rasterization. Then we would render the rest of the scene enabling the stencil test to discard all the fragments whose corresponding position in the stencil buffer contains 1.

Figure 5.14

Stenciling example: (Left) The rendering from inside the car. (Middle) The stencil mask, that is, the portion of screen that does not need to be redrawn. (Right) The portion that is affected by rendering.

The next test is the depth test, the hidden surface removal solution we explained in Section 5.2.3. Note that the stencil and the depth test are very similar: both use a buffer that is read, written and then tested to decide whether to discard or keep the fragment. Also note that we do not strictly need a stencil buffer to implement the view from inside the car: if at each frame we render both the interior and then the rest of the scene we obtain the same result. The difference is in efficiency: using stenciling, we only draw the interior once and avoid executing the z-buffer algorithm (reading from the z-buffer, comparing and writing) for all those fragments that would never be drawn anyway.

5.3.2 Blending

When a fragment passes the z-test, its color can be written into the color buffer. Until now, we assumed that the color already present is replaced by the color of the fragment, but this is not necessarily the case: the color of the fragment (referred to as the source color) can be blended with the one in the color buffer (referred to as the destination color) to model things such as transparent surfaces. A key role in blending is played by the alpha component, which is used to specify the opacity of a surface, that is, how much the surface blocks the light coming from behind it.

CG APIs such as WebGL offer the possibility of making a linear combination of each color component to obtain a wide range of effects. More precisely, let s = [sR, sG, sB, sa] be the source color and d = [dR, dG, dB, da] the destination color; the new destination color (d′) is computed as:

d′ = b ⊙ s + c ⊙ d = [bR, bG, bB, bA] ⊙ [sR, sG, sB, sa] + [cR, cG, cB, cA] ⊙ [dR, dG, dB, da]          (5.25)

where ⊙ denotes the component-wise product and b and c are coefficient vectors, referred to as blending factors, that can be set to produce different visual effects. The resulting value is clamped to the interval in which the color is encoded. The most common uses of blending are handling transparent surfaces and improving antialiasing.

Figure 5.15

Results of back-to-front rendering of four polygons. A and C have α = 0.5, B and D have α = 1, and the order, from the closest to the farthest, is A, B, C, D.

5.3.2.1 Blending for Transparent Surfaces

The algorithms for hidden surface removal we have seen in Section 5.2 are all based on the assumption that if one surface is in front of another, the latter is hidden from view, that is, we cannot see through things. Thanks to this assumption we can use the depth buffer technique and not worry about the order in which the primitives are rendered. This is no longer true if the occluding surface is not opaque.

The most common way to handle transparent surfaces consists of rendering the primitives in back-to-front order, that is, from the farthest to the closest, and using blending for showing the contribution of non-opaque surfaces with blending factors b = [sa, sa, sa, sa] and c = [1 − sa, 1 − sa, 1 − sa, 1 − sa]. The effect is that the surface being rendered contributes to the color proportionally to its opacity (for sa = 1 it simply replaces the color currently in the color buffer). This simple setting is very effective. Note that the destination α value is not used.
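In code, this choice of blending factors corresponds to the classic over compositing; a sketch in JavaScript (a helper of our own, with colors as [r, g, b, a] arrays in [0, 1]):

function blendOver(src, dst) {
 var sa = src[3];
 return dst.map(function (d, k) {
  // b = [sa, sa, sa, sa], c = [1 - sa, ..., 1 - sa], clamped to [0, 1]
  return Math.min(1, sa * src[k] + (1 - sa) * d);
 });
}

// a half-transparent red fragment over an opaque blue background:
blendOver([1, 0, 0, 0.5], [0, 0, 1, 1]); // [0.5, 0, 0.5, 0.75]

In WebGL this pair of factors is selected with gl.blendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA), as we will do in Listing 5.5.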

5.3.3 Aliasing and Antialiasing

When we rasterize a segment or a polygon, we take a real domain and break it into discrete parts, the pixels. If we go back to Figure 5.1 and look at the two segments, we may agree that the pixels we use to represent them are the closest approximation we can find, but with no previous knowledge that those pixels are supposed to be segments, you would probably never identify them as such, because they just look like a series of adjacent pixels. Another way to say this is that there are infinitely many segments whose rasterization produces the very same set of pixels. This ambiguity is what we call the phenomenon of aliasing. The visual consequence of this effect is that segments and borders of polygons look jagged and are not quite believable as such.

Aliasing is intrinsic to the discretization, but there are techniques to make it less perceivable. Let us consider line rasterization and assume that the width of the line is one pixel, which is the minimum size we can handle with our raster device. So, the rasterizer computes a set of pixels to represent the line. However, we can still play with the color of the surrounding pixels to reduce the aliasing effect. If we consider the segment as a polygon having width 1 pixel and length equal to the segment’s length, we see that such a polygon partially covers some pixels. The area averaging technique, or unweighted area sampling, consists of shading each pixel intersected by this polygon with a color whose intensity is scaled down by the percentage of the area of the pixel covered by the polygon, as shown in Figure 5.16. The intuition behind this technique is that if only half a pixel is inside the line, we halve its color and, at a proper viewing distance, we will obtain a convincing result. Considering the color space HSV (hue, saturation and value), what happens is that the value is scaled down while H and S remain the same.

Figure 5.16

(Top-Left) A detail of a line rasterized with DDA rasterization. (Top-Right) The same line with the Average Area antialiasing. (Bottom) Results.

However, what about the color of the other half of the pixel? What if more lines and polygons cover the same pixel? If we use blending, we can composite our half color by using the alpha channel with the color already in the color buffer.

To implement this technique we could just compute the area of intersection between the thick line and the pixels during the rasterization process and output the color depending on the area value. We can try to do this as an exercise by passing the segment’s endpoints to the fragment shader and making the computation there. However, the exact intersection between a polygon and a square (the pixel) requires several floating point operations and we want to avoid that.

An alternative implementation consists of using super sampling, that is, considering each pixel as a grid of smaller pixels (for example a 2 × 2 grid) and approximating the area of the pixel covered with the number of these smaller pixels covered by the line. A finer grid leads to a better approximation but requires more computation. Super sampling can also be applied to the whole scene by full-screen antialiasing (FSAA). In its naive implementation, the rendering is performed at a higher resolution and then converted to the desired resolution by averaging the fragment values obtained.
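A sketch of the coverage estimate by super sampling (the names are ours): the pixel at (px, py) is split into an n × n grid and we count the sub-samples that pass a caller-supplied inside test, for example condition (5.15):

function coverage(px, py, n, inside) {
 var hit = 0;
 for (var i = 0; i < n; i++)
  for (var j = 0; j < n; j++)
   // sample at the center of each sub-pixel
   if (inside(px + (i + 0.5) / n, py + (j + 0.5) / n)) hit++;
 return hit / (n * n);  // approximate fraction of the pixel covered
}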

5.3.4 Upgrade Your Client: View from Driver Perspective

In principle we could implement the view from the driver just by placing the view frame on the driver’s head, under the assumption that the 3D model of the car also includes the cockpit.

It is common in many games where the character is inside something (a car, plane, starship, robot, helmet, etc.) that this something does not move with respect to the point of view of the player. In the case of the car, for example, the cockpit will always occupy the same portion of the screen. So suppose we render the cockpit normally, that is, as a 3D model. The vertices of the cockpit will go through all the transformations of the pipeline (object space → world space → view space → clip space → NDC space) to produce the very same image every time. Moreover, we would need to model all of the car’s interior.

In its simplicity, this is something of a crucial point in rendering: the fact that we want to produce an image of a 3D scene does not mean that we have to actually model the whole scene in three dimensions. We will explore this concept in detail in Chapter 9; for now we only want to say that we do not need to model the cockpit and then render it if in the end we just need one specific image of it. Instead, let us assume that we have placed a glass screen on the near plane of our view frustum. What we are going to do is to paint the inside of the car, as seen by the driver, directly on the glass. In more practical terms, we set the view transformation to the identity and draw the polygons expressing the coordinates directly in NDC space (that is, in [−1, 1]³) as shown in Figure 5.17. This way we simply skip all the transformations and draw all the interior as an overlay on everything outside.

130  gl.enable(gl.STENCIL_TEST);
131  gl.clearStencil (0);
132  gl.stencilMask (~0);
133  gl.stencilFunc(gl.ALWAYS, 1, 0xFF);
134  gl.stencilOp(gl.REPLACE, gl.REPLACE, gl.REPLACE);
135  
136  gl.useProgram(this.perVertexColorShader);
137  gl.uniformMatrix4fv(this.perVertexColorShader.uModelViewMatrixLocation, false, SglMat4.identity ());
138  gl.uniformMatrix4fv(this.perVertexColorShader.uProjectionMatrixLocation, false, SglMat4.identity ());
139  this.drawColoredObject(gl,this.cabin, [0.4, 0.8, 0.9, 1.0]);
140  
141  gl.stencilFunc(gl.EQUAL, 0, 0xFF);
142  gl.stencilOp(gl.KEEP, gl.KEEP, gl.KEEP);
143  gl.stencilMask (0);

LISTING 5.4: Using stenciling for drawing the cabin.

Figure 5.17

Exemplifying drawings for the cabin. The coordinates are expressed in clip space. (See code at http://envymycarbook.com/chapter5/0/cabin.js.)

The changes to our client to include the view from inside the car are very few. First we add a new camera that we call DriverCamera, placed inside the car, which is easily done as in Section 4.10.1. Then we need to introduce the code shown in Listing 5.4 at the beginning of the rendering cycle. These few lines simply prepare the stencil buffer by drawing the polygons shown in Figure 5.17 (on the left), thus inhibiting the writing in the frame buffer of the pixels covered by such polygons.

With gl.clearStencil(0) we tell WebGL that the value to use to clear the stencil buffer is 0 and with gl.stencilMask(~0) we indicate that all bits of the stencil are enabled for writing (these are also the default values anyway).

How the stencil test behaves is determined by the calls at lines 133-134. Function gl.stencilFunc(func, ref, mask) sets the condition for discarding a fragment as the outcome of the comparison between the value already in the stencil buffer and the reference value ref; func specifies the kind of test and mask is a bit mask that is applied to both values. In this very simple example, by passing gl.ALWAYS we say that all fragments will pass no matter what the stencil, the reference and the mask values are.

Function gl.stencilOp(sfail,dpfail,dppass) tells how to update the stencil buffer depending on the outcome of the stencil test and the depth test. Since we draw the polygon with the depth buffer cleared and make all the fragments pass, the only meaningful value is dppass, that is, what to do when both stencil and depth test pass. By setting this parameter to gl.REPLACE we tell WebGL to replace the current value on the stencil with the reference value specified in gl.stencilFunc (1 in this example). After the drawing (lines 136-139) we will have a stencil buffer that contains 1 in all the positions covered by the drawing and 0 elsewhere. This is our mask.

Then, at lines 141-143, we change the way the stencil test is done so that only those fragments for which the value in the stencil buffer is 0 pass, and no modification is made to the stencil buffer in any case. The final result is that no subsequent fragments will pass the stencil test for the pixels rasterized by the cockpit.

We also want to add the windshield with a partially opaque upper band (see Figure 5.17 on the right), that is, we want to see through a dark windshield. We can use blending by adding the code shown in Listing 5.5 at the end of function drawScene.

The function gl.blendFunc determines how the color of the fragment is computed, that is, the blending factors introduced in Section 5.3.2 and used in Formula (5.25). Again, we specify the coordinates of the windshield in clip space.

188  gl.enable(gl.BLEND);
189  gl.blendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA);
190  gl.useProgram(this.perVertexColorShader);
191  gl.uniformMatrix4fv(this.perVertexColorShader.uModelViewMatrixLocation, false, SglMat4.identity ());
192  gl.uniformMatrix4fv(this.perVertexColorShader.uProjectionMatrixLocation, false, SglMat4.identity ());
193  this.drawColoredObject(gl, this.windshield, [0.4, 0.8, 0.9, 1.0]);
194  gl.disable(gl.BLEND);

LISTING 5.5: Using blending for drawing a partially opaque windshield.

Figure 5.18 shows a snapshot from the client showing the driver’s point of view.

Figure 5.18

Adding the view from inside. Blending is used for the upper part of the windshield. (See client http://envymycarbook.com/chapter5/0/0.html.)

5.4 Clipping

A clipping algorithm takes as input a primitive and returns the portion of it inside the view frustum. Clipping is not strictly necessary to produce a correct output: we could simply rasterize the whole primitive and ignore the fragments outside the viewport, but rasterization has a cost.

Clipping is done in clip space (which we introduced in Section 4.6.3.1), so the problem is posed as finding the portion of a primitive (that is, a line segment or a polygon) inside the box [xmin, xmax] × [ymin, ymax] × [zmin, zmax] with as few operations as possible.

5.4.1 Clipping Segments

We start with the Cohen-Sutherland algorithm for the bidimensional case. The idea is to perform a test that only requires a few operations to check if the segment is fully outside or fully inside the view volume. If it is fully inside, the output is the segment itself; if it is fully outside, the output is empty (the segment is discarded).

As shown in Figure 5.19, we can see the view volume as the intersection of four halfspaces defined by the planes p+x, p−x, p+y and p−y. The algorithm finds on which side of each plane the end points are, to check if one of the planes is a separating plane for the segment and the view volume, or if the segment is entirely inside the view volume. Let us call p0 and p1 the two end points of any of the segments in the figure (which end point is p0 and which is p1 is not important here). For example, if p0x > xmax and p1x > xmax then p+x is a separating plane (segments A and B). Suppose we define a function R(p) returning a four-digit binary code b+y b−y b+x b−x, where each bit corresponds to one of the planes and is 1 if p lies in its positive halfspace:

b+x(p) = 1 if px > xmax, 0 if px ≤ xmax
b−x(p) = 1 if px < xmin, 0 if px ≥ xmin
b+y(p) = 1 if py > ymax, 0 if py ≤ ymax
b−y(p) = 1 if py < ymin, 0 if py ≥ ymin

Figure 5.19

Scheme for the Cohen-Sutherland clipping algorithm.

Note that if p0 and p1 both lie in the positive halfspace of any of the four planes, so that such a plane is a separating plane, the corresponding bit will be 1 for both endpoints. Therefore we have a simple test to determine if one of the planes is a separating plane, which is:

R(p0) & R(p1) ≠ 0

where & is the bitwise AND operator. Also, we have another test to check if the segment is fully inside the view volume

R(p0)|R(p1)=0

where | is the bitwise OR operator. This means (R(p0) =0) ∧ (R(p1) = 0).

If these two tests fail, some intersection computations between the segment and the planes will need to be done.

If R(p0) = 0 ∧ R(p1) ≠ 0 (or vice versa), then one endpoint is inside and the other is outside, and one intersection has to be computed. Otherwise, both endpoints are outside the view volume but none of the four planes is a separating plane. In this case the segment may be intersecting, like segment E, or not, like segment D, so the algorithm computes the intersection of the segment with one of the planes and the test is executed again with the new endpoints.
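A sketch of the region codes and of the two trivial tests in JavaScript (the constant and function names are ours; the iterative clipping step is omitted):

var B_XPOS = 1, B_XNEG = 2, B_YPOS = 4, B_YNEG = 8;

function regionCode(p, xMin, xMax, yMin, yMax) {
 var code = 0;
 if (p.x > xMax) code |= B_XPOS;
 if (p.x < xMin) code |= B_XNEG;
 if (p.y > yMax) code |= B_YPOS;
 if (p.y < yMin) code |= B_YNEG;
 return code;
}

function classify(p0, p1, xMin, xMax, yMin, yMax) {
 var r0 = regionCode(p0, xMin, xMax, yMin, yMax);
 var r1 = regionCode(p1, xMin, xMax, yMin, yMax);
 if ((r0 & r1) !== 0) return "discard";  // a separating plane exists
 if ((r0 | r1) === 0) return "accept";   // both endpoints are inside
 return "clip";                          // intersections must be computed
}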

The Liang-Barsky algorithm avoids computing the intersection points explicitly by working with a parametric definition of the segment. The points on the segment (p0, p1) may be written as:

s(t) = p0 + t (p1 − p0) = p0 + t v          (5.26)

As shown in Figure 5.20, s(t) is a point on the segment for t in the interval [0, 1]; if t < 0, s(t) is outside the segment on the side of p0, and if t > 1 it is outside the segment on the side of p1. We can easily calculate the values of t for which the line intersects each of the plane boundaries. For example, for plane pl0 (x = xmin) we have:

(p0 + v t)x = xmin  ⟹  t = (xmin − p0x) / vx

Figure 5.20

Scheme for the Liang-Barsky clipping algorithm.

So we can replace t in Equation (5.26) to find the intersection tk with each of the planes plk. Now consider walking along the line in the direction indicated by v, starting from the smallest t to the largest. We can label each tk as an entry point if we move from the positive halfspace of plane plk to the negative one, and as an exit point otherwise:

t0 is an entry if vx > 0, an exit otherwise
t1 is an entry if vx < 0, an exit otherwise
t2 is an entry if vy > 0, an exit otherwise
t3 is an entry if vy < 0, an exit otherwise

The extremes of our clipped segment are the last one entering the clip rectangle and belonging to the segment and the first one leaving the clip rectangle and belonging to the segment:

tmin = max(0, largest entry value)
tmax = min(1, smallest exit value)

Note that if tmin > tmax there is no intersection; otherwise the new extremes are p′0 = s(tmin) and p′1 = s(tmax).
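A minimal sketch of the algorithm in JavaScript (a function of our own; horizontal and vertical segments, for which vx or vy is zero, are assumed to be handled separately, as noted above):

function liangBarsky(p0, p1, xMin, xMax, yMin, yMax) {
 var v = { x: p1.x - p0.x, y: p1.y - p0.y };
 // the four t_k values, each labeled entry or exit by the sign of v
 var planes = [
  { t: (xMin - p0.x) / v.x, entry: v.x > 0 },
  { t: (xMax - p0.x) / v.x, entry: v.x < 0 },
  { t: (yMin - p0.y) / v.y, entry: v.y > 0 },
  { t: (yMax - p0.y) / v.y, entry: v.y < 0 }
 ];
 var tMin = 0, tMax = 1;
 planes.forEach(function (pl) {
  if (pl.entry) tMin = Math.max(tMin, pl.t);  // last entry point
  else tMax = Math.min(tMax, pl.t);           // first exit point
 });
 if (tMin > tMax) return null;                // no intersection
 return [{ x: p0.x + tMin * v.x, y: p0.y + tMin * v.y },
         { x: p0.x + tMax * v.x, y: p0.y + tMax * v.y }];
}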

5.4.2 Clipping Polygons

The most well-known algorithm for clipping a polygon P against a convex polygon, which in our case is a rectangle, is the Sutherland-Hodgman algorithm. The idea is to use a pair of scissors to cut the polygon P along each edge of the clipping polygon. If the clipping polygon has k edges, after k cuts we have our clipped version of P (see Figure 5.21). Making a cut means clipping a polygon against a halfspace, which is the basic step of this algorithm.

Figure 5.21

Sutherland-Hodgman algorithm. Clipping a polygon against a rectangle is done by clipping on its four edges.

We recall that a polygon is specified as a list of n vertices where each pair of consecutive vertices, plus the last and the first, defines a segment. The algorithm for clipping a polygon against a halfspace plk starts from a vertex of P and walks along the polygon edges (pi, pi+1). If pi is in the negative halfspace (inside), it is added to the output polygon. If the segment crosses the plane, the intersection point is also added.
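A sketch of this basic step in JavaScript (the names are ours): the halfspace is abstracted by an inside(p) predicate and an intersect(a, b) function returning the crossing point, and clipping against the whole rectangle repeats the step for each of its four edges:

function clipAgainstHalfspace(poly, inside, intersect) {
 var out = [];
 for (var i = 0; i < poly.length; i++) {
  var a = poly[i], b = poly[(i + 1) % poly.length];
  if (inside(a)) {
   out.push(a);                                // a is kept
   if (!inside(b)) out.push(intersect(a, b));  // leaving: add crossing
  } else if (inside(b)) {
   out.push(intersect(a, b));                  // entering: add crossing
  }
 }
 return out;
}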

5.5 Culling

A primitive is culled when there is no chance that any part of it will be visible. This may happen in three cases:

  • Back-face culling: when it is oriented away from the viewer.
  • Frustum culling: when it is entirely outside the view volume.
  • Occlusion culling: when it is not visible because it is occluded by other primitives (referred to as occluders).

As for clipping, we do not need culling to obtain a correct result; we need it to eliminate unnecessary computation, that is, to avoid the rasterization of primitives that, in the end, will not determine any of the final pixels.

5.5.1 Back-Face Culling

When we use polygons to describe objects, we can choose whether only their front side or also their back side can be seen. If we have modeled a flag, a piece of paper or any other object so thin that we consider it bidimensional, then we want to see both sides of the polygons. On the other hand, if we modeled the surface of a tridimensional object (as we do most of the time), usually it is not possible to see the back of the polygons, so what is the point of rasterizing them? Back-face culling is a filter that discards the polygons facing away from the point of view. Does it matter that much? Yes: consider that if we orbit around an object with back-face culling enabled, we avoid rasterizing 50% of the faces, on average.

Determining if a polygon is facing away from the viewer is simple and consists of checking if the angle between the normal of the polygon and the line passing through the point of view and any point on the polygon (for example, one of its vertices) is greater than π/2, which can be done by checking if the dot product (c − p) · n is negative (see Figure 5.22). Note that here the normal of the polygon is the vector orthogonal to the polygon surface and not the interpolated normal attribute. One may be tempted to replace (p − c) with the view direction (in order to avoid the subtraction), but consider that this leads to incorrect results for perspective projection (while it would be correct for a parallel projection). For this reason, the rasterization-based pipeline performs the back-face culling after the projection, in the canonical viewing volume. There, the test becomes checking the sign of the z component of the polygon’s normal.

Figure 5.22

(a) If a normal points toward −z in view space this does not imply that it does the same in clip space. (b) The projection of the vertices on the image plane is counter-clockwise if and only if the triangle is front-facing.

Let us consider the case of the triangle T. Its normal direction is:

N(T) = (p1 − p0) × (p2 − p1)          (5.27)

where ||N(T)|| = 2 Area(T) (see Appendix B). If we project the points on the view plane we can write:

N(T)z = (p1xy − p0xy) × (p2xy − p1xy) = 2 Area(Txy)

that is, twice the area of the projection of T on the view plane xy. So if Area(Txy) is negative, it means N(T)z is negative and hence the triangle is facing away from the viewer, while if it is positive the triangle is facing the viewer.
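In code, the whole test reduces to the sign of the double area of the projected triangle; a sketch with our own conventions (vertices projected on the image plane, given counterclockwise for front-facing triangles):

function isBackFacing(p0, p1, p2) {
 // z component of (p1 - p0) x (p2 - p1): twice the signed area of Txy
 var area2 = (p1.x - p0.x) * (p2.y - p1.y) -
             (p1.y - p0.y) * (p2.x - p1.x);
 return area2 < 0;  // negative area: the triangle faces away
}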

You may find it mentioned elsewhere that back-face culling works only for convex objects. This needs more clarification: even if the object is not convex, faces are culled correctly. What can happen with non-convex objects is that some front-facing faces are not visible because they are occluded by other parts of the surface, so we would still do some unnecessary rasterization work. Another common misunderstanding is that the object has to be watertight for back-face culling to produce correct results. Watertight means that if we could fill the inside of the object with water, none would spill out, because there are no holes. This is a sufficient but not a necessary condition: it is enough that there is no configuration (for example, viewer position, orientation, application conditions) where the user can possibly see the back of the surface. A straightforward example is the terrain and the street of our circuit: they do not form a watertight surface, but we can use back-face culling since we do not want to allow the point of view to go underground.

5.5.2 Frustum Culling

So far, all the algorithms we have seen work on a per-primitive basis and are hard-wired in the fixed functionality of the pipeline. However, in many cases we can save a lot of computation by making choices at a higher level. For example, if we are in front of a building made of 10,000 triangles but we are looking so that the building is behind us, we do not really need to check every single triangle to find out that they are all outside the view frustum. The same happens if there is a big wall between us and the building.

Here a very common tool of CG comes into play: the bounding volume. A bounding volume for a set of geometric objects is a solid that encloses them all. For example, we could have a large cuboid (generally called a box) enclosing a whole building. Then, in the case of frustum culling, we could test the box against the view frustum and, if there is no intersection between the two, conclude that there will be no intersection between the frustum and the primitives inside the box.

The choice of the type of volume to use as bounding volume is always a tradeoff between two factors:

  • how tightly it encloses the set of primitives
  • how easy it is to test its intersection against the view frustum

Let us consider the sphere. The sphere is an easy volume to test for intersection, but if we use it for a street lamp, as shown in Figure 5.23, we will have a lot of empty space. This means that it will often happen that the sphere intersects the view frustum while the lamp does not. A box would make a tighter bounding volume than the sphere, but the intersection test between a freely oriented box and the frustum requires more operations than for a sphere. What can we do when the bounding box intersects the view frustum? Either we go ahead and process all the primitives within, or we use a divide et impera solution. This solution consists of using smaller bounding volumes that, all together, enclose all the primitives contained in the original bounding volume. This can be done recursively, creating a hierarchy of bounding volumes. Figure 5.24 shows an example of a hierarchy of bounding boxes for a model of a car.
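As an example of this tradeoff, here is a sketch of the test for a bounding sphere, under conventions of our own: the frustum is given as six planes {n, d} with unit normals pointing inward, so that a point p is inside a halfspace when n · p + d ≥ 0:

function sphereOutsideFrustum(center, radius, planes) {
 for (var i = 0; i < planes.length; i++) {
  var pl = planes[i];
  var dist = pl.n.x * center.x + pl.n.y * center.y +
             pl.n.z * center.z + pl.d;
  if (dist < -radius) return true;  // a separating plane: cull the sphere
 }
 return false;  // inside or intersecting: process the content
}

A box test follows the same pattern but must check the eight corners (or use the box extent along each plane normal), which is where the extra operations mentioned above come from.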

Figure 5.23

(Left) A bounding sphere for a street lamp: easy to test for intersection but with high chances of false positives. (Right) A bounding box for a street lamp: in this case we have little empty space but we need more operations to test the intersection.

Figure 5.24

Example of a two-level hierarchy of Axis-Aligned Bounding Boxes for a model of a car, obtained by splitting the bounding box along two axes.

The type of hierarchy and the strategy to handle it are further choices that, along with the type of bounding volumes, have given rise to hundreds of algorithms and data structures for this task. This topic is well beyond the scope of this book. Anyway, we will come back to this concept, providing some more details, in Chapter 11, when we discuss the efficient implementation of ray tracing.

Note that frustum culling is just one specific use of hierarchies of bounding volumes. They are in general used to speed up the process of testing if two moving objects collide (collision detection) and in which parts they do (contact determination), so as to compute the correct (or physically plausible) behavior (collision handling).

5.5.3 Occlusion Culling

As stated, occlusion culling deals with visibility: the idea is to avoid rasterizing primitives that are not visible because they are occluded.

Several solutions for occlusion culling are context specific. If the viewer is inside a room with a window and a door, then the outside space will only be visible through the window and the door. In this case we can precompute the potentially visible set (PVS), that is, the set of primitives that may be visible when the point of view is located inside the room. A similar example could also be implemented in our game, by building a PVS for each part of the track and using it to avoid rasterizing all the items of the track every time (as we do now).

A well-known algorithm for implementing occlusion culling in the general case is hierarchical visibility, which uses the same idea as the bounding volumes for frustum culling, that is: if a set of primitives is enclosed in a bounding volume and the bounding volume itself is occluded, then none of what is inside can be visible. We point out that, in general, the difference between occlusion culling and frustum culling is simply that we replace is inside the view frustum with is visible.

In general, an efficient algorithm for occlusion culling may give a dramatic speedup to the rendering of the whole 3D scene. How big this speedup is depends mainly on the depth complexity of the scene, that is, how many primitives overlap on the same pixels. Note that with the z-buffer algorithm each and every primitive in the frustum is rasterized, even if it is completely occluded from the viewer, because the conflict is resolved at the pixel level.
