Chapter 10

Advanced Techniques

In this chapter we will cover some advanced techniques to add fancy effects to our client, mostly using screen-space techniques. With this term we refer to techniques that work by rendering the scene, performing some processing on the generated image and then using the result to compose the final output.

As practical examples we will see how to simulate the out-of-focus and motion effects of a photographic camera and how to add some more advanced shadowing effects but, more than that, we will see the basic concepts and tools for implementing this sort of technique.

10.1 Image Processing

Signal processing is a set of mathematical tools and algorithms aimed at analyzing and processing signals of many kinds, such as audio, images and videos. With the advent of digital computers, many signal processing techniques developed in the analog domain have been turned into algorithms, leading to the development of modern digital signal processing. Digital image processing, often referred to simply as image processing, comprises all the algorithms that, given a digital image as input, process it in order to produce an image with different characteristics or a set of related information/symbols. These processes serve many different goals, such as image enhancement, for improving the quality of the image (e.g., noise removal, sharpness increase), image restoration, to recover corrupted parts or colors of the input image, and image compression, to reduce the amount of data of the input image while preserving its appearance, just to name a few.

Image processing was traditionally thought of as a tool for computer vision. The goal of computer vision is to analyze an image or a video and extract the information that contributed to its formation, such as the light sources involved, the camera position and orientation, etc. In this sense computer vision can be seen as the inverse of computer graphics. Computer vision is not limited to the extraction of that type of information but also concerns image interpretation, such as the automatic labeling of the objects depicted in an image for image search applications.

Since the advent of programmable and parallel GPUs, many image processing algorithms can be executed in a very short time, and therefore image processing also became a tool for computer graphics (and these are the cases we will treat in this chapter). Figure 10.1 schematizes these relationships.

Figure 10.1

Computer graphics, computer vision and image processing are often interconnected.

Since the introduction of programmable shading hardware, the modern GPU has become a tool for computations that are not necessarily related to rendering a 3D scene. With a somewhat drastic statement, we can say that if the first graphics accelerators were circuitry for speeding up some parts of the graphics pipeline, modern GPUs are essentially parallel multicore processing units that almost incidentally are used for computer graphics. Of course this statement is a bit extreme, because it is true that the architectures are still tailored for graphics computation (the interprocessor communication, the memory model, etc.), but it is a fact that GPUs can be employed to solve big linear systems of equations, to evaluate the Fast Fourier Transform (FFT) of 2D or 3D data, to run weather forecasts, to evaluate protein alignment/protein folding, to compute physical simulations, etc., all tasks that used to be the province of parallel supercomputers. This way of using the modern GPU is known as general-purpose computing on GPU (GPGPU).

In the following we will focus on some image processing algorithms that can also be used to obtain interesting visual effects in our client. We will see how to blur an image locally, and how to use such an operation to obtain a depth-of-field effect, how to extract image edges to obtain some unnatural but interesting-looking effects in our client, and finally how to enhance the details of the rendering by sharpening the generated image.

10.1.1 Blurring

Many image filter operations can be expressed as a weighted summation over a certain region of the input image I. Figure 10.2 shows this process.

Figure 10.2

A generic filter of 3 × 3 kernel size. As we can see, the mask of weights of the filter is centered on the pixel to be filtered.

Mathematically, the value of the pixel (x0, y0) of the filtered image I′ can be expressed as:

$$I'(x_0,y_0)=\frac{1}{T}\sum_{x=x_0-N}^{x_0+N}\;\sum_{y=y_0-M}^{y_0+M} W(x+N-x_0,\,y+M-y_0)\,I(x,y)\tag{10.1}$$

where N and M are the radii of the filtering window, W(x, y) is the matrix of weights that defines the filter, and T is the sum of the absolute values of the weights, which acts as a normalization factor. The size of the filtering window defines the support of the filter and is usually called the filter kernel size. The total number of pixels involved in the filtering is (2N + 1)(2M + 1) = 4NM + 2(N + M) + 1. We underline that, typically, the window is a square and not a rectangle (that is, N = M).

In its simpler form, a blurring operation can be obtained simply by averaging the values of the pixels on the support of the filter. Hence, for example, for N = M = 2, the matrix of weights corresponding to this operation is:

$$W(i,j)=\begin{bmatrix}1&1&1&1&1\\1&1&1&1&1\\1&1&1&1&1\\1&1&1&1&1\\1&1&1&1&1\end{bmatrix}\tag{10.2}$$

and T is equal to 25. As we can see, W(i, j) represents a constant weighting function. In this case, Equation (10.1) can be seen as the convolution of the image with a box function. This is the reason why this type of blur filter is usually called a box filter. Obviously, the blur effect increases as the size of the window increases. In fact, in this way the pixels’ values are averaged on a wider support. Figure 10.3 shows an example of applying this filter to an image (the RGB color channels are filtered separately).

Figure 10.3

(Left) Original image. (Right) Image blurred with a 9 × 9 box filter (N = M = 4).
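To make the box filter concrete, here is a minimal fragment shader sketch of it (GLSL ES). The uniform names uTexture and uPxs (the size of one pixel in texture coordinates) are our own choice, not part of any API, and the kernel radius is fixed at compile time.

// Minimal box-filter fragment shader (GLSL ES): the image to blur is bound
// to uTexture and uPxs holds the size of one pixel in texture coordinates.
precision highp float;
const int N = 2;                  // kernel radius: (2N+1)x(2N+1) = 5x5 window
uniform sampler2D uTexture;       // image to be filtered
uniform vec2 uPxs;                // (1/width, 1/height) of the texture
varying vec2 vTexCoord;           // interpolated texture coordinates

void main(void) {
  vec4 sum = vec4(0.0);
  // accumulate the (2N+1)x(2N+1) neighborhood with constant weights
  for (int i = -N; i <= N; ++i)
    for (int j = -N; j <= N; ++j)
      sum += texture2D(uTexture, vTexCoord + vec2(float(i), float(j)) * uPxs);
  // T = (2N+1)^2 = 25 acts as the normalization factor of Equation (10.1)
  gl_FragColor = sum / float((2 * N + 1) * (2 * N + 1));
}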

The blur obtained by using the box filter can also be obtained with other averaging functions. An alternative is to give the pixels closer to the central pixel (x0, y0) more influence than the distant ones. To do so, a Gaussian function is usually employed as the weighting function. A 2D Gaussian is defined as:

$$g(x,y)=\frac{1}{2\pi\sigma}\,e^{-\frac{x^2+y^2}{2\sigma^2}}\tag{10.3}$$

The support of this function (that is, the domain over which it is defined) is the whole ℝ² plane, but in practice it can be limited, considering that when the distance from the origin ($\sqrt{x^2+y^2}$) is greater than 3σ the Gaussian values become very close to zero. So, it is good practice to choose the support of the Gaussian kernel depending on the value of σ.

By plugging the Gaussian function into Equation (10.1) we obtain the so-called Gaussian filter:

$$I'(x_0,y_0)=\frac{\displaystyle\sum_{x=x_0-N}^{x_0+N}\;\sum_{y=y_0-N}^{y_0+N} I(x,y)\,e^{-\frac{(x-x_0)^2+(y-y_0)^2}{2\sigma^2}}}{\displaystyle\sum_{x=x_0-N}^{x_0+N}\;\sum_{y=y_0-N}^{y_0+N} e^{-\frac{(x-x_0)^2+(y-y_0)^2}{2\sigma^2}}}\tag{10.4}$$

Concerning the kernel size, following what was just stated, it is good practice to set N equal to 3σ or 2σ. For a Gaussian filter, a 7 × 7 weighting matrix with σ = 1 pixel is defined by the following coefficients:

$$W(i,j)=\frac{1}{10000}\begin{bmatrix}
0.2 & 2.4 & 10.7 & 17.7 & 10.7 & 2.4 & 0.2\\
2.4 & 29.2 & 130.6 & 215.4 & 130.6 & 29.2 & 2.4\\
10.7 & 130.6 & 585.5 & 965.3 & 585.5 & 130.6 & 10.7\\
17.7 & 215.4 & 965.3 & 1591.5 & 965.3 & 215.4 & 17.7\\
10.7 & 130.6 & 585.5 & 965.3 & 585.5 & 130.6 & 10.7\\
2.4 & 29.2 & 130.6 & 215.4 & 130.6 & 29.2 & 2.4\\
0.2 & 2.4 & 10.7 & 17.7 & 10.7 & 2.4 & 0.2
\end{bmatrix}\tag{10.5}$$

Note that at the borders of the matrix, where the distance becomes 3σ, the values go quickly to zero. A graphical representation of these weights is shown in Figure 10.4, while an example of an application of this filter is shown in Figure 10.5.

Figure 10.4

Weights of a 7 × 7 Gaussian filter.

Figure 10.5

(Left) Original image. (Right) Image blurred with a 9 × 9 Gaussian filter (σ = 1.5 pixels).
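As a sketch, the Gaussian weights can also be evaluated on the fly in the fragment shader instead of being stored in a matrix; the snippet below (names of our own choosing) accumulates Gaussian-weighted samples and divides by the sum of the weights, as in Equation (10.4).

// Gaussian blur sketch (GLSL ES): weights are evaluated per sample and the
// result is divided by their sum, so no precomputed matrix is needed.
precision highp float;
const int N = 3;                  // kernel radius, chosen as about 3*sigma
const float sigma = 1.0;          // standard deviation in pixels
uniform sampler2D uTexture;
uniform vec2 uPxs;                // size of one pixel in texture coordinates
varying vec2 vTexCoord;

void main(void) {
  vec4 sum = vec4(0.0);
  float wsum = 0.0;
  for (int i = -N; i <= N; ++i)
    for (int j = -N; j <= N; ++j) {
      float w = exp(-float(i * i + j * j) / (2.0 * sigma * sigma));
      sum += w * texture2D(uTexture, vTexCoord + vec2(float(i), float(j)) * uPxs);
      wsum += w;
    }
  gl_FragColor = sum / wsum;
}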

10.1.2 Upgrade Your Client: A Better Photographer with Depth of Field

So far, we have assumed that we see the world through an ideal pinhole camera, but a real camera is far from ideal in many senses, and one of them is that often we do not see everything in focus, that is, with well-defined contours and details. If you try, with a photographic camera, to frame an object very close to the camera, you will note that the background appears “blurred,” that is, out of focus. On the contrary, if you focus on some far point of the scene, the close objects will appear out of focus. Figure 10.6 shows an example of an out-of-focus background.

Figure 10.6

Out-of-focus example. The scene has been captured such that the car is in focus while the rest of the background is out of focus. The range of depth where the objects framed are in focus is called depth of field of the camera. (Courtesy of Francesco Banterle.)

The reason why this happens is illustrated in Figure 10.7. The camera lens makes the rays leaving a point at distance d from the lens converge on the image plane. Away from this distance, rays leaving the same point do not meet behind the lens exactly on the image plane: they meet either closer to the lens than the image plane (Top) or farther away (Bottom). In both cases the rays coming from the same point in space do not focus on a single point on the image plane but on a circular region, called the circle of confusion, making the image look out of focus. The radius of the circle of confusion grows linearly with the distance of the point from d along the optical axis, and its impact on the sharpness of the produced image is tolerable within a certain range around d. Such a range is called the depth of field.

Figure 10.7

Depth of field and circle of confusion.

In the following we will use blurring to recreate this effect. We indicate with [z1, z2] the range of depths within which the 3D objects are in focus. Then, for each z not in this range, we add a blurring effect that increases linearly with the distance from the range, in order to simulate the defocus of a real camera. To do this, we express the radius (in pixels) of the circle of confusion, and hence the kernel size of the blurring filter, in the following way:

$$c=\begin{cases}\dfrac{R_{max}}{z_1-near}\,(z_1-z) & z<z_1\\[2mm] 0 & z_1\le z\le z_2\\[2mm] \dfrac{R_{max}}{z_1-near}\,(z-z_2) & z>z_2\end{cases}\tag{10.6}$$

The value of c is clamped in the range [0.0, Rmax] to prevent increasing the kernel size too much.

10.1.2.1 Fullscreen Quad

Now we will see a technique that mostly operates in post-processing: we take the result of the rendering, plus some additional data, and process it to produce the final image. Blurring is the first of several examples of this sort.

The standard way to do this is:

  1. Render the scene to a texture.
  2. Bind this texture as source.
  3. Render a quad covering the screen exactly and with texture coordinates equal to (0, 0), (1, 0), (1, 1), (0, 1). Typically this is done by drawing in NDC space and hence the quad has coordinates (−1, −1, −1), (1, −1, −1), (1, 1, −1) and (−1, 1, −1). This is called the fullscreen quad.

By rendering a fullscreen quad we activate a fragment for each pixel and so we have access to all the pixels of the scene rendered at step 1.
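A vertex shader for the fullscreen quad can be as simple as the following sketch: the quad is already expressed in NDC, so no transformation is needed, and the texture coordinates can even be derived from the positions instead of being passed as an attribute (the names used here are our own).

// Pass-through vertex shader for the fullscreen quad (GLSL ES).
// The quad vertices are already in NDC, so they are emitted unchanged;
// texture coordinates are obtained by remapping [-1,1] to [0,1].
attribute vec3 aPosition;   // (-1,-1,-1), (1,-1,-1), (1,1,-1), (-1,1,-1)
varying vec2 vTexCoord;

void main(void) {
  vTexCoord = aPosition.xy * 0.5 + 0.5;
  gl_Position = vec4(aPosition, 1.0);
}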

Listing 10.1 shows the salient part of the JavaScript code. From line 201 to line 213 we render the scene to store the depth buffer, just like we did in Section 8.3 for shadow mapping. In fact, we reuse the same frame buffer, variables and shader. From lines 215 to 221 we render the scene again, this time to store the color buffer.

In principle we would not need to render the scene twice if we had multiple render targets. This functionality, not in the WebGL API at the time of this writing, allows a shader to output to multiple buffers simultaneously, so that the same shader may write the color on one buffer and some other value on another. The only change in the shader language is that we would write to gl_FragData[i] in the fragment shader, with i the index of the buffer to render to.

Finally, in lines 230-243 we render a fullscreen quad, binding the textures filled in the two previous renderings and enabling the depthOfFieldShader that we will comment on next. Note that at line 233 we specify the depth of field with two values that we intend to be in meters. This is an important bit, because we must take care of the reference systems when we read the depth from the texture: there we will read values in the interval [0, 1] and compare them with values in meters. More specifically, we know that the value of zV (that is, z in view space) is transformed by the perspective projection as:

$$z_{NDC}=\underbrace{\frac{f+n}{f-n}}_{A}+\underbrace{\frac{2fn}{f-n}}_{B}\,\frac{1}{z_V}$$

(check by multiplying [x, y, z, 1]^T by the perspective matrix P_persp in 4.10) and then to [0, 1] as:

$$z_{01}=(z_{NDC}+1)/2$$

In the fragment shader, shown in Listing 10.2, we read the depth values from the texture; they are in the interval [0, 1] (see line 42) and we have to invert the transformation to express them in view space (lines 43-44) before testing them against the depth of field interval. This is why we pass to the shader the values A and B: they are the entries of the perspective matrix necessary to invert the transformation from [0, 1] to view space.

You may wonder why we do not make it simpler and pass the depth of field interval directly in [0, 1]. We could, but if we want the user to express the interval in meters, which is what one expects, we would at least have to transform it from view space to [0, 1] on the JavaScript side. Moreover, consider the function ComputeRadiusCoC: if it operated on [0, 1] values, the radius would not increase linearly with the distance from the interval extremes but with the distance of their reciprocals (you can check this by plugging the above equations into the function). This does not mean it would not work, but it would not implement what is described by Equation (10.6).

200 if (this.depth_of_field_enabled) {
201  gl.bindFramebuffer(gl.FRAMEBUFFER, this.shadowMapTextureTarget.framebuffer);
202
203  this.shadowMatrix = SglMat4.mul(this.projectionMatrix, this.stack.matrix);
204  this.stack.push();
205  this.stack.load(this.shadowMatrix);
206
207  gl.clearColor(1.0, 1.0, 1.0, 1.0);
208  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);
209  gl.viewport(0, 0, this.shadowMapTextureTarget.framebuffer.width, this.shadowMapTextureTarget.framebuffer.height);
210  gl.useProgram(this.shadowMapCreateShader);
211  gl.uniformMatrix4fv(this.shadowMapCreateShader.uShadowMatrixLocation, false, this.stack.matrix);
212  this.drawDepthOnly(gl);
213  this.stack.pop();
214
215  gl.bindFramebuffer(gl.FRAMEBUFFER, this.firstPassTextureTarget.framebuffer);
216  gl.clearColor(1.0, 1.0, 1.0, 1.0);
217  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);
218  gl.viewport(0, 0, this.firstPassTextureTarget.framebuffer.width, this.firstPassTextureTarget.framebuffer.height);
219  this.drawSkyBox(gl);
220  this.drawEverything(gl, false, this.firstPassTextureTarget.framebuffer);
221  gl.bindFramebuffer(gl.FRAMEBUFFER, null);
222
223  gl.viewport(0, 0, width, height);
224  gl.disable(gl.DEPTH_TEST);
225  gl.activeTexture(gl.TEXTURE0);
226  gl.bindTexture(gl.TEXTURE_2D, this.firstPassTextureTarget.texture);
227  gl.activeTexture(gl.TEXTURE1);
228  gl.bindTexture(gl.TEXTURE_2D, this.shadowMapTextureTarget.texture);
229
230  gl.useProgram(this.depthOfFieldShader);
231  gl.uniform1i(this.depthOfFieldShader.uTextureLocation, 0);
232  gl.uniform1i(this.depthOfFieldShader.uDepthTextureLocation, 1);
233  var dof = [10.0, 13.0];
234  var A = (far + near) / (far - near);
235  var B = 2 * far * near / (far - near);
236  gl.uniform2fv(this.depthOfFieldShader.uDofLocation, dof);
237  gl.uniform1f(this.depthOfFieldShader.uALocation, A);
238  gl.uniform1f(this.depthOfFieldShader.uBLocation, B);
239
240  var pxs = [1.0 / this.firstPassTextureTarget.framebuffer.width, 1.0 / this.firstPassTextureTarget.framebuffer.height];
241  gl.uniform2fv(this.depthOfFieldShader.uPxsLocation, pxs);
242
243  this.drawObject(gl, this.quad, this.depthOfFieldShader);
244  gl.enable(gl.DEPTH_TEST);
245}

LISTING 10.1: Depth of field implementation (JavaScript side). (Code snippet from http://envymycarbook.com/chapter10/0/0.js.)

14  precision highp float;
15  const int MAXRADIUS ="+ constMAXRADIUS+";
16  uniform sampler2D uDepthTexture;
17  uniform sampler2D uTexture;
18  uniform float uA, uB;
19  uniform float near;
20  uniform vec2 uDof;
21  uniform vec2 uPxs;
22  varying vec2 vTexCoord;
23  float Unpack(vec4 v){
24 return v.x + v.y / (256.0) +
25  v.z/(256.0*256.0)+v.w/ (256.0*256.0*256.0);
26 }
27 float ComputeRadiusCoC(float z) {
28 float c = 0.0;
29 //circle of confusion is computed here
30 if (z < uDof[0])
31  c = float(MAXRADIUS)/(uDof[0]-near)*(uDof[0]-z);
32 if(z > uDof[1])
33  c = float(MAXRADIUS)/(uDof[0]-near)*(z-uDof[1]);
34 //clamp c between 1.0 and 7.0 pixels of radius
35 if (int(c) > MAXRADIUS)
36  return float(MAXRADIUS);
37 else
38  return c;
39}
40  void main(void)
41  {
42 float z_01 = Unpack(texture2D(uDepthTexture, vTexCoord));
43 float z_NDC = z_01*2.0 - 1.0;
44 float z_V = -uB/(z_NDC - uA);
45 int radius = int(ComputeRadiusCoC(z_V));
46 vec4 accum_color = vec4(0.0, 0.0, 0.0, 0.0);
47
48 for (int i = -MAXRADIUS ; i <= MAXRADIUS ; ++i)
49  for (int j = -MAXRADIUS ; j <= MAXRADIUS ; ++j)
50   if ((i >= -radius) && (i <= radius)
51    && (j >= -radius) && (j <= radius))
52     accum_color += texture2D(uTexture,
53      vec2(vTexCoord.x + float(i)*uPxs[0],
54        vTexCoord.y + float(j)*uPxs[1]));
55 accum_color /= vec4((radius*2+1)*(radius*2+1));
56 vec4 color = accum_color;
57  // if(radius > 1) color+=vec4(1, 0, 0, 1);
58  gl_FragColor = color;
59 }

LISTING 10.2: Depth of field implementation (shader side). (Code snippet from http://envymycarbook.com/code/chapter10/0/shaders.js.)

Note that we cannot directly use the value of the radius computed at line 45 as the loop bound of the filter, because the shader compiler must be able to unroll the loop. Therefore, we use the maximum kernel size as the loop bounds and test the distance of each sample from the kernel center, zeroing the contribution of samples outside the actual kernel.

The same operations can be done more efficiently by splitting the computation of the blurred image into a first “horizontal step,” where we sum the values only along the x axis, and a “vertical step,” where we sum the result of the previous step vertically. The final result is the same, but the rendering will be faster because we now apply N + M operations per pixel rather than N × M. Figure 10.8 shows a snapshot from the photographer view with depth of field.

As it is, this solution can create some artifacts, the most noticeable of which are due to depth discontinuities. Suppose we have one object close to the camera and in focus. Around the silhouette of the object, the out-of-focus parts of the background are influenced by the in-focus parts of the object, with the final effect that the border between the two will always look a bit fuzzy. These problems may be partially overcome by not counting pixels whose depth value is too different from the one of the pixel being considered. Another improvement may be to sample more than a single value of the depth map and blur the color accordingly.
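Returning to the two-step blur just described, the horizontal pass could look like the following sketch (GLSL ES, names ours); the vertical pass is identical with the roles of x and y swapped, and the two passes are chained by rendering the first result to a texture that is then bound as input to the second.

// Horizontal pass of a separable box blur (GLSL ES). A second, analogous
// pass along y applied to the output of this one completes the 2D filter.
precision highp float;
const int N = 4;                  // kernel radius along x
uniform sampler2D uTexture;       // input: scene color (or previous pass)
uniform vec2 uPxs;                // size of one pixel in texture coordinates
varying vec2 vTexCoord;

void main(void) {
  vec4 sum = vec4(0.0);
  for (int i = -N; i <= N; ++i)
    sum += texture2D(uTexture, vTexCoord + vec2(float(i) * uPxs.x, 0.0));
  gl_FragColor = sum / float(2 * N + 1);
}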

Figure 10.8

Snapshot of the depth of field client. (See client http://envymycarbook.com/chapter10/0/0.html.)

An alternative approach consists of rendering the scene multiple times from positions in a circle around the current point of view and setting the frustum so that it always passes through the same rectangle at the focus distance d. The images so generated are accumulated and the final image is obtained by averaging them. In this manner, everything at distance d will project exactly on the same point on the image plane and the rest will be progressively blurred depending on its distance.

10.1.3 Edge Detection

Many algorithms have been developed to extract salient features from a given image. One of the most important classes of such algorithms is the one that attempts to identify and extract the edges of an image. Here, we describe some basic filters to do this task, in particular the Prewitt and the Sobel filter. Both these filters are based on the numerical approximation of the first order horizontal and vertical derivatives of the image.

First of all, let us say something about the numerical approximation of first order derivatives. It is known that the first order derivative of a real function f(x) is defined as:

$$\frac{df(x)}{dx}=\lim_{\delta\to 0}\frac{f(x+\delta)-f(x)}{\delta}\tag{10.7}$$

This computation can be approximated as:

$$\frac{df(x)}{dx}\approx\frac{f(x+\delta)-f(x)}{\delta}\tag{10.8}$$

for some small value of δ. In the discrete case, Equation (10.8) can be rewritten as:

$$\Delta_x(x_i)=f(x_{i+1})-f(x_i)\tag{10.9}$$

where f(xi) is the discretized function, that is, the i-th sample of the function f(.). This numerical approximation of the derivative is called forward differences. Alternative definitions are the backward differences:

$$\Delta_x(x_i)=f(x_i)-f(x_{i-1})\tag{10.10}$$

and the central differences:

$$\Delta_x(x_i)=\frac{f(x_{i+1})-f(x_{i-1})}{2}\tag{10.11}$$

Considering a digital image, which is defined on a 2D discrete domain, the image gradient is the vector Δ = (Δx, Δy) where Δx is the horizontal derivative and Δy is the vertical derivative. Using the central differences the image gradient can be computed as:

$$\Delta(x,y)=\begin{pmatrix}\Delta_x(x,y)\\ \Delta_y(x,y)\end{pmatrix}=\begin{pmatrix}I(x+1,y)-I(x-1,y)\\ I(x,y-1)-I(x,y+1)\end{pmatrix}\tag{10.12}$$

In this case Δx and Δy represent the discrete versions of the partial derivatives ∂I(x,y)/∂x and ∂I(x,y)/∂y, respectively.

At this point, it is easy to define the “strength” of an edge as the magnitude of the gradient:

$$E(x,y)=\sqrt{\Delta_x^2(x,y)+\Delta_y^2(x,y)}.\tag{10.13}$$

We indicate with E the resulting extracted edge image.

So, taking into account Equation (10.13), the edge response at pixel (x0, y0) given an input image I(x, y) can be easily written in matrix form as:

$$\begin{aligned}I_h(x_0,y_0)&=\sum_{x=x_0-1}^{x_0+1}\;\sum_{y=y_0-1}^{y_0+1}W_{\Delta_x}(x+1-x_0,\,y+1-y_0)\,I(x,y)\\ I_v(x_0,y_0)&=\sum_{x=x_0-1}^{x_0+1}\;\sum_{y=y_0-1}^{y_0+1}W_{\Delta_y}(x+1-x_0,\,y+1-y_0)\,I(x,y)\\ E(x_0,y_0)&=\sqrt{I_h^2(x_0,y_0)+I_v^2(x_0,y_0)}\end{aligned}\tag{10.14}$$

where Ih(x, y) is the image of the horizontal derivative, Iv(x, y) is the image of the vertical derivative, and $W_{\Delta_x}(i,j)$ and $W_{\Delta_y}(i,j)$ are the matrices of weights defined as:

$$W_{\Delta_x}=\begin{bmatrix}0&0&0\\-1&0&1\\0&0&0\end{bmatrix}\qquad W_{\Delta_y}=\begin{bmatrix}0&1&0\\0&0&0\\0&-1&0\end{bmatrix}\tag{10.15}$$

The filter (10.14) is the most basic filter to extract edges based on first order derivatives.
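A minimal GLSL sketch of this basic filter is shown below: it computes the central differences of Equation (10.12) on the luminance and outputs the edge strength of Equation (10.13); the Sobel variant actually used in the toon-shading client is shown later in Listing 10.3 (all names here are our own).

// Basic gradient-magnitude edge detector (GLSL ES): central differences
// on the luminance of the neighboring pixels.
precision highp float;
uniform sampler2D uTexture;
uniform vec2 uPxs;                // size of one pixel in texture coordinates
varying vec2 vTexCoord;

float lum(vec2 off) {
  vec3 c = texture2D(uTexture, vTexCoord + off).rgb;
  return (c.r + c.g + c.b) / 3.0;
}

void main(void) {
  float dx = lum(vec2( uPxs.x, 0.0)) - lum(vec2(-uPxs.x, 0.0));
  float dy = lum(vec2(0.0, -uPxs.y)) - lum(vec2(0.0,  uPxs.y));
  float e = sqrt(dx * dx + dy * dy);   // edge strength, Equation (10.13)
  gl_FragColor = vec4(vec3(e), 1.0);
}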

Two numerical approximations of the first order derivatives that are more accurate than the one just described are given, respectively, by the Prewitt operator:

$$W_{\Delta_x}=\begin{bmatrix}-1&0&1\\-1&0&1\\-1&0&1\end{bmatrix}\qquad W_{\Delta_y}=\begin{bmatrix}1&1&1\\0&0&0\\-1&-1&-1\end{bmatrix}\tag{10.16}$$

and the Sobel operator:

$$W_{\Delta_x}=\begin{bmatrix}-1&0&1\\-2&0&2\\-1&0&1\end{bmatrix}\qquad W_{\Delta_y}=\begin{bmatrix}1&2&1\\0&0&0\\-1&-2&-1\end{bmatrix}\tag{10.17}$$

By replacing the matrices of weights in Equation (10.14) we obtain a more accurate edge computation. Notice that these kernels have weights similar to the ones of the matrix obtained with the central difference approximation. Figure 10.9 shows the results obtained with these two edge detectors. As in the other filtering examples, the filters are applied separately to each image color channel.

Figure 10.9

(Left) Original image. (Center) Prewitt filter. (Right) Sobel filter.

10.1.4 Upgrade Your Client: Toon Shading

With toon shading, or cel shading, we refer to a rendering technique capable of making the look of our client similar to the look of a cartoon. This type of rendering has been widely used in videogames (The Legend of Zelda: The Wind Waker by Nintendo, to mention one). Toon shading belongs to the so-called Non-Photorealistic Rendering (NPR) techniques. NPR aims at producing images that are not necessarily photorealistic in favor of more artistic and illustrative styles. The term NPR is debated, mainly because it tries to define something by saying what it is not, and also because it sounds like a diminishing definition: since in CG we aim at photorealism, is an NPR technique simply something that does not work?

Most often, NPR techniques try to make a rendering look like it was hand-drawn. In our client, we will do this by combining edge detection and color quantization. More specifically, we will make our toon shading effect by using two simple tricks.

The first trick is to draw black edges on the contours of the objects. As for the depth-of-field client, we render the scene and then render a fullscreen quad to process the result. We calculate the edges in screen space using the theory shown in Section 10.1.3 and produce an edge map with the filter described by Equation (10.14) and the kernel of the Sobel operator. The edge map is a single-channel image where the intensity value indicates the “edgeness” of the pixel. Assuming that the edges we are interested in drawing are the “strong” ones, we define a threshold over which a pixel is considered to be on an edge and draw that pixel in black in the final image. In this code example we use a fixed threshold, but an adaptive one may provide better results. The code in Listing 10.3 shows the fragment shader used to extract the edge map. uTexture is the texture containing the rendered scene applied to the fullscreen quad. Note that the strength of the edge is summarized as the mean of the edge strengths of the three color channels.

37 float edgeStrength (){      
38 vec2 tc = vTextureCoords;        
39 vec4 deltax = texture2D(uTexture,tc+vec2(-uPxs.x,uPxs.y)) 
40  +texture2D(uTexture,tc+vec2(-uPxs.x,0.0))*2.0 
41  +texture2D(uTexture,tc+vec2(-uPxs.x,-uPxs.y)) 
42 -texture2D(uTexture,tc+vec2(+uPxs.x,+uPxs.y))  
43 -texture2D(uTexture,tc+vec2(+uPxs.x,0.0))*2.0  
44 -texture2D(uTexture,tc+vec2(+uPxs.x,-uPxs.y));  
45 
46 vec4 deltay = -texture2D(uTexture,tc+vec2(-uPxs.x,uPxs.y)) 
47 -texture2D(uTexture,tc+vec2(0.0,uPxs.y))*2.0  
48 -texture2D(uTexture,tc+vec2(+uPxs.x,uPxs.y))  
49 +texture2D(uTexture,tc+vec2(-uPxs.x,-uPxs.y))  
50 +texture2D(uTexture,tc+vec2(0.0,-uPxs.y))*2.0  
51 +texture2D(uTexture,tc+vec2(+uPxs.x,-uPxs.y)); 
52           
53 float edgeR = sqrt(deltax.x*deltax.x + deltay.x*deltay.x); 
54 float edgeG = sqrt(deltax.y*deltax.y + deltay.y*deltay.y); 
55 float edgeB = sqrt(deltax.z*deltax.z + deltay.z*deltay.z); 
56 return (edgeR + edgeG + edgeB) / 3.0;}     

LISTING 10.3: Code to compute the edge strength. (Code snippet from http://envymycarbook.com/chapter10/1/shaders.js.)

The second trick is to quantize the shading values in order to simulate the use of a limited set of colors in the scene. In particular, here we use a simple diffuse model with three levels of quantization: dark, normal and light. In this way, a green object will have some parts colored dark green, others green, and others light green. The code that implements this simple quantized lighting model is given in Listing 10.4.

23 vec4 colorQuantization(vec4 color){   
24  float intensity = (color.x+color.y+color.z)/3.0;  
25  // normal         
26  float brightness = 0.7;       
27  //dark         
28  if (intensity < 0.3)       
29  brightness = 0.3;       
30  //light         
31  if(intensity > 0.8)       
32  brightness = 0.9;       
33  color.xyz = color.xyz * brightness / intensity; 
34  return color ;}       

LISTING 10.4: A simple quantized-diffuse model. (Code snippet from http://envymycarbook.com/chapter10/1/shaders.js.)

We follow the same scheme as for the depth-of-field client, but this time we need to produce only the color buffer and then we can render the full screen quad. The steps are the following:

  1. Render the scene to produce the color buffer
  2. Bind the texture produced at step 1 and render the full screen quad. For each fragment, if it is on a strong edge output black, otherwise output the color quantized version of the diffuse lighting (see Listing 10.5).

Figure 10.10 shows the final result.

Figure 10.10

Toon shading client. (See client http://envymycarbook.com/chapter10/1/1.html.)

57 void main(void)      
58 {       
59 vec4 color;     
60 float es = edgeStrength();  
61  if(es > 0.15)     
62  color = vec4(0.0, 0.0, 0.0, 1.0); 
63 else{      
64  color = texture2D(uTexture, vTextureCoords); 
65  color = colorQuantization(color);  
66}       
67 gl_FragColor = color;   
68} ";

LISTING 10.5: Fragment shader for the second pass. (Code snippet from http://envymycarbook.com/chapter10/1/shaders.js.)

There are many approaches to obtain more sophisticated toon shading. In this simple implementation we only used the color buffer, but we may also consider performing edge extraction on the depth buffer; in that case we would need another rendering pass to produce it, as we did in Section 10.1.2. For a complete treatment of toon shading and for an overview of NPR techniques we refer to Reference [1].

10.1.5 Upgrade Your Client: A Better Photographer with Panning

We already left the pinhole camera model in Section 10.1.2 by adding the depth of field. Now we will also consider another aspect of a real photographic camera: the exposure time. Up to now, we have considered the exposure time as infinitely small, so that every object is perfectly “still” during the shot, no matter how fast it travels. Now we want to emulate reality by considering that the exposure time is not infinitely small and the scene changes while the shutter is open. Figure 10.11 illustrates a situation where, during the exposure time, the car moves from left to right. What happens in this case is that different points on the car surface will project onto the same pixel, all of them contributing to the final color. As a result, the image will be blurred in the regions where the moving objects have been. This type of blur is called motion blur and in photography it is used to obtain the panning effect, which is when the moving object is sharp and the background is blurred. The way it is done is very simple: the photographer follows the moving object while the shutter is open, so that the relative motion of the object with respect to the camera frame is almost 0. In Section 4.11.2 we added a special view mode to the photographer such that it constantly aims at the car. Now we will emulate motion blur so that we can reproduce this effect.

Figure 10.11

Motion blur. Since the car is moving by Δ during the exposure, the pixel value in x′(t + dt) is an accumulation of the pixels ahead in the interval x′(t + dt) + Δ.

The most straightforward way to emulate motion blur is to simply mimic what happens in reality, that is, to take multiple renderings of the scene within the exposure interval and average the result. The drawback of this solution is that you need to render the scene several times, which may become a bottleneck. We will implement motion blur in a more efficient way, as a post-processing step. First we need to calculate the so-called velocity buffer, that is, a buffer where each pixel stores a velocity vector indicating the velocity at which the point projecting onto that pixel is moving in screen space. Once we have the velocity buffer, we output the color of a pixel in the final image just by sampling the current rendering along the velocity vector associated with the pixel, as shown in Figure 10.12.

Figure 10.12

Velocity vector. (See client http://envymycarbook.com/chapter10/2/2.html.)

10.1.5.1 The Velocity Buffer

The creation of the velocity buffer is usually treated in two separate situations: when the scene is static and the camera is moving or when some object is moving and the camera is fixed. Here we deal with a unified version of the problem where all the motion is considered in the camera reference frame.

Note that the basic procedure we followed to handle the geometric transformation of our primitives' coordinates is to pass the modelview and projection matrices to the shader program. Then in the vertex shader we always have a line of code that transforms the position from object space to clip space:

gl_Position = uProjectionMatrix * uModelViewMatrix * vec4(aPosition, 1.0);

No matter if the camera is fixed or not, or if the scene is static or not, this expression will always transform the coordinates from object space to clip space (and hence to window coordinates). Assuming that the projection matrix does not change (which is perfectly sound, since you do not zoom during the click of the camera), if we store, for each vertex, the modelview matrix of the previous frame and pass it to the shader along with the one of the current frame, we will be able to compute, for each vertex, its position in screen space at the previous and at the current frame, so that their difference is the velocity vector.

So we have to change our code to keep track, for each frame, of the value of the modelview matrix at the previous frame (that is, stack.matrix in the code). Since every element of the scene we draw is a JavaScript object of our NVMCClient, we simply extend every object with a member to store the modelview matrix at the previous frame. Listing 10.6 shows the change applied to the drawing of the trees: at line 89, after the tree trees[i] has been rendered, we store the modelview matrix in trees[i].previous_transform. This is the value that will be passed to the shader that computes the velocity buffer.

84 for(var i in trees){
85  var tpos = trees[i].position;
86  this.stack.push();
87  this.stack.multiply(SglMat4.translation(tpos));
88  this.drawTreeVelocity(gl,trees[i].previous_transform);
89  trees[i].previous_transform = this.stack.matrix;
90  this.stack.pop();
91}

LISTING 10.6: Storing the modelview matrix at the previous frame. (Code snippet from http://envymycarbook.com/chapter10/2/2.js.)

Listing 10.7 shows the shader program used to compute the velocity buffer. We pass both the modelview matrix of the current and of the previous frame and let the fragment shader receive the two interpolated positions, so that we can compute the velocity vector of the pixel as their difference. At lines 111-112 we perform the perspective division to obtain the coordinates in NDC space, at line 113 we compute the velocity vector, and at line 114 we remap the vector from [−1, 1]² to [0, 1]², so that we can output it as a color by writing the x and y components of the velocity vector on the red and green channels, respectively.

90 var vertex_shader = "
91 uniform  mat4 uPreviousModelViewMatrix; 
92 uniform  mat4 uModelViewMatrix;  
93 uniform  mat4 uProjectionMatrix;   
94 attribute vec3 aPosition;    
95 varying vec4 prev_position;   
96 varying vec4 curr_position;   
97 void main(void)      
98 {       
99  prev_position = uProjectionMatrix* uPreviousModelViewMatrix *vec4(aPosition, 1.0);
100 curr_position = uProjectionMatrix*uModelViewMatrix * vec4(aPosition,1.0); 
101 gl_Position = uProjectionMatrix*uPreviousModelViewMatrix *vec4(aPosition, 1.0); 
102 }            
103 " ;
104 
105 var fragment_shader = "
106  precision highp float;  
107  varying vec4 prev_position; 
108  varying vec4 curr_position; 
109  void main(void)    
110  {     
111 vec4 pp = prev_position / prev_position.w;
112 vec4 cp = curr_position / curr_position.w;
113 vec2 vel = cp.xy - pp.xy;
114 vel = vel*0.5+0.5;     
115 gl_FragColor = vec4(vel, 0.0, 1.0);  
116}         
117 ";

LISTING 10.7: Shader programs for calculating the velocity buffer. (Code snippet from http://envymycarbook.com/chapter10/2/shaders.js.)

Listing 10.8 shows the fragment shader that performs the final rendering with the fullscreen quad. We have uVelocityTexture, which has been written by the velocityVectorShader, and uTexture, containing the normal rendering of the scene. For each fragment, we take STEPS samples of uTexture along the velocity vector. Since the velocity vector is written with only 8-bit precision, the value we read and convert with the function Vel(..) at line 19 is not exactly what we computed with the velocityVectorShader. This is acceptable except when the scene is static (that is, nothing moves at all): in that case, because of this approximation, we would still notice some blurring in the image, so at line 30 we simply set the velocity vectors that are too small to [0, 0].

13 var fragment_shader = "
14 precision highp float;    
15 const int STEPS =10;     
16 uniform sampler2D uVelocityTexture;  
17 uniform sampler2D uTexture;   
18 varying vec2 vTexCoord;    
19 vec2 Vel(vec2 p){    
20 vec2 vel = texture2D (uVelocityTexture, p).xy; 
21  vel = vel*2.0 - 1.0;
22 return vel;        
23}           
24 void main(void)        
25 {         
26 vec2 vel = Vel(vTexCoord);    
27 vec4 accum_color = vec4(0.0, 0.0, 0.0, 0.0);
28          
29 float l = length(vel);     
30 if (l < 4.0/255.0) vel=vec2(0.0, 0.0);  
31 vec2 delta = -vel/vec2(STEPS);    
32 int steps_done = 0;      
33 accum_color = texture2D(uTexture, vTexCoord);
34 for (int i = 1 ; i <= STEPS ; ++i)    
35   {         
36   vec2 p = vTexCoord + float(i)*delta;   
37  if((p.x <1.0) && (p.x > 0.0)    
38   && (p.y <1.0) && (p.y >0.0)){  
39   steps_done++;       
40   accum_color += texture2D(uTexture , p);   
41 };           
42 }           
43 accum_color /= float(steps_done+1);      
44 gl_FragColor = vec4(accum_color.xyz ,1.0);    
45 }            

LISTING 10.8: Shader program for the final rendering of the panning effect. (Code snippet from http://envymycarbook.com/chapter10/2/shaders.js.)

Figure 10.13 shows the panning effect in action in the client.

Figure 10.13

A screenshot of the motion blur client. (See client http://envymycarbook.com/chapter10/2/2.html.)

10.1.6 Sharpen

There are many image enhancement techniques to increase the sharpness of an image. Here, we describe unsharp masking, one of the most used, which improves the visual perception of the image details by extracting them and re-adding them to the original image. Originally, this technique was developed in the analog domain by professional photographers. For a complete and interesting description of the original photographic technique we refer the reader to [22].

The extraction of the image details is based on the computation of a smooth/blurred version of I; we call this image Ismooth. The idea is that a blurred/smooth image contains fewer high- and medium-frequency details than the original image. So, the details can be computed by simply subtracting Ismooth from I; the image I − Ismooth represents the details of the input image. The amount and the granularity of the details obtained depend on how the image Ismooth is computed (the kernel size and type, for example whether a box filter or a Gaussian filter is used).

The details thus extracted are re-added to the original image so that they are emphasized, and hence our visual system perceives them more clearly than in the original image. Mathematically, this can be achieved in the following way:

$$I_{unsharp}(x,y)=I(x,y)+\lambda\,\bigl(I(x,y)-I_{smooth}(x,y)\bigr)\tag{10.18}$$

where Iunsharp is the output image with increased sharpness. The parameter λ is used to tune the amount of detail re-added. High values of λ may exacerbate the details too much, resulting in an unrealistic look for the image, while low values of λ may produce modifications that are not perceivable. The choice of this parameter depends on the content of the image and on the effect that we want to achieve. Figure 10.14 shows an example of detail enhancement using unsharp masking.

Figure 10.14

(Left) Original image. (Right) Image after unsharp masking. The Ismooth image is the one depicted in Figure 10.5; λ is set to 0.6.
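Unsharp masking maps naturally to a fragment shader operating on the fullscreen quad: given the original rendering and a blurred copy of it (produced, for instance, with the Gaussian filter of Section 10.1.1 in a previous pass), the details are extracted and re-added as in Equation (10.18). The following is only a sketch, with uniform names of our own choosing.

// Unsharp masking sketch (GLSL ES): uTexture holds the original image,
// uSmoothTexture a blurred version of it computed in a previous pass.
precision highp float;
uniform sampler2D uTexture;         // original image I
uniform sampler2D uSmoothTexture;   // blurred image I_smooth
uniform float uLambda;              // amount of detail re-added (e.g., 0.6)
varying vec2 vTexCoord;

void main(void) {
  vec4 color   = texture2D(uTexture, vTexCoord);
  vec4 blurred = texture2D(uSmoothTexture, vTexCoord);
  // I_unsharp = I + lambda * (I - I_smooth), Equation (10.18)
  gl_FragColor = vec4(color.rgb + uLambda * (color.rgb - blurred.rgb), color.a);
}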

10.2 Ambient Occlusion

The ambient occlusion technique is a real-time rendering solution to improve the realism of a local illumination model by taking into account the total amount of light received by a point p of a surface.

As we have seen and discussed in Chapter 8, a certain part of a scene can receive no lighting, or less lighting than another part, due to shadows produced by occluders. The geometry of a 3D model can also generate self-shadowing effects, causing a point p to receive less light than other surface points (see Figure 10.15).

Figure 10.15

Occlusion examples. (Left) The point p receives only certain rays of light because it is self-occluded by its surface. (Right) The point p receives few rays of light because it is occluded by the occluders O.

The idea of ambient occlusion is to consider how the light coming from all directions may be blocked by some occluder or by the neighborhood of p on the same surface. We may think of it as a smarter version of the ambient coefficient of the Phong model: instead of assuming that “some light” will reach every point because of global effects, we evaluate the neighborhood of p to see how much of that light can actually reach p.

Ambient occlusion is implemented by calculating the fraction of the total amount of light that may possibly arrive at a point p of the surface, and using this quantity, called ambient occlusion term (A), in a local illumination model to improve the realism of the overall shading. The term A is computed in the following way:

$$A(p)=\frac{1}{2\pi}\int_{\Omega}V(p,\omega)\,(n_p\cdot\omega)\,d\omega\tag{10.19}$$

where np is the normal at point p and V(.) is a function, called the visibility function, which has value 1 if the ray originating from p in the direction ω is occluded and 0 otherwise. Since the computation of (10.19) is computationally very expensive, the term A is usually pre-computed and stored for each vertex or texel of the scene, assuming the scene itself is static. The integration is carried out by considering a set of directions on the hemisphere and summing up all the contributions. Obviously, the more directions are considered, the more accurate the value of A. The ambient occlusion term goes from 0, which means that no light is received by the point, to 1, when the area surrounding p is completely free of occluders.

Typically, the ambient occlusion term is used to modulate the ambient component of the Phong illumination model in the following way:

$$L_{outgoing}=A\,L_{ambient}+K_D\,L_{diffuse}+K_S\,L_{specular}\tag{10.20}$$

This local illumination model is able to produce darker parts of the scene where the geometry of the objects causes little lighting to be received. Figure 10.16 shows an example of a 3D model rendered with the standard Phong illumination model and with the per-vertex ambient occlusion term only. Note how using only the ambient occlusion term may greatly increase the perception of the details of the scene (this has been demonstrated by experiments conducted by Langer et al. [21]).

Figure 10.16

Effect of ambient occlusion. (Left) Phong model. (Right) Ambient occlusion term only. The ambient occlusion term has been calculated with MeshLab (http://meshlab.sourceforge.net/). The 3D model is a simplified version of a scanned model of a capital. (Courtesy of the Kunsthistorisches Institut in Florenz http://www.khi.fi.it/.)

The ambient occlusion term can also be used in other ways and in other illumination models, for example by multiplying a purely diffusive model to make the parts exposed to the light brighter than the ones the light has difficulty reaching:

$$L_{reflected}=A\,(L\cdot N)\tag{10.21}$$

We would like to underline that this technique has no cost during the rendering phase, since everything is pre-computed. This is why we describe it as a real-time rendering solution.
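As an illustration, assuming the ambient occlusion term has been precomputed per vertex and passed to the shaders as an extra attribute, using it amounts to little more than a one-line change in the fragment shader implementing Equation (10.20); the attribute, uniform and varying names below are our own.

// Fragment-shader sketch of Equation (10.20): the precomputed ambient
// occlusion term A (interpolated from the vertices) scales the ambient
// component only; the other terms are assumed to be computed elsewhere.
precision highp float;
uniform vec3 uLAmbient;    // ambient light contribution
varying float vAO;         // ambient occlusion term A, interpolated per vertex
varying vec3 vDiffuse;     // K_D * L_diffuse, computed in the vertex shader
varying vec3 vSpecular;    // K_S * L_specular, computed in the vertex shader

void main(void) {
  vec3 outgoing = vAO * uLAmbient + vDiffuse + vSpecular;
  gl_FragColor = vec4(outgoing, 1.0);
}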

10.2.1 Screen-Space Ambient Occlusion (SSAO)

The ambient occlusion technique just described has several limitations. One of the most important is that it cannot be applied to dynamic scenes: if an object changes its position in the scene or is deformed by an animation, the pre-calculated occlusion term is no longer valid and needs to be recalculated. Stated another way, the ambient occlusion previously described can only be used to visualize a static scene. Another limitation is that the pre-computation can take a very long time if the scene is complex and a high number of directions is used to compute the integral (10.19).

Here, we show an alternative way to obtain a visual effect similar to ambient occlusion but with several advantages: screen-space ambient occlusion (SSAO). The idea is to compute the ambient occlusion term for each pixel at rendering time instead of pre-computing A for each vertex. This way of proceeding has two main advantages: since the term is computed in screen space, its computational cost depends on the screen resolution and not on the complexity of the scene, and, since nothing is pre-computed, it can also be applied to dynamic scenes.

Many ideas have been proposed in the last few years to compute ambient occlusion in screen space efficiently [1]. The technique we are going to describe is a simplified version of the SSAO technique proposed by Bavoil et al. [3] and later improved by Dimitrov et al. [7]. This technique is based on the concept of horizon angle.

Referring to Figure 10.17 (Top-Right), let p be a point on the surface S and np be the normal at p. Now let plθ be the plane passing through the z axis and forming an angle θ with the xy plane. The intersection of S with this plane produces the section Sθ shown in Figure 10.17 (Bottom). With this notation, we can rewrite Equation (10.19) as:

$$A(p)=\frac{1}{2\pi}\int_{\theta=-\pi}^{\theta=\pi}\overbrace{\int_{\alpha=0}^{\pi/2}V(p,\omega(\theta,\alpha))\,W(\theta)\,d\alpha}^{\text{contribution of section }S_\theta}\,d\theta\tag{10.22}$$

Figure 10.17

The horizon angle h(θ) and the tangent angle t(θ) in a specific direction θ.

In the following we will concentrate on the contribution of the inner integral. Let us build a tangent frame at p made by nθ and the tangent vector, which we call xθ. We want to find the range of elevation angles α for which the ray leaving p intersects Sθ, that is, the values of α for which V(p, ω(θ, α)) = 1. This range is shown as a darker area in Figure 10.17 (Bottom).

Suppose we know this horizon angle and let us call it Hz. Then we can rewrite Equation (10.22) as:

$$A(p)=\frac{1}{2\pi}\int_{\theta=-\pi}^{\theta=\pi}\int_{\alpha=0}^{H_z}\cos\alpha\;W(\theta)\,d\alpha\,d\theta\tag{10.23}$$

because the contribution of the inner integral is 0 for α > Hz. Note that we also replaced np · ω(θ, α) with a generic weighting function W(θ) (which we will specify later on) that does not depend on α and so can be taken out of the inner integral.

Now the interesting part. Hz is a value expressed in the tangent frame, but our representation of the surface S is the depth buffer, which means we have z values expressed in the frame made by x′ and z. So we find Hz as the difference of two angles that we can compute by sampling the depth buffer: h(θ) and t(θ). h(θ) is the horizon angle over the x′ axis and t(θ) is the angle formed by the tangent vector xθ and x′. You can easily see that Hz = h(θ) − t(θ) and hence Equation (10.23) becomes:

$$A(p)=\frac{1}{2\pi}\int_{\theta=-\pi}^{\pi}\bigl(\sin(h(\theta))-\sin(t(\theta))\bigr)\,W(\theta)\,d\theta\tag{10.24}$$

Given a point p, the knowledge of the horizon angles in several directions allows us to approximately estimate the region of the hemisphere where the rays are not self-occluded. The greater this region, the greater the value of the ambient occlusion term.

Equation (10.24) can be easily calculated at rendering time with a two-pass algorithm. In the first pass the depth map is generated, as in the depth-of-field client (see Section 10.1.2), and used during the second pass to determine the angles h(θ) and t(θ) for each pixel. Obviously, Equation (10.24) is evaluated only for a discrete number Nd of directions (θ0, θ1, ..., θNd−1):

$$A(p)=\frac{1}{2\pi}\sum_{i=0}^{N_d-1}\bigl(\sin(h(\theta_i))-\sin(t(\theta_i))\bigr)\,W(\theta_i)\tag{10.25}$$

where W(θ) is a linear attenuation function depending on the distance r at which the horizon angle is found. In its original formulation it is set to W(θ) = 1 − r/R. Just 16 directions can provide a good approximation of the real ambient occlusion term. The horizon angle is calculated by walking on the depth map in the specified direction θ and keeping the maximum angle found. The walk proceeds within a certain radius of interest R and not over the whole depth map. The tangent angle t(θ) is easily determined from the normal at the pixel (see Appendix B).
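The following is only a structural sketch of the second pass, with names of our own choosing and two simplifications: the tangent angle t(θ) is assumed to be 0 (a view-facing tangent plane), and the depth is assumed to be stored in a single channel, whereas the client of Section 10.1.2 packs it over the RGBA channels. The uniforms uA and uB play the same role as in the depth-of-field client, while uR and uStepTex (the radius of interest and the corresponding per-sample step in texture coordinates) are assumed to be computed on the JavaScript side.

// Simplified SSAO second pass (GLSL ES sketch): for each direction theta we
// walk the depth map, keep the maximum elevation found (the horizon) and
// accumulate (sin h - sin t) * W, with t = 0 and W = 1 - r/R.
precision highp float;
const int ND = 8;                     // number of directions theta_i
const int STEPS = 8;                  // samples taken along each direction
uniform sampler2D uDepthTexture;      // depth of the first pass, in [0,1]
uniform float uA, uB;                 // entries of the projection matrix
uniform float uR;                     // radius of interest (view-space units)
uniform vec2 uStepTex;                // one walking step, in texture coordinates
varying vec2 vTexCoord;

float ViewZ(vec2 uv) {                // invert the [0,1] -> view-space mapping
  float zNDC = texture2D(uDepthTexture, uv).x * 2.0 - 1.0;
  return -uB / (zNDC - uA);
}

void main(void) {
  float zp = ViewZ(vTexCoord);
  float occlusion = 0.0;
  for (int d = 0; d < ND; ++d) {
    float theta = 6.2831853 * float(d) / float(ND);
    vec2 dir = vec2(cos(theta), sin(theta));
    float sinH = 0.0;                 // sin of the horizon angle found so far
    float w = 0.0;                    // attenuation W of the current horizon
    for (int s = 1; s <= STEPS; ++s) {
      float r = uR * float(s) / float(STEPS);          // distance walked so far
      float dz = ViewZ(vTexCoord + dir * uStepTex * float(s)) - zp;
      float sinA = dz / sqrt(dz * dz + r * r);         // elevation of the sample
      if (sinA > sinH) { sinH = sinA; w = 1.0 - r / uR; }
    }
    occlusion += sinH * w;            // (sin h(theta) - sin t(theta)) W, t = 0
  }
  // remap so that 1.0 means completely unoccluded, as in Section 10.2
  gl_FragColor = vec4(vec3(1.0 - occlusion / float(ND)), 1.0);
}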

10.3 Deferred Shading

In Section 5.2.3 we have seen how depth buffering solves the hidden surface removal problem in a simple and sound manner. However, at this point of the book, we have also seen how much is going on besides the pure rasterization. Depending on the specific shader programs, lighting and texture accesses can make producing a fragment computationally expensive. This means that if the depth complexity is high, that is, if many surfaces at different depths project onto the same pixels of the screen, then a lot of computation is wasted.

The idea of deferred shading is to separate the work for finding out the visible fragments from the work for computing their final color. In a first pass, or geometry pass, the scene is only rasterized without making any shading computation. Instead we output, on several buffers, the interpolated values for the fragment (such as position, normal, color, texture coordinates, etc.) that are needed to compute the final color.

In the second pass, we render a fullscreen quad and bind this set of buffers, usually referred to as the G-buffer, so that for each pixel on the screen we may access all the values written in the first pass and perform the shading.

As noted before, since we do not have MRT in WebGL, the first pass will actually consist of at least two renderings: one to store depth and normals and one to store the color attributes.

Besides handling depth complexity, another major advantage claimed for deferred shading is that it can easily handle multiple lights. However, we already implemented multiple lights in Section 6.7.4 simply by iterating over all the light sources in the fragment shader and composing their contributions in the final result, so you may wonder why it would be better with deferred shading. The answer is that with deferred shading you can easily combine several shaders, for example one per light, eliminate iteration and branching in the fragment shader, and have a cleaner pipeline. It may not seem like much in a basic example, but it makes a world of difference in bigger projects.
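As a sketch of the second pass, the fragment shader run on the fullscreen quad could look like the following, where the G-buffer layout (one texture for view-space positions, one for normals, one for the diffuse color), the simple light model and all the names are of our own choosing; storing positions this way also assumes a float texture or a suitable packing.

// Deferred shading, second pass (GLSL ES sketch): the G-buffer textures are
// sampled to recover position, normal and albedo, then the contribution of
// a small set of point lights is accumulated with a plain diffuse model.
precision highp float;
const int NLIGHTS = 4;
uniform sampler2D uPositionTexture;  // view-space positions (first pass)
uniform sampler2D uNormalTexture;    // normals remapped to [0,1] (first pass)
uniform sampler2D uAlbedoTexture;    // diffuse color (first pass)
uniform vec3 uLightPos[NLIGHTS];     // view-space light positions
uniform vec3 uLightColor[NLIGHTS];
varying vec2 vTexCoord;

void main(void) {
  vec3 pos    = texture2D(uPositionTexture, vTexCoord).xyz;
  vec3 normal = normalize(texture2D(uNormalTexture, vTexCoord).xyz * 2.0 - 1.0);
  vec3 albedo = texture2D(uAlbedoTexture, vTexCoord).rgb;
  vec3 color = vec3(0.0);
  for (int i = 0; i < NLIGHTS; ++i) {
    vec3 L = normalize(uLightPos[i] - pos);
    color += albedo * uLightColor[i] * max(dot(normal, L), 0.0);
  }
  gl_FragColor = vec4(color, 1.0);
}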

There are downsides too. Hardware antialiasing, which we have seen in Section 5.3.3, is done at rasterization time on the color only and not on the result of shading, so it will simply be wrong. This problem may be alleviated by detecting edges on the image produced in the first pass and blurring them in post-processing.

10.4 Particle Systems

With the term particle system we refer to an animation technique that consists of using a large population of particles, which we can picture as zero-dimensional or very small entities, that move in space according to either a predefined scripted behavior or a physical simulation, creating the illusion of a moving entity without a fixed shape. A wide range of phenomena can be effectively represented by particle systems: smoke, fire, explosions, rain, snow and water, to mention the most common. How a particle is rendered depends on the phenomenon being represented: for example, a small colored circle for fire or a small line segment for rain.

10.4.1 Animating a Particle System

The animation of a particle system is done by defining the state of the system and the set of functions that make it progress over time, both of which depend on the particular visual effect to achieve.

Typically, the dynamic state of a particle consists of its acceleration, velocity and position. For example, for a particle i we may have $x_i(t)=(a_i(t),v_i(t),p_i(t))$. The evolution of this set of particles can be written as:

$$x_i(t+1)=\begin{pmatrix}p_i(t+1)\\ v_i(t+1)\\ a_i(t+1)\end{pmatrix}=\begin{pmatrix}f(t,a_i(t),v_i(t),p_i(t))\\ g(t,a_i(t),v_i(t),p_i(t))\\ h(t,a_i(t),v_i(t),p_i(t))\end{pmatrix}=\begin{pmatrix}f(t,x_i(t))\\ g(t,x_i(t))\\ h(t,x_i(t))\end{pmatrix}\tag{10.26}$$

where the functions f(.), g(.) and h(.) provide the position, velocity and acceleration of the particle at the next time step, given its current acceleration, velocity and position. These functions are basically of two types: physically-based, attempting to simulate the physical behavior of the phenomenon, or scripted, to provide the same visual impression of the phenomenon without any connection with the real physics behind it.
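For instance, the simplest physically-based choice (constant gravity) can even be solved in closed form and evaluated directly in a vertex shader, as in the following sketch, where each particle is drawn as a point; all the names are of our own choosing, and in our client such an update could just as well be performed in JavaScript.

// Scripted/ballistic particle animation evaluated in the vertex shader:
// the position at time uTime is computed in closed form from the initial
// position and velocity under constant gravity, one possible realization
// of the evolution described by Equation (10.26).
uniform mat4 uProjectionMatrix;
uniform mat4 uModelViewMatrix;
uniform float uTime;                  // seconds since the particle was emitted
attribute vec3 aPosition0;            // initial position (set by the emitter)
attribute vec3 aVelocity0;            // initial velocity (set by the emitter)

void main(void) {
  const vec3 g = vec3(0.0, -9.81, 0.0);                 // constant gravity
  vec3 p = aPosition0 + aVelocity0 * uTime + 0.5 * g * uTime * uTime;
  gl_Position = uProjectionMatrix * uModelViewMatrix * vec4(p, 1.0);
  gl_PointSize = 4.0;                                    // render as small points
}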

The state of the particle can also be characterized by many other parameters: for example, the color of the particle can evolve as a function of time or position, its shape as a function of the acceleration, and so on.

Moreover, the animation of a particle can also be a function of the properties of other particles, for example:

$$\begin{cases}p_i(t+1)=f\bigl(t,a_i(t),v_i(t),p_i^1(t),p_i^2(t),\ldots,p_i^k(t)\bigr)\\ v_i(t+1)=g\bigl(t,a_i(t),v_i(t),p_i^1(t),p_i^2(t),\ldots,p_i^k(t)\bigr)\\ a_i(t+1)=h\bigl(t,a_i(t),v_i(t),p_i^1(t),p_i^2(t),\ldots,p_i^k(t)\bigr)\end{cases}\tag{10.27}$$

In this case the i-th particle is also influenced by the positions of the nearest k particles, indicated with $p_i^1(t), p_i^2(t), \ldots, p_i^k(t)$.

The set of particles in a particle system is not fixed. Each particle is created by an emitter and inserted in the system with an initial state; then its state is updated for a certain amount of time and finally it is removed. The lifespan of a particle is not always strictly dependent on time. For example, when implementing rain, the particles may be created on a plane above the scene and then removed when they hit the ground. Another example is fireworks: particles are all created at the origin of the fire (the launcher of the fireworks) with an initial velocity and removed from the system somewhere along their descending parabola. The creation of particles should be randomized to avoid visible patterns that would jeopardize the final effect.

10.4.2 Rendering a Particle System

The rendering of a particle system also depends on the phenomenon. Often each particle is rendered as a small plane-aligned billboard, which makes sense because there is no parallax to see in a single particle, but we can also have simpler representations such as points or segments. For dense participating media such as smoke, blending will be enabled and set to accumulate the value of the alpha channel, that is, the more particles project onto the same pixel, the more opaque the result.

10.5 Self-Exercises

10.5.1 General

  1. Imagine that generateMipmap is suddenly removed from the WebGL specification! How can we create the mipmap levels of a given texture entirely on the GPU (that is, without readbacks)?
  2. Suppose we iterate the application of a blurring filter with kernel size 5 on an image of 800 × 600 pixels. How many times should we apply the filter for the color of the pixel at position (20, 20) to be influenced by the color at pixel (100, 100)?
  3. Change the Gaussian filter (Equation (10.4)) so that horizontal neighbors of the pixel are weighted more than vertical neighbors.
  4. Suppose the objects of the scene were tagged as convex and non-convex. How could we take advantage of this information to speed up the computation of the ambient occlusion term?
  5. Elaborate on this statement: “Ambient occlusion is none other than the implementation of an all-around light camera for shadow mapping.”

10.5.2 Client Related

  1. Because of how we implemented it, the rendering of the skybox does not write to the depth buffer. Still, the client implemented in Section 10.1.2 blurs it correctly. How so?
  2. Change the client of Section 10.1.2 so that it does not apply the blurring filter for the fragments of the skybox but still shows the skybox blurred when out of the depth of field.
  3. Make a view mode that loses the focus away from the center of the image.
  4. Improve the toon shading client by also running the edge detection on:
    1. The depth buffer
    2. The normal buffer. Hint: You have to pack the normals as we did for the depth buffer.
  5. Improve the toon shading client by also making the black edges bold. Hint: Add a rendering pass in order to expand all the strong edge pixels by one pixel in every direction.
  6. Improve the implementation of the lens flares effect (see Section 9.2.4). Hint: Use the fullscreen quad to avoid a rendering pass.
  7. Using only the normal map of the street of Section 7.8.3, create an ambient occlusion map, that is, a texture where each texel stores the ambient occlusion term. Hint: If the dot product of the normal at texel x, y and every one of the normals on the neighbor texels is negative we can put 1 as an ambient occlusion term (that is, not occluded at all).