4. Transforms (9/10)

4.6. Projections 93

Figure 4.18. The notation used for deriving a perspective projection matrix. The point $\mathbf{p}$ is projected onto the plane $z = -d$, $d > 0$, which yields the projected point $\mathbf{q}$. The projection is performed from the perspective of the camera's location, which in this case is the origin. The similar triangle used in the derivation is shown for the x-component at the right.

more closely matches how we perceive the world, i.e., objects further away

are smaller.

First, we shall present an instructive derivation for a perspective projection matrix that projects onto a plane $z = -d$, $d > 0$. We derive from world space to simplify understanding of how the world-to-view conversion proceeds. This derivation is followed by the more conventional matrices used in, for example, OpenGL [970].

Assume that the camera (viewpoint) is located at the origin, and that we want to project a point, $\mathbf{p}$, onto the plane $z = -d$, $d > 0$, yielding a new point $\mathbf{q} = (q_x, q_y, -d)$. This scenario is depicted in Figure 4.18. From the similar triangles shown in this figure, the following derivation, for the x-component of $\mathbf{q}$, is obtained:

\[
\frac{q_x}{p_x} = \frac{-d}{p_z} \iff q_x = -d\,\frac{p_x}{p_z}. \tag{4.64}
\]

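The expressions for the other components of $\mathbf{q}$ are $q_y = -d\,p_y/p_z$ (obtained similarly to $q_x$), and $q_z = -d$. Together with the above formula, these give us the perspective projection matrix, $\mathbf{P}_p$, as shown here:

\[
\mathbf{P}_p =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & -1/d & 0
\end{pmatrix}. \tag{4.65}
\]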

Figure 4.19. The matrix $\mathbf{P}_p$ transforms the view frustum into the unit cube, which is called the canonical view volume.

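That this matrix yields the correct perspective projection is confirmed by the simple verification of Equation 4.66:

\[
\mathbf{q} = \mathbf{P}_p \mathbf{p} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & -1/d & 0
\end{pmatrix}
\begin{pmatrix} p_x \\ p_y \\ p_z \\ 1 \end{pmatrix}
=
\begin{pmatrix} p_x \\ p_y \\ p_z \\ -p_z/d \end{pmatrix}
\Rightarrow
\begin{pmatrix} -d\,p_x/p_z \\ -d\,p_y/p_z \\ -d \\ 1 \end{pmatrix}. \tag{4.66}
\]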

The last step comes from the fact that the whole vector is divided by the w-component (in this case $-p_z/d$), in order to get a 1 in the last position. The resulting z value is always $-d$ since we are projecting onto this plane.

Intuitively, it is easy to understand why homogeneous coordinates allow for projection. One geometrical interpretation of the homogenization process is that it projects the point $(p_x, p_y, p_z)$ onto the plane $w = 1$.
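
The verification in Equation 4.66 can also be carried out numerically. The following is a minimal sketch in plain Python, not code from the text: the function name `project_onto_plane` and the sample point are illustrative assumptions. It multiplies a homogeneous point by the matrix of Equation 4.65 and then divides by the w-component.

```python
def project_onto_plane(p, d):
    """Project the 3D point p onto the plane z = -d (d > 0) from the origin.

    Hypothetical helper for illustration; it applies the matrix of
    Equation 4.65 and then homogenizes (divides by w).
    """
    px, py, pz = p
    # Rows of the perspective projection matrix from Equation 4.65.
    P = [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, -1.0 / d, 0.0],
    ]
    ph = [px, py, pz, 1.0]  # homogeneous form of p
    q = [sum(P[i][j] * ph[j] for j in range(4)) for i in range(4)]
    w = q[3]  # equals -pz / d
    if w == 0.0:
        raise ValueError("a point with pz = 0 cannot be projected")
    return [c / w for c in q]


# A sample point 8 units down the negative z-axis, projected onto z = -2.
q = project_onto_plane((2.0, 4.0, -8.0), d=2.0)
print(q)  # [0.5, 1.0, -2.0, 1.0]
```

The z-component of the result is $-d$ regardless of the input point, and the x- and y-components agree with $-d\,p_x/p_z$ and $-d\,p_y/p_z$.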

As with the orthographic transformation, there is also a perspective transform that, rather than actually projecting onto a plane (which is noninvertible), transforms the view frustum into the canonical view volume described previously. Here the view frustum is assumed to start at $z = n$ and end at $z = f$, with $0 > n > f$. The rectangle at $z = n$ has the minimum corner at $(l, b, n)$ and the maximum corner at $(r, t, n)$. This is shown in Figure 4.19.

The parameters $(l, r, b, t, n, f)$ determine the view frustum of the camera. The horizontal field of view is determined by the angle between the left and the right planes (determined by $l$ and $r$) of the frustum. In the same manner, the vertical field of view is determined by the angle between the top and the bottom planes (determined by $t$ and $b$). The greater the field of view, the more the camera "sees." Asymmetric frustums can be created by $r \neq -l$ or $t \neq -b$. Asymmetric frustums are, for example, used for stereo viewing (see Section 18.1.4) and in CAVEs [210].
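
To make the relation between the frustum parameters and the field of view concrete, here is a small sketch in plain Python. It is an illustration under stated assumptions, not a formula given in the text: the function name `frustum_fov_degrees` is hypothetical, and the angles are computed from the geometry of the near rectangle, whose side planes pass through the origin and through $x = l$, $x = r$ (or $y = b$, $y = t$) at distance $|n|$.

```python
import math


def frustum_fov_degrees(l, r, b, t, n):
    """Return (horizontal, vertical) field of view in degrees.

    Hypothetical helper: l, r, b, t bound the near rectangle at z = n,
    and the camera sits at the origin. Only |n| matters for the angles.
    """
    dist = abs(n)
    if dist == 0.0:
        raise ValueError("the near plane must not pass through the origin")
    # Angle between the left and right planes, then bottom and top planes.
    horizontal = math.degrees(math.atan(r / dist) - math.atan(l / dist))
    vertical = math.degrees(math.atan(t / dist) - math.atan(b / dist))
    return horizontal, vertical


# Symmetric frustum: r = -l and t = -b.
symmetric = frustum_fov_degrees(-1.0, 1.0, -1.0, 1.0, -1.0)
# Asymmetric frustum: r != -l, as might be used for one eye in stereo viewing.
asymmetric = frustum_fov_degrees(-0.5, 1.0, -1.0, 1.0, -1.0)
print(symmetric, asymmetric)
```

In the asymmetric case the view direction no longer bisects the horizontal angle, which is what distinguishes it from simply narrowing a symmetric frustum.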

The field of view is an important factor in providing a sense of the scene. The eye itself has a physical field of view compared to the computer screen. This relationship is

\[
\phi = 2 \arctan\!\left(\frac{w}{2d}\right), \tag{4.67}
\]

where $\phi$ is the field of view, $w$ is the width of the object perpendicular to the line of sight, and $d$ is the distance to the object. For example, a 21-inch monitor is about 16 inches wide, and 25 inches is a minimum recommended viewing distance [27], which yields a physical field of view of 35 degrees. At 12 inches away, the field of view is 67 degrees; at 18 inches, it is 48 degrees; at 30 inches, 30 degrees. This same formula can be used to convert from camera lens size to field of view, e.g., a standard 50mm lens for a 35mm camera (which has a 36mm wide frame size) gives $\phi = 2 \arctan(36/(2 \cdot 50)) = 39.6$ degrees.
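
The figures quoted above follow directly from Equation 4.67. As a quick check, here is a short sketch in plain Python; the function name `field_of_view_degrees` is an illustrative assumption, while the widths and distances are the ones used in the text.

```python
import math


def field_of_view_degrees(width, distance):
    """Equation 4.67: phi = 2 * arctan(w / (2 * d)), returned in degrees."""
    return math.degrees(2.0 * math.atan(width / (2.0 * distance)))


# A 16-inch-wide monitor viewed from several distances (in inches).
monitor = {d: field_of_view_degrees(16.0, d) for d in (12.0, 18.0, 25.0, 30.0)}
for distance, fov in monitor.items():
    print(f"{distance:4.0f} in: {fov:5.1f} degrees")

# A 50mm lens on a camera with a 36mm-wide frame.
lens = field_of_view_degrees(36.0, 50.0)
print(f"50mm lens: {lens:.1f} degrees")
```

Rounded to whole degrees, the monitor values are 67, 48, 35, and 30, and the lens value is 39.6 degrees, matching the numbers in the text.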

Using a narrower field of view compared to the physical setup will lessen the perspective effect, as the viewer will be zoomed in on the scene. Setting a wider field of view will make objects appear distorted (like using a wide angle camera lens), especially near the screen's edges, and will exaggerate the scale of nearby objects. However, a wider field of view gives the viewer a sense that objects are larger and more impressive, and has the advantage of giving the user more information about the surroundings.

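The perspective transform matrix that transforms the frustum into a unit cube is given by Equation 4.68:

\[
\mathbf{P}_p =
\begin{pmatrix}
\dfrac{2n}{r - l} & 0 & -\dfrac{r + l}{r - l} & 0 \\[2ex]
0 & \dfrac{2n}{t - b} & -\dfrac{t + b}{t - b} & 0 \\[2ex]
0 & 0 & \dfrac{f + n}{f - n} & -\dfrac{2fn}{f - n} \\[2ex]
0 & 0 & 1 & 0
\end{pmatrix}. \tag{4.68}
\]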

After applying this transform to a point, we will get another point $\mathbf{q} = (q_x, q_y, q_z, q_w)^T$. The w-component, $q_w$, of this point will (most often) be nonzero and not equal to one. To get the projected point, $\mathbf{p}$, we need to divide by $q_w$: $\mathbf{p} = (q_x/q_w, q_y/q_w, q_z/q_w, 1)^T$. The matrix $\mathbf{P}_p$ always sees to it that $z = f$ maps to $+1$ and $z = n$ maps to $-1$. After the perspective transform is performed, clipping and homogenization (division by $w$) is done to obtain the normalized device coordinates.
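
The mapping of the frustum corners to the unit cube can be checked numerically. The sketch below is plain Python and rests on assumptions: it uses the standard form of Equation 4.68, with $2n/(r-l)$ and $2n/(t-b)$ on the diagonal, and the function names and sample frustum values are illustrative rather than taken from the text.

```python
def perspective_matrix(l, r, b, t, n, f):
    """Frustum-to-unit-cube matrix in the convention 0 > n > f (Equation 4.68)."""
    return [
        [2.0 * n / (r - l), 0.0, -(r + l) / (r - l), 0.0],
        [0.0, 2.0 * n / (t - b), -(t + b) / (t - b), 0.0],
        [0.0, 0.0, (f + n) / (f - n), -2.0 * f * n / (f - n)],
        [0.0, 0.0, 1.0, 0.0],
    ]


def transform_point(m, p):
    """Apply m to the 3D point p and homogenize (divide by the w-component)."""
    ph = [p[0], p[1], p[2], 1.0]
    q = [sum(m[i][j] * ph[j] for j in range(4)) for i in range(4)]
    if q[3] == 0.0:
        raise ValueError("w-component is zero; the point cannot be homogenized")
    return [c / q[3] for c in q[:3]]


# Sample frustum: near rectangle [-1, 1] x [-1, 1] at z = -1, far plane at z = -3.
m = perspective_matrix(-1.0, 1.0, -1.0, 1.0, -1.0, -3.0)
near_min = transform_point(m, (-1.0, -1.0, -1.0))  # corner (l, b, n)
near_max = transform_point(m, (1.0, 1.0, -1.0))    # corner (r, t, n)
far_center = transform_point(m, (0.0, 0.0, -3.0))  # point on the far plane
print(near_min, near_max, far_center)
```

For this frustum the near corners land at $(-1, -1, -1)$ and $(1, 1, -1)$, and the point on the far plane lands at $z = +1$.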

Footnote: The far plane can also be set to infinity. See Equation 9.8 on page 345 for this form.

To get the perspective transform used in OpenGL, first multiply with $\mathbf{S}(1, 1, -1)$, for the same reasons as for the orthographic transform. This simply negates the values in the third column of Equation 4.68. After this mirroring transform has been applied, the near and far values are entered as positive values, with $0 < n' < f'$, as they would traditionally be presented to the user. However, they still represent distances along the world's negative z-axis, which is the direction of view. For reference purposes, here is the OpenGL equation:

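\[
\mathbf{P}_{\mathrm{OpenGL}} =
\begin{pmatrix}
\dfrac{2n'}{r - l} & 0 & \dfrac{r + l}{r - l} & 0 \\[2ex]
0 & \dfrac{2n'}{t - b} & \dfrac{t + b}{t - b} & 0 \\[2ex]
0 & 0 & -\dfrac{f' + n'}{f' - n'} & -\dfrac{2f'n'}{f' - n'} \\[2ex]
0 & 0 & -1 & 0
\end{pmatrix}. \tag{4.69}
\]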

Some APIs (e.g., DirectX) map the near plane to $z = 0$ (instead of $z = -1$) and the far plane to $z = 1$. In addition, DirectX uses a left-handed coordinate system to define its projection matrix. This means DirectX looks along the positive z-axis and presents the near and far values as positive numbers. Here is the DirectX equation:

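\[
\mathbf{P}_{p[0,1]} =
\begin{pmatrix}
\dfrac{2n'}{r - l} & 0 & -\dfrac{r + l}{r - l} & 0 \\[2ex]
0 & \dfrac{2n'}{t - b} & -\dfrac{t + b}{t - b} & 0 \\[2ex]
0 & 0 & \dfrac{f'}{f' - n'} & -\dfrac{f'n'}{f' - n'} \\[2ex]
0 & 0 & 1 & 0
\end{pmatrix}. \tag{4.70}
\]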

DirectX uses row-major form in its documentation, so this matrix is normally presented in transposed form.
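
The $[0, 1]$ depth range can be confirmed with a few sample points. This is a sketch in plain Python under assumptions: it takes the DirectX-style matrix of Equation 4.70 in column-vector form with third-row entries $f'/(f'-n')$ and $-f'n'/(f'-n')$, and the function names and sample values are illustrative.

```python
def directx_perspective(l, r, b, t, n, f):
    """Left-handed perspective matrix mapping depth to [0, 1] (Equation 4.70).

    n and f are positive distances along the positive z-axis, 0 < n < f.
    Written for column vectors, i.e., not in the transposed (row-major) form.
    """
    return [
        [2.0 * n / (r - l), 0.0, -(r + l) / (r - l), 0.0],
        [0.0, 2.0 * n / (t - b), -(t + b) / (t - b), 0.0],
        [0.0, 0.0, f / (f - n), -f * n / (f - n)],
        [0.0, 0.0, 1.0, 0.0],
    ]


def depth_on_axis(m, z):
    """Normalized depth of the point (0, 0, z, 1) after division by w."""
    z_clip = m[2][2] * z + m[2][3]
    w_clip = m[3][2] * z + m[3][3]
    return z_clip / w_clip


m = directx_perspective(-1.0, 1.0, -1.0, 1.0, 10.0, 110.0)
near_depth = depth_on_axis(m, 10.0)   # point on the near plane
far_depth = depth_on_axis(m, 110.0)   # point on the far plane
mid_depth = depth_on_axis(m, 60.0)    # point halfway between the planes
print(near_depth, far_depth, mid_depth)
```

The near and far planes map to 0 and 1, while the halfway point maps to roughly 0.917 rather than 0.5, since depth is not linear in $z$ under a perspective transform.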

One effect of using a perspective transformation is that the computed depth value does not vary linearly with the input $p_z$ value. For example, if $n' = 10$ and $f' = 110$ (using the OpenGL terminology), when $p_z$ is 60 units down the negative z-axis (i.e., the halfway point) the normalized device coordinate depth value is 0.833, not 0. Figure 4.20 shows the effect of varying the distance of the near plane from the origin. Placement of the near and far planes affects the precision of the Z-buffer. This effect is discussed further in Section 18.1.2.

Footnote: So, to test that this really works in the z-direction, we can multiply $\mathbf{P}_{\mathrm{OpenGL}}$ with $(0, 0, -n', 1)^T$. After division by the w-component, the z-component of the resulting vector will be $-1$. If we instead use the vector $(0, 0, -f', 1)^T$, the z-component will be $+1$, as expected. A similar test can be done for $\mathbf{P}_{p[0,1]}$.
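
The test just described, and the 0.833 depth value quoted earlier, can be reproduced in a few lines. The following plain-Python sketch assumes the OpenGL-style matrix of Equation 4.69 with third-row entries $-(f'+n')/(f'-n')$ and $-2f'n'/(f'-n')$; the function names and the symmetric left, right, bottom, and top values are illustrative assumptions.

```python
def opengl_perspective(l, r, b, t, n, f):
    """OpenGL-style perspective matrix (Equation 4.69), with 0 < n < f."""
    return [
        [2.0 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2.0 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2.0 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def ndc_depth_on_axis(m, pz):
    """NDC depth of the point (0, 0, pz, 1): clip-space z divided by w."""
    z_clip = m[2][2] * pz + m[2][3]
    w_clip = m[3][2] * pz + m[3][3]
    return z_clip / w_clip


m = opengl_perspective(-1.0, 1.0, -1.0, 1.0, 10.0, 110.0)
near_depth = ndc_depth_on_axis(m, -10.0)   # (0, 0, -n', 1)
far_depth = ndc_depth_on_axis(m, -110.0)   # (0, 0, -f', 1)
mid_depth = ndc_depth_on_axis(m, -60.0)    # the halfway point
print(near_depth, far_depth, round(mid_depth, 3))
```

The near and far points map to $-1$ and $+1$, and the halfway point maps to about 0.833, so more than 90 percent of the depth range is spent on the nearer half of the frustum.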

Figure 4.20. The effect of varying the distance of the near plane from the origin. The distance $f' - n'$ is kept constant at 100. As the near plane becomes closer to the origin, points nearer the far plane use a smaller range of the normalized device coordinate depth space. This has the effect of making the Z-buffer less accurate at greater distances.
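
The trend in Figure 4.20 can be reproduced with a short calculation. The sketch below is plain Python and is an illustration only: it uses the closed form obtained by applying the OpenGL matrix to a point on the view axis and dividing by $w$, namely $z_{\mathrm{NDC}} = (f'+n')/(f'-n') - 2f'n'/((f'-n')\,s)$ for a point at distance $s$ along the view direction. The function names and the chosen near-plane distances are assumptions, not values from the figure.

```python
def ndc_depth(distance, near, far):
    """NDC depth in [-1, 1] of a point `distance` units along the view direction."""
    return (far + near) / (far - near) - (2.0 * far * near) / ((far - near) * distance)


def midpoint_depths(nears, span=100.0):
    """For each near distance, return the NDC depth halfway to the far plane.

    The far plane is placed at near + span, so f' - n' stays constant.
    """
    result = {}
    for near in nears:
        far = near + span
        result[near] = ndc_depth(near + span / 2.0, near, far)
    return result


depths = midpoint_depths([1.0, 10.0, 50.0, 100.0])
for near, depth in depths.items():
    print(f"near = {near:6.1f}: midpoint depth = {depth:.4f}")
```

With the near plane at 1, the midpoint of the frustum already maps to about 0.98, leaving only a sliver of the depth range for the entire far half; moving the near plane out to 100 brings the midpoint down to about 0.33.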
