Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 11

Introduction to Continuous-Time Stochastic Calculus

11.1 The Riemann Integral of Brownian Motion

11.1.1 The Riemann Integral

Let f be a real-valued function defined on [0, T]. We now recall the precise definition of the Riemann integral of f on [0, T] as follows.

For n ∈ ℕ, consider a partition Pn of the interval [0, T]:
$P_{n} = {t_{0}, t_{1}, ..., t_{n}}, 0 = t_{0} < t_{1} < \dots < t_{n} = T .$

Define Δti = ti − ti−1, i = 1, 2, . . . , n.
Introduce an intermediate partition Qn for the partition Pn:
$Q_{n} = {s_{1}, s_{2}, ..., s_{n}}, t_{i - 1} \leq s_{i} \leq t_{i}, i = 1, 2, ..., n .$
Define the Riemann (nth partial) sum as a weighted average of the values of f:
$S_{n} = S_{n} (f, P_{n}, Q_{n}) = \sum_{i = 1}^{n} f (s_{i}) Δ t_{i} .$
Suppose that the mesh size δ(Pn) ≔ max1≤i≤n Δti goes to zero as n → ∞. If the limit limn→∞ Sn exists and does not depend on the choice of partitions Pn and Qn, then this limit is called the Riemann integral of f on [0, T], denoted as usual by $\int_{0}^{T} f (t) d t .$ The function f is called the integrand. If the Riemann integral exists, then f is said to be Riemann integrable on [0, T]. For instance, if f is continuous on [0, T] (or the set of discontinuities of f is finite), then f is Riemann integrable on [0, T]. In fact, the Riemann integral exists if f is m–a.e. (i.e., Lebesgue almost everywhere) continuous on [0, T].

11.1.2 The Integral of a Brownian Path

Our goal is now to consider computing the Riemann integral of a Brownian sample path w.r.t. time t ∈ [0, T], i.e., for an outcome ω∈Ω:

$I (T, ω) ≔ \int_{0}^{T} W (t, ω) d t . (11.1)$

Recall that, with probability one (i.e., for almost all ω ∈ Ω), Brownian paths are continuous functions of time t. Hence, almost any sample path (t, W(t, ω)), 0 ≤ t ≤ T, is continuous. The Riemann integral (11.1) of such a sample path hence exists and is given by

$\int_{0}^{T} W (t, ω) d t = \lim_{δ (P_{n}) \to 0} \sum_{i = 1}^{n} W (s_{i}, ω) (t_{i} - t_{i - 1}) .$

It hence suffices to consider a uniform partition Pn of [0, T] with step size $Δ t = \frac{T}{n}$ and time points ti = i Δt for 0 ≤ i ≤ n. Let Qn be chosen so that si = ti, 1 ≤ i ≤ n. We then have the nth Riemann sum $S_{n} (ω) ≔ Δ t \cdot \sum_{i = 1}^{n} W (i Δ t, ω)$ and its limit converging to the Riemann integral of the Brownian path:

$I (T, ω) = \lim_{n \to \infty} S_{n} (ω) = \lim_{n \to \infty} \frac{T}{n} \sum_{i = 1}^{n} W (i \frac{T}{n}, ω)$

for almost all ω ∈ Ω. Hence, as a random variable, the Riemann integral of Brownian motion is (a.s.) uniquely given by

$I (T) \equiv \int_{0}^{T} W (t) d t = \lim_{n \to \infty} S_{n} = \lim_{n \to \infty} \frac{T}{n} \sum_{i = 1}^{n} W (i \frac{T}{n}) .$

We now show that the nth Riemann sum Sn is a normally distributed random variable.

Proposition 11.1.

The Riemann sum Sn is normally distributed with mean and variance

$E [S_{n}] = 0 a n d E [{(S_{n})}^{2}] = \frac{T (T + Δ t) (T + Δ t / 2)}{3} .$

Proof. Since Sn is a linear combination of jointly normal random variables, it is normally distributed. The expected value is

$E [S_{n}] = E [Δ t \cdot \sum_{i = 1}^{n} W (t_{i})] = Δ t \sum_{i = 1}^{n} E [W (t_{i})] = 0.$

Then, Var(Sn) = E[(Sn)2] is given by [note: ti ∧ tj = (Δt)(i ∧ j)]

$\begin{array}{l} E [{(S_{n})}^{2}] = E [{(Δ t \cdot \sum_{i = 1}^{n} W (t_{i}))}^{2}] = {(Δ t)}^{2} E [(\sum_{i = 1}^{n} W (t_{i})) (\sum_{j = 1}^{n} W (t_{j}))] \\ = {(Δ t)}^{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} E [W (t_{i}) W (t_{j})] = {(Δ t)}^{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} t_{i} \land t_{j} \\ = {(Δ t)}^{3} \sum_{i = 1}^{n} \sum_{j = 1}^{n} i \land j = {(Δ t)}^{3} \sum_{k = 1}^{n} k^{2} = {(Δ t)}^{3} \frac{n (n + 1) (2 n + 1)}{6} \\ = \frac{(Δ t \cdot n) (Δ t \cdot n + Δ t) (Δ t \cdot n + Δ t / 2)}{3} = \frac{T (T + Δ t) (T + Δ t / 2)}{3} \end{array}$

since Δt · n = T.

The Reimann sums {Sn}n ≥ 1 hence form a sequence of normally distributed random variables. The limit of such a sequence is hence normally distributed and this gives us that I(T) = limn→∞ Sn is a normal random variable. This is true for all time values T > 0. As n → ∞ (and hence Δt = T/n → 0), we obtain the mean and variance of I(T):

$E [S_{n}] \to E [I (T)] = 0, E [S_{n}^{2}] \to E [I^{2} (T)] = \frac{T^{3}}{3} .$

Thus, the stochastic process {I(t)}t>0 is a Gaussian process where I(t) ~ Norm(0, t3/3).

Alternatively, to obtain the moments of I(t) it is instructive to apply the Fubini Theorem that allows for changing the order of the time integral and expectation integral. For all t and s with 0 ≤ s ≤ t, we have

$\begin{array}{l} E [i (s) I (t)] = E [(\int_{0}^{s} W (u) d u) (\int_{0}^{t} W (v) d v)] = \int_{0}^{t} \int_{0}^{s} E [W (u) W (v)] d u d v \\ = \int_{0}^{t} \int_{0}^{s} \min {u, v} d u d v = \int_{0}^{s} \int_{0}^{s} \min {u, v} d u d v + \int_{s}^{t} (\int_{0}^{s} \min {u, v} d u) d v \\ = \int_{0}^{s} \int_{0}^{s} \min {u, v} d u d v + \int_{s}^{t} (\int_{0}^{s} u d u) d v \\ = \frac{s^{3}}{3} + (t - s) \frac{s^{2}}{2} . \end{array}$

The last line follows by computing each integral separately. The second integral follows trivially. The first integral is readily computed by writing min {u, v} = min {u, v} ?u≤v + min {u, v}?v≤u and by symmetry:

$\begin{array}{l} \int_{0}^{s} \int_{0}^{s} \min {u, v} d u d v & = \int_{0}^{s} \int_{0}^{s} \min {u, v} I_{u \leq v} d u d v + \int_{0}^{s} \int_{0}^{s} \min {u, v} I_{v \leq u} d v d u \\ = \int_{0}^{s} (\int_{0}^{v} u d u) d v + \int_{0}^{s} (\int_{0}^{u} v d v) d u \\ = 2 \int_{0}^{s} (\int_{0}^{v} u d u) d v = \int_{0}^{s} v^{2} d v = s^{3} / 3. \end{array}$

Hence, applying the above formula for s = t gives the variance $E [I^{2} (t)] = E [I (t) I (t)] = \frac{t^{3}}{3}$ . The mean function of the integral process I is zero, mI(t) = E[I(t)] = 0 and the covariance function is

$c_{I} (s, t) = E [I (s) I (t)] = \frac{{(s \land t)}^{3}}{3} + | t - s | \frac{{(s \land t)}^{2}}{2} .$

Example 11.1

Show that $Y (t) ≔ W^{3} (t) - 3 \int_{0}^{t} W (u) d u, t \geq 0$ , is a martingale w.r.t. any filtration for Brownian motion.

Solution. First we note that Y(t) ≔ W3(t) − 3I(t), where the integral $I (t) \equiv \int_{0}^{t} W (u) d u$ is a function of the history of the Brownian motion up to time t and is hence $ℱ_{t}$ –measurable. That is, the integral process {I(t)}t≥0 is adapted to the filtration and hence so is the process {Y(t)}t⩾0. The process is also integrable since $E [| Y (t) |] \leq E [| W^{3} (t) |] + 3 E [| I (t) |] = E [| W^{3} (t) |] + 3 \int_{0}^{t} E [| W (u) |] d u < \infty$ . Now, for times t, s ≥ 0 we consider $E [I (t + s) | ℱ_{t}] \equiv E_{t} [I (t + s)]$ :

$\begin{array}{l} E_{t} [I (t + s)] = E_{t} [I (t) + \int_{t}^{t + s} W (u) d u] = I (t) + E_{t} [\int_{t}^{t + s} W (u) d u] \\ = I (t) + \int_{t}^{t + s} E_{t} [W (u)] d u = I (t) + \int_{t}^{t + s} W (t) d u \\ = I (t) + W (t) \int_{t}^{t + s} d u = I (t) + s W (t), \end{array}$

where we used the martingale property of Brownian motion and Fubini's theorem in one of the terms. Note that in explicit integral form we have shown

$E_{t} [\int_{0}^{t + s} W (u) d u] = \int_{0}^{t} W (u) d u + s W (t) .$

Using the fact that {W3(t) − 3tW(t)}t≥0 and {W(t)}t≥0 are martingales, we obtain

$\begin{array}{l} E_{t} [W^{3} (t + s)] & = & E_{t} [W^{3} (t + s) - 3 (t + s) W (t + s)] + E_{t} [3 (t + s) W (t + s)] \\ = & W^{3} (t) - 3 t W (t) + 3 (t + s) W (t) = W^{3} (t) + 3 s W (t) . \end{array}$

Therefore,

$\begin{array}{l} E_{t} [Y (t + s)] & = & E_{t} [W^{3} (t + s)] - 3 E_{t} [I (t + s)] \\ = & W^{3} (t) + 3 s W (t) - 3 I (t) - 3 s W (t) = W^{3} (t) - 3 I (t) = Y (t) . \end{array}$

11.2 The Riemann–Stieltjes Integral of Brownian Motion

Since it is possible to integrate Brownian paths (and functions of Brownian motion) w.r.t. time, it is interesting to find out what other integrals can be calculated for Brownian motion. The Riemann–Stieltjes integral generalizes the Riemann integral. It provides an integral of one function w.r.t. another appropriate one. So our goal is to define the integral of one stochastic process w.r.t. another one (say, w.r.t. Brownian motion).

11.2.1 The Riemann–Stieltjes Integral

The construction of the Riemann–Stieltjes integral goes as follows. Let f be a bounded function and g be a monotonically increasing function, both defined on [0, T].

For n ∈ ℕ, introduce partitions Pn and Qn in the same manner as is done for the Riemann integral.
If the limit of the partial sum over any (shrinking) partition,
$\lim_{\begin{matrix} n \to \infty \\ δ (P_{n}) \to 0 \end{matrix}} \sum_{i = 1}^{n} f (s_{i}) (g (t_{i}) - g (t_{i - 1})),$

exists and is independent of the choice of Pn and Qn, then it is called the Riemann–Stieltjes integral of f w.r.t. g on [0, T] and is denoted by

$\int_{0}^{T} f (t) d g (t) .$

The function g is called the integrator.
By taking g(x) = x, the Riemann integral is simply seen to be a special case of the Riemann–Stieltjes integral.

Let us take a look at some important examples, as follows.

(1) Consider the Heaviside (unit) step function, H, defined by
$H (x) = I_{[0, \infty)} (x) = {\begin{matrix} 0 & if x < 0, \\ 1 & if x \geq 0. \end{matrix}$

Let f be continuous at an interior point s ∈ (0, T), c be a nonnegative constant, and let g(x) = cH(x − s). Then,

$\int_{0}^{T} f (t) d g (t) = c f (s) .$

Hence, when integrator g is a simple step function, the integral simply picks out one value of the continuous function f and this value corresponds to the value of f at the point of discontinuity of g. This is a sifting property of the step function integrator g.
(2) The first example extends into the more general case of a step function g(x) assumed as a mixture of Heaviside unit step functions: $g (x) = \sum_{n = 1}^{\infty} c_{n} H (x - s_{n})$ , where cn ≥ 0 for n = 1, 2, 3, . . . are chosen such that $\sum_{n = 1}^{\infty} c_{n}$ converges and {sn}n≥1 is a sequence of distinct points in (0, T). If f is continuous on [0, T], then
$\int_{0}^{T} f (t) d g (t) = \sum_{n = 1}^{\infty} c_{n} f (s_{n}) .$

We see that the integral is a sum over f evaluated at all points of discontinuity of g within the integration interval [0, T]. This extends the sifting property in the above first example.
(3) Suppose that f and g′ are Riemann integrable on [0, T]. In that case
$\int_{0}^{T} f (t) d g (t) = \int_{0}^{T} f (t) g' (t) d t .$

Hence, when the integrator is differentiable, the Riemann–Stieltjes integral is simply the Riemann integral of fg′, i.e., we formally have the differential dg(t) = g'(t) dt.
(4) Consider a CDF F that is a mixture of a discrete CDF F1 and a continuous CDF F2:
$F (x) = w_{1} F_{1} (x) + w_{2} F_{2} (x), F_{1} (x) = \sum_{n = 1}^{\infty} p_{n} H (x - x_{n}), F_{2} (x) = \int_{- \infty}^{x} p (t) d t,$

where w1 and w2 are nonnegative weights summing to one, {pn}n≥1 and {xn}n≥1 are, respectively, the mass probabilities and mass points of the discrete distribution, and p(x) = F′2(x)is the PDF of the continuous distribution. Then, for a bounded f:

$\begin{array}{l} \int_{- \infty}^{\infty} f (x) d F (x) = w_{1} \int_{- \infty}^{\infty} f (x) d F_{1} (x) + w_{2} \int_{- \infty}^{\infty} f (x) d F_{2} (x) \\ = w_{1} \sum_{n = 1}^{\infty} p_{n} f (x_{n}) + w_{2} \int_{- \infty}^{\infty} f (x) p (x) d x . \end{array}$

In the first integral, with F1 as integrator, we used the result in example (2). This is an example in which the Riemann–Stieltjes integral gives us the expected value of a function f(X) of a random variable X having a mixture distribution given by F1 and F2 with respective mixture probabilities (weights) w1 and w2. In particular, the above equation can be read as E[f(X)] = w1E(1)[f(X)] + w2E(2)[f(X)].

The Riemann–Stieltjes integral $\int_{0}^{T} f (t) d g (t)$ can be extended on a larger class of functions. Recall that the p−variation of f: [0, T] → ℝ is

$V_{[0, T]}^{(p)} (f) = \underset{δ (P_{n}) \to 0}{\lim \sup} \sum_{i = 1}^{n} {| f (t_{i}) - f (t_{i} - 1) |}^{p},$

where the limit is taken over all possible partitions 0 = t0 < t1 < · · · < tn = T, shrinking as n → ∞. The following result (stated without proof) shows that we can consider the Riemann–Stieltjes integral on a fairly extensive combination of functions f and integrator g whose combined variational properties satisfy a certain condition.

Proposition 11.2.

Assume that f and g do not have discontinuities at the same points within the integration interval [0, T]. Let the p-variation of f and the q-variation of g be finite for some p, q > 0, such that $\frac{1}{p} + \frac{1}{q} > 1$ . Then, the Riemann–Stieltjes integral $\int_{0}^{T} f (t) d g (t)$ exists and is finite.

For example, if the integrator g is a function of bounded variation on [0, T], and f is a continuous function, then both functions have finite (first) variation. Hence, we may use p = q = 1 in the above proposition and this confirms that the Riemann–Stieltjes integral of f w.r.t. g is defined.

11.2.2 Integrals w.r.t. Brownian Motion

It is known that (a.s.) the p-variation of a Brownian sample path on [0, T] is finite for p ≥ 2 and infinite for p < 2. In particular, we proved that the quadratic variation of Brownian motion is bounded but the first variation is unbounded. Applying Proposition 11.2 for q = 2 to the Riemann–Stieltjes integral $\int_{0}^{T} f (t) d W (t)$ gives us that such an integral w.r.t. Brownian motion is well-defined if the p-variation of f is finite for some p ∈ (0, 2). For example, the integral exists if f is a function of bounded variation (p = 1) such as a monotone function or a continuously differentiable function. Thus, for example, the integrals

$\int_{0}^{T} e^{t} d W (t), \int_{0}^{T} t^{α} d W (t) (α \geq 1)$

exist as Riemann–Stieltjes integrals. However, the integral

$\int_{0}^{T} W (t) d W (t) (11.2)$

does not (a.s.) exist as a Riemann–Stieltjes integral. First, note that Proposition 11.2 is not applicable to the integral in (11.2) since $V_{[0, T]}^{(p)} (W)$ is finite iff p ≥ 2. Hence, or f(t) = g(t) = W(t) we have p = q and $\frac{1}{p} + \frac{1}{p} \leq$ 1 for p ≥ 2. Second, let us show we can obtain different values of the integral in (11.2) for different intermediate partitions Qn.

Consider the Riemann–Stieltjes sum $S_{n} = \sum_{i = 1}^{n} W (t_{i - 1}) (W (t_{i}) - W (t_{i - 1}))$ , i.e., with the intermediate nodes si = ti−1 for i = 1, 2, . . . , n. We rewrite Sn as follows, upon using the algebraic identity $a (b - a) = - \frac{1}{2} {(a - b)}^{2} + \frac{1}{2} (b^{2} - a^{2})$ :

$\begin{array}{l} S_{n} = - \frac{1}{2} \sum_{i = 1}^{n} {{(W (t_{i}) - W (t_{i - 1}))}^{2} - (W^{2} (t_{i}) - W^{2} (t_{i - 1}))} \\ = - \frac{1}{2} \sum_{i - = 1}^{n} {(W (t_{i}) - W (t_{i - 1}))}^{2} + \frac{1}{2} \underset{= W^{2} (t_{n}) - W^{2} (t_{0}) = W^{2} (T) - W^{2} (0)}{\underset{︸}{\sum_{i = 1}^{n} (W^{2} (t_{i}) - W^{2} (t_{i - 1}))}} . \end{array}$

As n → ∞ and δ(Pn) → 0, we have $\sum_{i = 1}^{n} {(W (t_{i}) - W (t_{i - 1}))}^{2} \to [W, W] (T)$ = T, i.e., the quadratic variation of Brownian motion. Therefore,

$\lim_{n \to \infty} S_{n} = \frac{1}{2} (W^{2} (T) - W^{2} (0)) - \frac{T}{2} .$

This limit is called the Itô integral of BM:

$\int_{0}^{T} W (t) d W (t) = \frac{1}{2} (W^{2} (T) - W^{2} (0)) - \frac{T}{2} = \frac{1}{2} (W^{2} (T) - T) .$

Recall that for a differentiable function f we would have obtained (the ordinary calculus result)

$\int_{0}^{T} f (t) d f (t) = \frac{f^{2} (T) - f^{2} (0)}{2} .$

Consider another choice of the intermediate partition with upper endpoint si = ti for every i = 1, 2, . . . , n. The Riemann–Stieltjes sum is then

$\begin{array}{l} S_{n}^{*} & = & \sum_{i = 1}^{n} W (t_{i}) (W (t_{i}) - W (t_{i - 1})) \\ = & \frac{1}{2} \underset{\to [W, W] (T) = T, as n \to \infty}{\underset{︸}{\sum_{i = 1}^{n} {(W (t_{i}) - W (t_{i - 1}))}^{2}}} + \frac{1}{2} \underset{= W^{2} (T) - W^{2} (0)}{\underset{︸}{\sum_{i = 1}^{n} (W^{2} (t_{i}) - W^{2} (t_{i - 1}))}} \\ \to & \frac{1}{2} (W^{2} (T) - W^{2} (0)) + \frac{T}{2}, as n \to \infty . \end{array}$

For 0 ≤ α ≤ 1, consider a weighted average of Sn and $S_{n}^{*}$ :

$\begin{array}{l} α S_{n} + (1 - α) S_{n}^{*} & = & \sum_{i = 1}^{n} (α W (t_{i - 1}) + (1 - α) W (t_{i})) (W (t_{i}) - W (t_{i - 1})) \\ \to & \frac{1}{2} (W^{2} (T) - W^{2} (0)) + \frac{T}{2} - α T, as n \to \infty . \end{array}$

An interesting case is when the midpoint is used, i.e., $α = \frac{1}{2}$ . The respective limit is called the Stratonovich integral of BM:

$\int_{0}^{T} W (t) \circ d W (t) = \frac{1}{2} (W^{2} (T) - W^{2} (0)) .$

The Stratonovich integral satisfies the usual rules of (nonstochastic) ordinary calculus such as the chain rule and integration by parts. The two types of stochastic integrals are related as

$\int_{0}^{T} W (t) \circ d W (t) = \int_{0}^{T} W (t) d W (t) + \frac{T}{2} .$

For a continuously differentiable function f : ℝ → ℝ, it can be shown that the following conversion formula applies:

$\int_{0}^{T} f (W (t)) \circ d W (t) = \int_{0}^{T} f (W (t)) d W (t) + \frac{1}{2} \int_{0}^{T} f^{'} (W (t)) d t,$

where the respective integrals correspond to the Stratonovich and Itô integrals of a differentiable function f(W(t)) of BM. We note, however, that we have yet to give a precise general definition of such stochastic integrals. This is the topic of the next section.

11.3 The Itô Integral and Its Basic Properties

11.3.1 The Itô Integral for Simple Processes

Our goal is to give a construction of the Itô stochastic integral w.r.t. standard Brownian motion. Generally, we shall assume that the integrand is some other stochastic process that is adapted to a chosen filtration $F = {ℱ_{t}}_{t \geq 0}$ for Brownian motion. We can, for instance, choose ? as the natural filtration for Brownian motion, $F^{W} = {ℱ_{t}^{W}}_{t \geq 0}$ . In what follows, we shall consider all processes defined on some time interval [0, T], for any T > 0, and we begin by considering a simple case with a piecewise-constant integrand.

Definition 11.1.

A continuous-time stochastic process {C(t)}0≤t≤T defined on the filtered probability space (Ω, ℱ, ℙ, ?) is said to be a simple process (or step-stochastic process) if there exists a time partition Pn = {t0, t1, . . . , tn} of [0, T], where t0 = 0 and tn = T, such that the process C is constant on each subinterval [ti, ti+1), 0 ≤ i ≤ n − 1. In other words, there exists random variables ξ0, ξ1, . . . , ξn−1 such that:

ξi is $ℱ_{t_{i}}$ -measurable for i = 0, 1, . . . , n − 1 (i.e., the process C is adapted to ?);
C(t) = $\sum_{i = 0}^{n - 1} ξ_{i} I_{[t_{i}, t_{i + 1})} (t), i .e ., C (t) = ξ_{i} for t \in [t_{i}, t_{i + 1})$ .

The simple process C is said to be square integrable if

$E [\int_{0}^{T} C^{2} (s) d s] < \infty \Leftrightarrow E [ξ_{i}^{2}] < \infty for i = 0, 1, ..., n - 1.$

The process {C(t)}0≤t≤T is defined as a right-continuous step process. For instance, a piecewise-constant approximation of Brownian motion is the simple process:

$C (t) = ξ_{i} \equiv W (t_{i}) for t \in [t_{i}, t_{i + 1}) .$

Note that the process C(t) on the interval t ∈ [ti, ti+1) is fixed to BM at time ti (it is a Norm(0, ti) random variable). For any path ω, the graph of C(t, ω) as function of time t ∈ [0, T] is piecewise constant (step function) with fixed value C(t, ω) = W(ti, ω) ≡ ξi (ω) on every time interval t ∈ [ti, ti+1). Figure 11.1 depicts a sample path of BM and an approximation to it by the path of a simple process on the interval [0, 1].

Figure 11.1

Figure showing a Brownian sample path and its approximation by a simple process.

A Brownian sample path and its approximation by a simple process.

The Itô integral of a simple process can be defined as a Riemann–Stieltjes sum evaluated at the left endpoint of subintervals [ti, ti+1). Hence, the simplest case of an Itô integral of an indicator function ?[a, b](t), 0 ≤ a < b ≤ T, is the Riemann–Stieltjes integral w.r.t. W:

$\int_{0}^{T} I_{[a, b]} (t) d W (t) = \int_{a}^{b} d W (t) = W (b) - W (a) .$

The Riemann–Stieltjes integral of a step function (i.e., a linear combination of indicator functions) gives us the working definition of the Itô integral of a simple process as follows.

Definition 11.2.

The Itô integral I(t) of a simple process $C (s) = \sum_{i = 0}^{n - 1} ξ_{i} I_{[t_{i}, t_{i + 1})} (s)$ on any interval [0, t], 0 ≤ t ≤ T, is

$I (t) \equiv \int_{0}^{t} C (s) d W (s) = \sum_{i = 0}^{k - 1} ξ_{i} (W (t_{i + 1}) - W (t_{i})) + ξ_{k} (W (t) - W (t_{k})), (11.3)$

for tk ≤ t ≤ tk+1. For the integration interval [0, T], we obtain

$I (T) = \int_{0}^{T} C (s) d W (s) = \sum_{i = 0}^{n - 1} ξ_{i} (W (t_{i + 1}) - W (t_{i})) .$

The Itô integral of a general process X ≡ {X(t), t ≥ 0} adapted to a filtration ? for BM is defined as the mean-square limit of integrals of simple processes that approximate X. Consider a square-integrable1 continuous-time process {X(t)}0≤t≤T adapted to ?:

$E [\int_{0}^{T} X^{2} (t) d t] < \infty and X (t) is ℱ_{t} -measurable for 0 \leq t \leq T .$

The process X on [0, T] can be approximated by a sequence of simple processes as follows:

select a partition Pn = {t0, t1 , . . . , tn} of [0, T], e.g., $t_{i} = i \frac{T}{n}$ for i = 0,1, . . . , n;
set ξi = X(ti) for i = 0, 1, . . . , n − 1;
set $C^{(n)} (t) = \sum_{i = 0}^{n - 1} ξ_{i} I_{[t_{i}, t_{i + 1})} (t)$ .

As the maximum step size δ(Pn) goes to 0 as n → ∞, the sequence of simple processes {C(n)(t)}n≥1 gives a better and better approximation of the continuously varying process X. The precise convergence condition is specified by requiring that

$\lim_{n \to \infty} E [\int_{0}^{T} (X (t) - C^{(n)} {(t)}^{2}) d t] = 0. (11.4)$

Given an adapted process X satisfying the above square integrability condition, it can be proven that there exists such a sequence of square-integrable and adapted simple processes such that (11.4) holds. The corresponding sequence is said to approximate the process X. Then, the Itô integral I(t), 0 ≤ t ≤ T, of a general process X is defined as the mean-square limit of integrals of an approximating sequence of simple processes:

$I (t) \equiv \int_{0}^{t} X (s) d W (s) ≔ \lim_{n \to \infty} \int_{0}^{t} C^{(n)} (s) d W (s) \equiv \lim_{n \to \infty} I^{(n)} (t) . (11.5)$

The above mean-square limit really means that the sequence of Itô integral random variables {I(n)(t)}n≥1 converges to the random variable I(t) in the sense of L2(Ω), i.e., for each t ≥ 0 we have

$\lim_{n \to \infty} E [{(I (t) - I^{(n)} (t))}^{2}] = 0. (11.6)$

The assumed square integrability condition on X ensures that I(t) exists and is given uniquely (a.s.). That is, for any approximating sequence satisfying the condition in (11.4), it can be shown that E[(I(m)(t) − I(n)(t))2] → 0, as m, n → ∞. This implies that {I(n)(t)}n≥1 is a Cauchy sequence in L2(Ω) and therefore has a unique limit in the L2 sense.

11.3.2 Properties of the Itô Integral

The Itô integral $I (t) \equiv I_{X} (t) ≔ \int_{0}^{t} X (s) d W (s)$ of a continuous-time stochastic process {X(t)}t≥0, which is adapted to a filtration ? = {ℱt}t≥0 for Brownian motion and assumed to satisfy the square-integrability condition $E [\int_{0}^{T} X^{2} (t) d t] < \infty$ , has the following properties.

(1) Continuity. Sample paths {I(t; ω)}0≤t≤T are continuous functions of time t (a.s.).
(2) Adaptivity. I(t) is ℱt-measurable for all t ∈ [0, T].
(3) Linearity. Let $I_{1} (t) = \int_{0}^{t} X_{1} (s) d W (s)$ and $I_{2} (t) = \int_{0}^{T} X_{2} (s) d W (s)$ and assume that processes X1 and X2 meet the same requirements as those specified for process X above. Then, $c_{1} I_{1} (t) + c_{2} I_{2} (t) = \int_{0}^{t} (c_{1} X_{1} (s) + c_{2} X_{2} (s)) d W (s)$ for constants c1, c2 ∈ ℝ.
(4) Martingale. {I(t)}0≤t≤T is a martingale w.r.t. filtration ?.
(5) Zero mean. E[I(t)] = 0 for 0 ≤ t ≤ T.
(6) Itô isometry. $Var (I (t)) = E [I^{2} (t)] = E [{(\int_{0}^{t} X (s) d W (s))}^{2}] = \int_{0}^{t} E [X^{2} (s)] d s,$ i.e., the variance of an Itô integral on [0, t] is equal to a Riemann integral (w.r.t. time variable s) of the second moment of the integrand process as a function of time s ∈ [0, t].

Proof.

To simplify the proof, we suppose that {X(t), t ≥ 0} is a simple process having the form $X (t) = \sum_{i = 0}^{n - 1} ξ_{i} I_{[t_{i}, t_{i + 1}]} (t)$ with each ξi as $ℱ_{t_{i}}$ -measurable.

(1) Fix ω ∈ Ω. Then I(t, ω) is a Riemann–Stieltjes integral of a piecewise-constant step function X(s, ω) with respect to the continuous integrator function W(s, ω) on the interval s ∈ [0, t]. Such an integral is a continuous function of the upper limit t.
(2) Let t ∈ [tk, tk+1]. Then, it is clear from the expression in (11.3) that I(t) is ℱt-measurable since it is a function of only Brownian motions up to time t and all ξi, 0 ≤ i ≤ k, random variables are $ℱ_{t_{i}}$ -measurable and hence ℱt-measurable.
(3) Suppose that X1 and X2 are defined on the same partition Pn (otherwise we combine the partitions for X1 and X2) and are given by:
$X_{j} (t) = \sum_{i = 0}^{n - 1} ξ_{i}^{(j)} I_{[t_{i}, t_{i + 1})} (t), j = 1, 2.$

Then, $c_{1} X_{1} (t) + c_{2} X_{2} (t) = \sum_{i = 0}^{n - 1} (c_{1} ξ_{i}^{(1)} + c_{2} ξ_{i}^{(2)}) I_{[t_{i}, t_{i} + 1)} (t)$ is a simple process. Integrate it on [0, T] to obtain

$\begin{array}{l} \int_{0}^{T} (c_{1} X_{1} (s) + c_{2} X_{2} (s)) d W (s) = \sum_{i = 1}^{n} (c_{1} ξ_{i - 1}^{(1)} + c_{2} ξ_{i - 1}^{(2)}) (W (t_{i}) - W (t_{i - 1})) \\ = c_{1} \sum_{i = 1}^{n} ξ_{i - 1}^{(1)} (W (t_{i}) - W (t_{i - 1})) + c_{2} \sum_{i = 1}^{n} ξ_{i - 1}^{(2)} (W (t_{i}) - W (t_{i - 1})) \\ = c_{1} \int_{0}^{T} X_{1} (s) d W (s) + c_{2} \int_{0}^{T} X_{2} (s) d W (s) . \end{array}$
(4) Fix s and t such that 0 ≤ s ≤ t ≤ T. Let us show that E[I(t) | ℱs] = I(s). Suppose that s ∈ [tm−1, tm] and t ∈ [tk−1, tk] for some 1 ≤ m ≤ k ≤ n. Represent the integral of X on [0, t] as follows:
$\begin{array}{l} I (t) & = & \underset{= I (s) is ℱ_{s} -measurable}{\underset{︸}{\sum_{i = 0}^{m - 2} ξ_{i} (W (t_{i + 1}) - W (t_{i})) + ξ_{m - 1} (W (s) - W (t_{m - 1}))}} \\ + & ξ_{m - 1} (W (t_{m}) - W (s)) + \sum_{j = m}^{k - 2} ξ_{j} (W (t_{j + 1}) - W (t_{j})) + ξ_{k - 1} (W (t) - W (t_{k - 1})) . \end{array}$

By taking the expectation of I(t) conditional on ℱs and applying properties of conditional expectations, we obtain

$\begin{array}{l} E_{s} [I (t)] & = & I (s) + \underset{(since ξ_{m-1} is ℱ_{s} -measurable)}{\underset{︸}{ξ_{m - 1} E_{s} [W (t_{m}) - W (s)]}} \\ + & \sum_{j = m}^{k - 2} E_{s} [ξ_{j} (W (t_{j + 1}) - W (t_{j}))] + E_{s} [ξ_{k - 1} (W (t) - W (t_{k - 1}))] . \end{array}$

Since W(tm) − W(s) is independent of ℱs, we have

$E_{s} [W (t_{m}) - W (s)] = E [W (t_{m}) - W (s)] = 0.$

By using the tower property and the independence property, we obtain

$\begin{array}{l} E_{s} [ξ_{j} (W (t_{j + 1}) - W (t_{j}))] = E_{s} [ξ_{j} E_{t_{j}} [W (t_{j + 1}) - W (t_{j})]] \\ = E_{s} [ξ_{j} E [W (t_{j + 1}) - W (t_{j})]] = 0 \end{array}$

for j = m + 1, . . . , k − 2. A similar step can be applied to the last expectation $E_{s} [ξ_{k - 1} E_{t_{k - 1}} [W (t) - W (t_{k - 1})]] = E_{s} [ξ_{k - 1} E [W (t) - W (t_{k - 1})]] = 0.$ Therefore, we have Es [I(t)] = I(s) for 0 ≤ s ≤ t ≤ T. Since the Itô integral is adapted to ? and is assumed integrable, E[|I(t)|] < ∞, then it is a martingale w.r.t. ?.
(5) Since the integral process I is a martingale, the expected value E[I(t)] is constant and equal to E[I(0)] = 0 for all 0 ≤ t ≤ T. We note that the martingale property also gives us the identity
$E [\int_{s}^{t} X (u) d W (u) | ℱ_{s}] = E [I (t) - I (s) | ℱ_{s}] = E [I (t) | ℱ_{s}] - I (s) = I (s) - I (s) = 0.$
(6) For ease of presentation, we assume that t = tk, for some k = 1, . . . , n, since the proof follows similarly for any value of t ≥ 0. Then,
$I (t_{k}) = \sum_{i = 0}^{k - 1} ξ_{i} (W (t_{i + 1}) - W (t_{i})) = \sum_{i = 0}^{k - 1} ξ_{i} Z_{i},$

where Zi ≡ W(ti+1) − W(ti) ~ Norm(0, ti+1 − ti), i = 0, . . . , k − 1, are i.i.d. random variables. Note that each Zi is independent of ξi since ξi is $ℱ_{t_{i}}$ -measurable and the Brownian increment W(ti+1) − W(ti) is independent of $ℱ_{t_{i}}$ . Squaring I(tk) and taking its expectation gives

$E [I^{2} (t_{k})] = \sum_{i = 0}^{k - 1} E [ξ_{i}^{2} Z_{i}^{2}] + 2 \underset{0 \leq i < j \leq k - 1}{\sum \sum} E [ξ_{i} ξ_{j} Z_{i} Z_{j}] .$

The second summation involves expectations of products with i < j, i.e., j ≥ i + 1, and hence ξi, ξj and Zi are $ℱ_{t_{j}}$ -measurable. Applying the tower property by conditioning on $ℱ_{t_{j}}$ then gives

$\begin{array}{l} E [ξ_{i} ξ_{j} Z_{i} Z_{j}] = E [E [ξ_{i} ξ_{j} Z_{i} Z_{j} | ℱ_{t_{j}}]] (ξ_{i} ξ_{j} Z_{i} is ℱ_{t_{j}} -measurable) \\ = E [ξ_{i} ξ_{j} Z_{i} E [Z_{j} | ℱ_{t_{j}}]] = E [ξ_{i} ξ_{j} Z_{i} E [Z_{j}]] = 0. \end{array}$

Here we used the fact that Zj ≡ W(tj+1) − W(tj) is independent of $ℱ_{t_{j}}$ and E[Zj] = 0. [We note that the result is also more simply derived by using the independence of Zj and ξiξjZi.] Therefore, the above second summation is zero and we have from the first sum (upon using the independence of ξi and Zi):

$\begin{array}{l} E [I^{2} (t)] & = & \sum_{i = 0}^{k - 1} E [ξ_{i}^{2} Z_{i}^{2}] = \sum_{i = 0}^{k - 1} E [ξ_{i}^{2}] E [Z_{i}^{2}] \\ = & \sum_{i = 0}^{k - 1} E [ξ_{i}^{2}] (t_{i + 1} - t_{i}) = \int_{0}^{t_{k}} E [X^{2} (s)] d s = E [\int_{0}^{t_{k}} X^{2} (s) d s], \end{array}$

where we used the step function $E [X^{2} (s)] = \sum_{i = 0}^{k - 1} E [ξ_{1}^{2}] I_{[t_{i}, t_{i + 1})} (s), for 0 \leq s \leq t_{k}$ .

It is important to note that not all properties that are valid for Riemann integrals are necessarily true for Itô integrals. For example, suppose that two processes X and Y satisfy X(t) ≤ Y(t) (a.s.), i.e., ℙ(X(t) ≤ Y(t)) = 1, for 0 ≤ t ≤ T. Then, it is true that $\int_{0}^{t} X (s) d s \leq \int_{0}^{t} Y (s) d s$ (a.s.) for 0 ≤ t ≤ T. However, this type of integral inequality property is not generally valid or Itô integrals IX(t) = $\int_{0}^{t} X (s) d W (s)$ and $I_{Y} (t) = \int_{0}^{t} Y (s) d W (s) .$ For example, consider the trivial case of constant processes X(t) ≡ 0 and Y(t) ≡ 1. Clearly, ℙ(X(t) ≤ Y(t)) = ℙ(0 ≤ 1) = 1. However, IX(t) ≡ 0 and $I_{Y} (t) = \int_{0}^{t} d W (s) = W (t)$ so that ℙ(IX (t) ≤ IY (t)) = ℙ(0 ≤ W (t)) = 1/2 ≠ 1.

Example 11.2

Show whether or not the following integrals are well-defined.

(a) $\int_{0}^{1} e^{W (t)} d W (t) .$
(b) $\int_{0}^{1} W (t + 1) d W (t) .$
(c) $\int_{0}^{t} e^{W^{2} (s)} d W (s) for t \geq 0.$ for t ≥ 0.
(d) $\int_{0}^{1} {(1 - t)}^{- α} d W (t) for α \in ℝ .$ for a ∈ ℝ.

Solution.

(a) For 0 ≤ t ≤ 1, the integrand eW(t) is ℱt-measurable. So the integrand process, X(t) ≔ eW(t), is adapted to a filtration for Brownian motion. Now, we check if the integrand is square integrable:
$\int_{0}^{1} E [e^{2 W (t)}] d t = \int_{0}^{1} e^{2 t} d t = \frac{e^{2} - 1}{2} < \infty .$

Therefore, the integral is defined.
(b) Note, in this case the integrand process X(t) ≔ W(t + 1) is ℱt+1-measurable, but not ℱt-measurable and hence the Itô integral is not defined.
(c) First, find the second moment of the integrand:
$E [e^{2 W^{2} (s)}] = \int_{- \infty}^{\infty} e^{2 s z^{2}} n (z) d z = \int_{- \infty}^{\infty} \frac{1}{\sqrt{2 π}} e^{- (\frac{1}{2} - 2 s) z^{2}} d z$

and this has finite value $\frac{1}{\sqrt{1 - 4 s}} iff s < \frac{1}{4}$ . Now, the integral of $E [e^{2 W^{2} (s)}]$ on s ∈ [0, t] is finite iff $0 \leq t \leq \frac{1}{4}$ . So the Itô integral is defined for $t \in [0, \frac{1}{4}]$ .
(d) Note that the integrand X(t) = (1 − t)−a is just an ordinary function of time t, and $\int_{0}^{1} E [X^{2} (t)] d t = \int_{0}^{1} {(1 - t)}^{- 2 a} d t < \infty iff a < \frac{1}{2}$ . So the Itô integral is defined iff $a < \frac{1}{2}$ .

Before discussing further properties of the Itô integral, we now present a useful formula for computing the covariance between two Itô integrals (w.r.t. the same Brownian motion) as follows by Itô isometry. In particular, let X and Y be two adapted processes such that each satisfies the square integrability condition, i.e., assume $\int_{0}^{t} E [X^{2} (s)] d s < \infty$ and $\int_{0}^{t} E [Y^{2} (s)] d s < \infty$ . Then, $I_{X} (t) ≔ \int_{0}^{t} X (s) d W (s)$ and $I_{Y} (t) ≔ \int_{0}^{t} Y (s) d W (s)$ have covariance

$E [I_{X} (t) I_{Y} (t)] \equiv E [\int_{0}^{t} X (s) d W (s) \int_{0}^{t} Y (s) d W (s)] = \int_{0}^{t} E [X (s) Y (s)] d s . (11.7)$

Note that the Itô integrals have zero mean, E[IX(t)] = E[IY(t)] = 0. Hence, their covariance Cov(IX(t), IY(t)) = E[IX(t)IY(t)]. The formula in (11.7) is readily proven by writing the product $I_{X} I_{Y} = \frac{1}{2} {(I_{X} + I_{Y})}^{2} - \frac{1}{2} I_{X}^{2} - \frac{1}{2} I_{Y}^{2} = \frac{1}{2} I_{X + Y}^{2} - \frac{1}{2} I_{X}^{2} - \frac{1}{2} I_{Y}^{2}$ . Using linearity of expectations and applying Itô isometry three times gives the result:

$\begin{array}{l} E [I_{X} (t) I_{Y} (t)] & = & \frac{1}{2} (E [I_{X + Y}^{2} (t)] - E [I_{X}^{2} (t)] - E [I_{Y}^{2} (t)]) \\ = & \frac{1}{2} (\int_{0}^{t} E [{(X (s) + Y (s))}^{2}] d s - \int_{0}^{t} E [X^{2} (s)] d s - \int_{0}^{t} E [Y^{2} (s)] d s) \\ = & \int_{0}^{t} E [\frac{1}{2} {(X (s) + Y (s))}^{2} - \frac{1}{2} X^{2} (s) - \frac{1}{2} Y^{2} (s)] d s = \int_{0}^{t} E [X (s) Y (s)] d s . \end{array}$

The result in (11.7) also leads to a formula for the covariance, Cov(IX(t), IY(u)) = E[IX(t)IY(u)], between two Itô integrals at different times, 0 ≤ t ≤ u:

$E [I_{X} (t) I_{Y} (u)] \equiv E [\int_{0}^{t} X (s) d W (s) \int_{0}^{u} Y (s) d W (s)] = \int_{0}^{t} E [X (s) Y (s)] d s . (11.8)$

This follows from the martingale property of an Itô integral and by conditioning on ℱt, with IX(t) as ℱt-measurable, while using the tower property:

$E [I_{X} (t) I_{Y} (u)] = E [E [I_{X} (t) I_{Y} (u) | ℱ_{t}]] = E [I_{X} (t) E [I_{Y} (u) | ℱ_{t}]] = E [I_{X} (t) I_{Y} (t)] .$

11.4 Itô Processes and Their Properties

11.4.1 Gaussian Processes Generated by Itô Integrals

The Itô integral of a nonrandom (ordinary) differentiable function f can be considered as a Riemann–Stieltjes integral with any path of Brownian motion acting as the integrator function w.r.t. time. Thus it can be reduced to a Riemann integral by using the integration by parts formula:

$I (t) = \int_{0}^{t} f (s) d W (s) = f (t) W (t) - f (0) W (0) - \int_{0}^{t} f^{'} (s) W (s) d s .$

We know that the Riemann integral of Brownian motion is a Gaussian process. In a similar way, one can prove that the integral $\int_{0}^{t} f^{'} (s) W (s) d s$ is a Gaussian process as well (being considered as a function of the upper limit t). Thus, I(t) is a Gaussian process as well. This result is proved below for a more general case.

Theorem 11.3.

Let f: [0, ∞) → ℝ be a nonrandom function such that $\int_{0}^{T} f^{2} (t) d t < \infty$ for some T > 0. Then, the Itô integral $I (t) = \int_{0}^{t} f (s) d W (s), 0 \leq t \leq T,$ is a Gaussian process with mean zero and covariance function given by

$c_{I} (t, s) ≔ Cov (I (t), I (s)) = \int_{0}^{t \land s} f^{2} (u) d u, 0 \leq t, s \leq T .$

Proof. It suffices to show the property for 0 ≤ s ≤ t ≤ T where t ∧ s = s. By the zero mean property of Itô integrals, we have mI(t) = E[I(t)] ≡ 0. For the covariance function cI(t, s) we have, upon using the tower property and martingale property of the Itô integral,

$\begin{array}{l} c_{I} (t, s) & \equiv & E [\int_{0}^{t} f (u) d W (u) \int_{0}^{s} f (u) d W (u)] \\ = & E [I (t) I (s)] = E [I (s) E [I (t) | F_{s}]] = E [I^{2} (s)] . \end{array}$

This expectation is evaluated by the Itô isometry formula:

$E [I^{2} (s)] \equiv E [(\int_{0}^{s} f (u) d W {(u)}^{2})] = \int_{0}^{s} E [f^{2} (u)] d u = \int_{0}^{s} f^{2} (u) d u$

where f(u) is nonrandom. Finally, by using the Itô formula presented in the next subsection, we can obtain the moment generating function of I(t):

$M_{I (t)} (α) ≔ E [e^{α I (t)}] = e^{\frac{1}{2} α^{2} f_{0}^{t} f^{2} (s) d s}, \forall α \in ℝ .$

This is the unique moment generating function of a normal random variable with mean zero and variance $\int_{0}^{t} f^{2} (s) d s$ . Therefore, $I (t) ~ N o r m (0, \int_{0}^{t} f^{2} (s) d s) .$

It should be remarked that the above result can be stated as

$\int_{0}^{t} f (s) d W (s) ~ N o r m (0, \int_{0}^{t} f^{2} (s) d s) \overset{d}{=} W (g (t))$

where g is a function of time t as defined by $g (t) ≔ \int_{0}^{t} f^{2} (s) d s .$ That is, the Itô integral of the ordinary function f on [0, t] has the same distribution as standard Brownian motion at a time given by g(t). This is a simple type of time-changed Brownian motion where in this case the time change, g(t), is an ordinary function of time t.

Example 11.3

The process $X (t) = \int_{0}^{t} s d W (s)$ is a Gaussian process with mean zero and variance Var $(X (t)) = \int_{0}^{t} s^{2} d s = t^{3} / 3, i .e ., X (t) \overset{d}{=} W (g (t))$ where g(t) = t3/3.

11.4.2 Itô Processes

The sum of an Itô integral of a stochastic process and an ordinary (Riemann) integral generates another stochastic process called an Itô process.

Definition 11.3.

Let {µ(t)}t≥0 and {σ(t)}t≥0 be adapted to a filtration {ℱt}t≥0 for standard Brownian motion and satisfying

$\int_{0}^{T} E [| μ (t) |] d t < \infty and \int_{0}^{T} E [σ^{2} (t)] d t < \infty .$

Then, the process

$X (t) = X_{0} + \int_{0}^{t} μ (s) d s + \int_{0}^{t} σ (s) d W (s) (11.9)$

is well-defined for 0 ≤ t ≤ T. It is called an Itô process. The processes {µ(t)}t≥0 and {σ(t)}t≥0 are respectively called the drift coefficient process and the diffusion or volatility coefficient process.

The Itô process X can also be described by its so-called stochastic differential equation (SDE) which is obtained by “formally differentiating” (11.9) w.r.t. the time parameter t:

$d X (t) = μ (t) d t + σ (t) d W (t) . (11.10)$

We note that this SDE, along with the initial condition X(0) = X0, is a shorthand way of writing the stochastic integral equation in (11.9). We interpret (11.10) through (11.9), where the latter has proper mathematical meaning as a sum of a Riemann integral and an Itô stochastic integral. That is, the Itô process X ≡ {X(t)}t≥0 can be viewed as a solution to the SDE in (11.10) with the initial condition X(0) = X0. The differential representation in (11.10) only has rigorous mathematical meaning by way of the respective integral representations in (11.9).

Some examples of Itô processes are as follows.

(a) Let X(0) = x0, µ(t) ≡ µ and σ(t) ≡ σ be constants. Then, we obtain a drifted BM (i.e., BM with constant drift µ):
$X (t) = x_{0} + \int_{0}^{t} μ d s + \int_{0}^{t} σ d W (s) = x_{0} + μ t + σ W (t) .$
(b) Let µ = µ(t, X(t)) and σ = σ(t, X(t)) be functions of both time t and the process value X(t) at time t. The Itô process implicitly defined by the stochastic integral equation
$X (t) = X_{0} + \int_{0}^{t} μ (s, X (s)) d s + \int_{0}^{t} σ (s, X (s)) d W (s)$

is called a diffusion process.
(c) Let µ(t) and σ(t) be nonrandom (ordinary) functions of time t. Then,
$X (t) = X_{0} + \int_{0}^{t} μ (s) d s + \int_{0}^{t} σ (s) d W (s), t \geq 0,$

with constant X0 ∈ ℝ, is a Gaussian process with mean and covariance functions

$m_{X} (t) = X_{0} + \int_{0}^{t} μ (u) d u and c_{X} (t, s) = \int_{0}^{t \land s} σ^{2} (u) d u .$

The Itô process defined in (11.9) is given by a sum of a Riemann integral of µ and an Itô integral of σ. Both integrals being considered as functions of the upper limit t have continuous sample paths. Therefore, the Itô process has continuous sample paths as well.

So far, we have defined the Itô process as a stochastic integral w.r.t. Brownian motion. More generally, we can also define a stochastic integral w.r.t. an Itô process. Let the process {Y(t)}t≥0 be adapted to a filtration for BM. We define the stochastic integral of Y w.r.t. the Itô process X, defined in (11.9), as follows:

$\int_{0}^{t} Y (s) d X (s) ≔ \int_{0}^{t} Y (s) μ (s) d s + \int_{0}^{t} Y (s) σ (s) d W (s), t \geq 0.$

Note that this is like substituting the stochastic differential dX(s) = µ(s) ds + σ(s) dW(s) (given by (11.10)) into the left-hand integral and writing it as a sum of a Riemann and Itô integral. Note that in case the process is standard Brownian motion, i.e., X(t) = W(t) with µ ≡ 0, σ ≡ 1, we simply recover $\int_{0}^{t} Y (s) d W (s)$ , the stochastic integral of Y w.r.t. Brownian motion.

11.4.3 Quadratic (Co-) Variation

An important characteristic of a stochastic process is the quadratic variation that measures the accumulated variability of the process along its path. The quadratic variation is a path-dependent quantity. Recall that for Brownian motion we derived its quadratic variation on a time interval [0, t] as [W, W](t) = t. So Brownian motion accumulates quadratic variation at rate one per unit time. This gives us a simple differential “rule”:

$d [W, W] (t) \equiv d W (t) d W (t) \equiv {(d W (t))}^{2} = d t .$

A practical way of thinking about this result is to say that a Brownian increment is of order $O ({(d t)}^{1 / 2})$ as dt → 0. We essentially already used this fact in showing the non-differentiability of Brownian paths. We also saw that, formally, the quadratic variation of a continuously differentiable function f is zero. This fact is also realized by noting that $d [f, f] (t) = {(d f (t))}^{2} = {(f^{'} (t))}^{2} {(d t)}^{2} = O ({(d t)}^{2})$ is negligible as dt → 0.

One can prove that the quadratic variation of the Itô integral $I_{X} = \int_{0}^{t} X (s) d W (s)$ is

$[I_{X}, I_{X}] (t) = \int_{0}^{t} X^{2} (s) d s, t \geq 0. (11.11)$

So, the integral IX(t) accumulates quadratic variation at the (generally random) rate of X2(t) per unit time at every time t ≥ 0. That is, in differential form (11.11) gives us the “rule”:

$d [I_{X}, I_{X}] (t) \equiv d I_{X} (t) d I_{X} (t) \equiv {(d I_{X} (t))}^{2} = X^{2} (t) d t .$

Similarly, we can define the quadratic covariation of two processes:

$[X, Y] (t) = \lim_{δ (P_{n}) \to 0} \sum_{i = 1}^{n} (X (t_{i}) - X (t_{i - 1})) (Y (t_{i}) - Y (t_{i - 1})) . (11.12)$

Clearly, the quadratic covariation is a bilinear functional. Let us consider several examples.

Let X(t) be a continuously differentiable (C1(ℝ)) function that satisfies dX(t) = µX(t) dt and let Y(t) be an Itô process. Then, [X, Y](t) = 0 for t ≥ 0. In differential form, this fact reads as dX(t) dY(t) = 0. Since Brownian motion is itself an Itô process and the function X(t) = t belongs to C1(ℝ), we have [t, W](t) = 0. This last fact is recorded in differential form as the “rule”: dt dW(t) = 0, i.e., an infinitesimal time increment times an infinitesimal Brownian increment gives zero.
Proof.

For t ≥ 0,

$\begin{array}{l} | [X, Y] (t) | \leq \lim_{δ (P_{n}) \to 0} | \sum_{i = 1}^{n} (X (t_{i}) - X (t_{i - 1})) (Y (t_{i}) - Y (t_{i - 1})) | \\ \leq \underset{= 0 a .s .}{\underset{︸}{\lim_{δ (P_{n}) \to 0} \max_{1 \leq i \leq n} | Y (t_{i}) - Y (t_{i - 1}) |}} . \underset{= V_{X}^{(1)} (t) < \infty}{\underset{︸}{\lim_{δ (P_{n}) \to 0} \sum_{i = 1}^{n} | X (t_{i}) - X (t_{i - 1}) | = 0}} \end{array}$

Here we applied the Heine–Cantor theorem, which states that a continuous function (in this case Y is a.s. continuous) on a finite interval is uniformly continuous and the fact that sample paths of X have finite first variation.
2. The covariation of two Itô processes X and Y defined by
$X (t) = X (0) + \int_{0}^{t} μ_{X} (s) d s + \int_{0}^{t} σ_{X} (s) d W (s)$

and

$Y (t) = Y (0) + \int_{0}^{t} μ_{Y} (s) d s + \int_{0}^{t} σ_{Y} (s) d W (s)$

is given by

$[X, Y] (t) = \int_{0}^{t} σ_{X} (s) σ_{Y} (s) d s . (11.13)$

A simple (heuristic) way to arrive at this result is to make recourse to the simple differential rules. In particular, the two processes satisfy

$d X (t) = μ_{X} (t) d t + σ_{X} (t) d W (t) and d Y (t) = μ_{Y} (t) d t + σ_{Y} (t) d W (t) .$

Hence, by multiplying out all terms in the differentials, we have

$\begin{array}{l} X (t) d Y (t) & = & {μ_{X} (t) d t + σ_{X} (t) d W (t)} {μ_{Y} (t) d t + σ_{Y} (t) d W (t)} \\ = & μ_{X} (t) μ_{Y} (t) {(d t)}^{2} + (μ_{X} (t) σ_{Y} (t) + μ_{Y} (t) σ_{X} (t)) d t d W (t) \\ + σ_{X} (t) σ_{Y} (t) {(d W (t))}^{2} . \end{array}$

Now, using the rules (dt)2 ≡ 0, dt dW(t) ≡ 0 and (dW(t))2 = dt gives the differential form of the quadratic covariation in (11.13):

$d [X, Y] (t) \equiv d X (t) d Y (t) = σ_{X} (t) σ_{Y} (t) d t .$

An important application of quadratic covariation is the integration by parts formula given just below. Consider a sequence of partitions {Pn}n≥1 of [0, t] (such that δ(Pn) → 0, as n → ∞) and rewrite the sum of products of increments of X and Y on the partition Pn as follows:

$\begin{array}{l} \sum_{i = 1}^{n} (X (t_{i}) - X (t_{i - 1})) (Y (t_{i}) - Y (t_{i - 1})) = \underset{= X (t) Y (t) - X (0) Y (0)}{\underset{︸}{\sum_{i = 1}^{n} (X (t_{i}) Y (t_{i}) - X (t_{i - 1}) Y (t_{i - 1}))}} \\ \underset{\to \int_{0}^{t} X (s) d Y (s), as n \to \infty}{\underset{︸}{- \sum_{i = 1}^{n} X (t_{i - 1}) (Y (t_{i}) - Y (t_{i - 1}))}} - \underset{\to \int_{0}^{t} Y (s) d X (s), as n \to \infty}{\underset{︸}{\sum_{i = 1}^{n} Y (t_{i - 1}) (X (t_{i}) - X (t_{i - 1})) .}} \end{array}$

Thus, the quadratic covariation of two Itô processes is

$[X, Y] (t) = X (t) Y (t) - X (0) Y (0) - \int_{0}^{t} X (s) d Y (s) - \int_{0}^{t} Y (s) d X (s) .$

Alternatively, we write

$X (t) Y (t) - X (0) Y (0) = \int_{0}^{t} X (s) d Y (s) + \int_{0}^{t} Y (s) d X (s) + [X, Y] (t) . (11.14)$

In differential form this gives us the important Itô product rule:

$d (X (t) Y (t)) = X (t) d Y (t) + Y (t) d X (t) + d X (t) d Y (t) . (11.15)$

The reader should observe that the stochastic differential of a product of two processes does not obey the same differential product rule as in ordinary calculus. The extra term dX(t) dY(t) = d[X, Y](t) is the product of the two differentials, which is generally nonzero. In particular, if both processes X and Y are driven by a Brownian increment dW(t) then their paths are nondifferentiable and hence the quadratic covariation [X, Y](t) is nonzero. Later we shall see that the above product rule also follows as a special case of the Itô formula derived for smooth functions of two processes.

11.5 Itô's Formula for Functions of BM and Itô Processes

11.5.1 Itô's Formula for Functions of BM

The Itô formula is a stochastic chain rule that allows us to find stochastic differentials of functions of Brownian motion as well as functions of an Itô process. The ordinary chain rule written for two differentiable functions f and g is as follows:

$\frac{d}{d t} f (g (t)) = f^{'} (g (t)) g' (t)$ (derivative form);
df(g(t)) = f'(g(t)) dg(t) (differential form);
$f (g (t)) - f (g (0)) = \int_{0}^{t} f^{'} (g (s)) d g (s)$ (integral form).

However, we cannot immediately apply this rule to f(W(t)) since Brownian motion W has nondifferentiable sample paths. Assume that f has continuous derivatives of first, second, and higher orders. Consider the Taylor series expansion for a smooth function f about the value W(t):

$\begin{array}{l} f (W (t + δ t)) - f (W (t)) & = & f^{'} (W (t)) \underset{of order (δ t) \frac{1}{2}}{\underset{︸}{(W (t + δ t) - W (t))}} + \frac{1}{2} f^{″} (W (t)) \underset{of order δ t}{\underset{︸}{{(W (t + δ t) - W (t))}^{2}}} \\ + \frac{1}{6} f ″′ (W (t)) \underset{of order (δ t)^{\frac{3}{2}}}{\underset{︸}{{(W (t + δ t) - W (t))}^{3}}} + ..., \end{array}$

where δt is a small time increment. A heuristic argument that leads us to the simplest version of the Itô formula goes as follows. In the infinitesimal limit, we take δt → dt and W(t + δt) − W(t) → dW(t), and we neglect all terms of order (δt)3/2 and smaller (of higher power than 3/2 in δt) to obtain

$d f (W (t)) = f^{'} (W (t)) d W (t) + \frac{1}{2} f^{″} (W (t)) {(d W (t))}^{2} .$

By applying the simple rule (dW(t))2 = dt, we obtain the Itô formula for f(W(t)), which can be stated in the respective differential and integral forms:

$d f (W (t)) = \frac{1}{2} f^{″} (W (t)) d t + f^{'} (W (t)) d W (t), (11.16)$

$\int_{0}^{t} d f (W (s)) ≔ f (W (t)) - f (W (0)) = \frac{1}{2} \int_{0}^{t} f^{″} (W (s)) d s + \int_{0}^{t} f^{'} (W (s)) d W (s) . (11.17)$

This formula holds for any twice continuously differentiable function f ∈ C2(ℝ). The expression (11.17) tells us that f(W) ≔ {f(W(t))}t≥0 is an Itô process.

Only the skeleton of a proof of the Itô formula (11.17) is outlined below:

Proof.

Let Pn = {ti}1≤i≤n be a partition of [0, t]. Write f(W(t)) − f(W(0)) as a telescopic sum over points of Pn:

$f (W (t)) - f (W (0)) = \sum_{i = 1}^{n} (f (W (t_{i})) - f (W (t_{i - 1}))) .$

Now, apply Taylor's expansion formula to each term of the above sum:

$f (W (t_{i})) - f (W (t_{i - 1})) = f^{'} (W (t_{i - 1})) (W (t_{i}) - W (t_{i - 1})) + \frac{1}{2} f^{″} (θ_{i}) {(W (t_{i}) - W (t_{i - 1}))}^{2},$

where θi lies between W(ti-1) and W(ti) for i = 1, 2, . . . , n. By taking limits as n → ∞ and δ(Pn) → 0, the partial sums converge (in L2(Ω)) to the respective integrals:

$\begin{array}{l} \sum_{i = 1}^{n} f^{'} (W (t_{i - 1})) (W (t_{i}) - W (t_{i - 1})) \to \int_{0}^{t} f^{'} (W (s)) d W (s) (an It \hat{o} integral), \\ \sum_{i = 1}^{n} f^{″} (θ_{i}) {(W (t_{i}) - W (t_{i - 1}))}^{2} \to \int_{0}^{t} f^{″} (W (s)) d s (a Riemann integral) . \end{array}$

Example 11.4

Find the stochastic differential df(W(t)) for functions:

(a) f(x) = xn, n ∈ ℕ;
(b) f(x) = eαx, α ∈ ℝ.

Solution.

(a) Differentiating gives f′(x) = nxn−1, f″(x) = n(n − 1)xn−2. Thus, (11.17) with f(W(t)) = Wn(t), f(W(0)) = Wn(0) = 0 reads
$W^{n} (t) = \frac{n (n - 1)}{2} \int_{0}^{t} W^{n - 2} (s) d s + n \int_{0}^{t} W^{n - 1} (s) d W (s) .$

The differential form of the above representation is

$d W^{n} (t) = \frac{n (n - 1)}{2} W^{n - 2} (t) d t + n W^{n - 1} (t) d W (t) .$

By taking n = 2, we now also recover the well-known formula of the Itô integral of Brownian motion that we derived previously:

$W^{2} (t) = \int_{0}^{t} d s + 2 \int_{0}^{t} W (s) d W (s) \Rightarrow \int_{0}^{t} W (s) d W (s) = \frac{1}{2} W^{2} (t) - \frac{t}{2} .$

For n = 3, we have

$W^{3} (t) = 3 \int_{0}^{t} W (s) d s + 3 \int_{0}^{t} W^{2} (s) d W (s) .$

Note that the Itô integral $\int_{0}^{t} W^{2} (s) d W (s)$ is a (square-integrable) martingale with the property $\int_{0}^{t} E [W^{4} (s)] d s < \infty$ . Hence, the process defined by $Y (t) ≔ W^{3} (t) - 3 \int_{0}^{t} W (s) d s, t \geq 0$ , is a martingale w.r.t. any Brownian filtration. We have proven this fact earlier in Example 11.1, but now it follows simply by applying an appropriate Itô formula and from the martingale property of the Itô integral.
(b) Differentiating gives f′(x) = αf(x) and f"(x) = α2 f(x). Denote X(t) = f(W(t)) = eαW(t). Recall that X is a geometric Brownian motion (GBM). Now, by applying the Itô formula in (11.16) we have the stochastic differential of GBM:
$d X (t) = \frac{α^{2}}{2} X (t) d t + α X (t) d W (t) .$

There are various important extensions of the Itô formula. In particular, consider the case of a stochastic process defined by X(t) ≔ f(t, W(t)), t ≥ 0, where the function f(t, x) ∈ C1, 2, i.e., we assume that the functions $f_{t} (t, x) ≔ \frac{\partial f}{\partial t} (t, x), f_{x} (t, x) ≔ \frac{\partial f}{\partial x} (t, x), f_{t x} (t, x) ≔ \frac{\partial^{2} f}{\partial t \partial x} (t, x)$ , and $f_{x x} (t, x) ≔ \frac{\partial^{2} f}{\partial x^{2}} (t, x)$ are continuous. Let us heuristically apply a Taylor expansion to the differential df(t, W(t)) = f(t + dt, W(t) + dW(t)) − f(t, W(t)) and keep only terms up to second order in the Brownian increment dW(t) and first order in the time increment dt:

$\begin{array}{l} d f (t, W (t)) & = & f_{t} (t, W (t)) d t + f_{x} (t, W (t)) d W (t) \\ + f_{t x} (t, W (t)) d t d W (t) + \frac{1}{2} f_{x x} (t, W (t)) {(d W (t))}^{2} + \dots \end{array}$

By the simple rules we have

${(d W (t))}^{2} \equiv d t, {(d t)}^{2} \equiv 0, d t d W (t) \equiv 0.$

Collecting the coefficient terms in dt and dW(t), the differential and integral forms of the Itô formula for f(t, W(t)) are then respectively given by

$d f (t, W (t)) = (f_{t} (t, W (t)) + \frac{1}{2} f_{x x} (t, W (t))) d t + f_{x} (t, W (t)) d W (t), (11.18)$

$\begin{array}{l} f (t, W (t)) - f (0, W (0)) & = & \int_{0}^{t} (f_{u} (u, W (u)) + \frac{1}{2} f_{x x} (u, W (u))) d u \\ + \int_{0}^{t} f_{x} (u, W (u)) d W (u), (11.19) \end{array}$

for all 0 ≤ t ≤ T.

Example 11.5

Find the stochastic differential of the GBM process S(t) = S0eαt+σW(t), t ≥ 0, with constants S0 > 0, α, σ ∈ ℝ.

Solution. We represent S(t) = f(t, W(t)), where f(t, x) ≔ S0eαt+σx. Hence,

$f_{t} (t, x) = α f (t, x), f_{x} (t, x) = σ f (t, x), f_{x x} (t, x) = σ^{2} f (t, x) .$

Substituting these partial derivatives into the Itô formula (11.18) gives

$\begin{array}{l} d S (t) & = & (α f (t, W (t)) + \frac{σ^{2}}{2} f (t, W (t))) d t + σ f (t, W (t)) d W (t) \\ = & (α + \frac{σ^{2}}{2}) S (t) d t + σ S (t) d W (t) . \end{array}$

Note that, in the above example, if we put α = µ — σ2/2, with parameter µ ∈ ℝ, then $S (t) = S_{0} e^{(μ - σ^{2} / 2) t + σ W (t)}$ is a GBM satisfying the SDE

$d S (t) = μ S (t) d t + σ S (t) d W (t)$

with initial condition S(0) = S0 and it is an Itô process where

$S (t) = S_{0} + μ \int_{0}^{t} S (u) + σ \int_{0}^{t} S (u) d W (u) .$

Hence, $S (t) = S_{0} e^{(μ - σ^{2} / 2) t + σ W (t)}$ , for t ≥ 0, is an explicit solution to the above stochastic integral (or differential) equation whereby S(t) is explicitly given as (an exponential) function in the BM W(t) at time t. It is a martingale iff the drift coefficient µ = 0. In fact, in the previous chapter, we already proved (using different methods) that $M (t) ≔ e^{- \frac{σ^{2}}{2} t + σ W (t)}$ is a martingale. It follows that the discounted process by e−µt S(t) = M(t), t ≥ 0, is a martingale w.r.t. a filtration for BM.

11.5.2 Itô's Formula for Itô Processes

We are now ready to extend the Itô formula to the case of a process defined in terms of a smooth function of an Itô process and time t. Consider an Itô process {X(t)}0≤t≤T with the stochastic differential

$d X (t) = μ (t) d t + σ (t) d W (t) .$

As in the previous version of the Itô formula obtained above, assume that f(t, x) ∈ C1, 2. Then, Y(t) ≔ f(t, X(t)), 0 ≤ t ≤ T, is also an Itô process. To obtain its stochastic differential we apply a Taylor expansion and keep only terms up to second order in the increment dX(t) and first order in the time increment dt:

$d Y (t) = f_{t} (t, X (t)) d t + f_{x} (t, X (t)) d X (t) + \frac{1}{2} f_{x x} (t, X (t)) {(d X (t))}^{2} .$

Note that the mixed partial derivative term ftx(t, X(t))dt dX(t) = 0 since dt dX(t) = µ(t)(dt)2 + σ(t) dt dW(t) = 0 upon using the simple rules (dt)2 = 0, dt dW(t) = 0. Also, (dX(t))2 = σ2(t)dt, and inserting the differential for dX(t) into the above equation and combining all coefficients multiplying dt and dW(t) finally gives us the Itô formula in differential form:

$\begin{array}{l} d Y (t) \equiv d f (t, X (t)) = (f_{t} (t, X (t)) + μ (t) f_{x} (t, X (t)) + \frac{1}{2} σ^{2} (t) f_{x x} (t, X (t))) d t \\ + σ (t) f_{x} (t, X (t)) d W (t) . (11.20) \end{array}$

The integral form of this is

$\begin{array}{l} f (t, X (t)) - f (0, X (0)) = \int_{0}^{t} (f_{s} (s, X (s)) + μ (s) f_{x} (s, X (s)) + \frac{1}{2} σ^{2} (s) f_{x x} (s, X (s))) d s \\ + \int_{0}^{t} σ (s) f_{x} (s, X (s)) d W (s), (11.21) \end{array}$

for 0 ≤ t ≤ T.

Note that in case Y(t) = f(X(t)), i.e., f(t, x) = f(x) is not an explicit function of the time variable, then ft(t, x) ≡ 0 and all partial derivatives are simply ordinary derivatives: fx(t, x) = f'(x), fxx(t, x) = f"(x). The differential form of the Itô formula is $d f (X (t)) = f^{'} (X (t)) d X (t) + \frac{1}{2} f^{″} (X (t)) {(d X (t))}^{2}$ , i.e.,

$d f (X (t)) = (μ (t) f^{'} (X (t)) + \frac{1}{2} σ^{2} (t) f^{″} (X (t))) d t + σ (t) f^{'} (X (t)) d W (t) (11.22)$

and in integral form

$\begin{array}{l} f (X (t)) - f (X (0)) & = & \int_{0}^{t} (μ (s) f^{'} (X (s)) + \frac{1}{2} σ^{2} (s) f^{″} (X (s))) d s \\ + \int_{0}^{t} σ (s) f^{'} (X (s)) d W (s) . (11.23) \end{array}$

Observe that (11.16) and (11.17) are recovered by (11.22) and (11.23) in the special case where the Itô process is Brownian motion: X = W where µ ≡ 0, σ ≡ 1. Similarly, (11.18) and (11.19) are special cases of (11.20) and (11.21).

Example 11.6

Let Y(t) ≔ ln X(t), t ≥ 0, where {X(t)}t≥0 is an Itô process with stochastic differential

$d X (t) = a X (t) d t + b X (t) d W (t) .$

Find the SDE for the process Y and then find explicit representations for Y(t) and X(t) in terms of W(t).

Solution. In this case we define f(x) ≔ ln x, where Y(t) = f(X(t)). Differentiating gives $f^{'} (x) = \frac{1}{x}, f^{″} (x) = - \frac{1}{x^{2}}$ . Applying (11.22) with µ(t) = aX(t), σ(t) = bX(t) gives

$\begin{array}{l} d Y (t) & = & (a X (t) \frac{1}{X (t)} + \frac{1}{2} {(b X (t))}^{2} (\frac{- 1}{X^{2} (t)})) d t + b X (t) \frac{1}{X (t)} d W (t) \\ = & (a - \frac{b^{2}}{2}) d t + b d W (t) . \end{array}$

Integrating this equation therefore shows that the process Y is a drifted Brownian motion starting at Y(0) = ln X(0):

$Y (t) = \ln X (0) + (a - \frac{b^{2}}{2}) t + b W (t) .$

By inverting the transformation, we find the original process X(t) as a closed-form expression in W(t):

$X (t) = e^{Y (t)} = e^{ln X (0) + (a - \frac{b^{2}}{2}) t + b W (t)} = X (0) e^{(a - \frac{b^{2}}{2}) t + b W (t)} .$

An Itô integral $I (t) = \int_{0}^{t} σ (s) d W (s), t \geq 0$ , is a martingale provided that the stochastic integral is well-defined). However, a time integral $\int_{0}^{t} μ (s) d s$ is generally not a martingale. Thus, the Itô formula can be used to verify whether or not a stochastic process that is a function of an Itô process is a martingale.

Example 11.7

Verify whether or not the following processes are martingales w.r.t. a filtration for BM:

(a) $X (t) = Z^{2} (t) - \int_{0}^{t} f^{2} (u) d u$ , where $Z (t) = \int_{0}^{t} f (u) d W (u)$ and f is an ordinary continuous function for t ≥ 0;
(b) $Y (t) = V^{2} (t) - \frac{t^{2}}{2}$ , where $V (t) = \int_{0}^{t} W (u) d W (u)$ .

Solution.

(a) The process Z is Gaussian with the stochastic differential dZ(t) = f(t) dW(t) = µ(t) dt + σ(t) dW(t) where µ(t) ≡ 0 and σ(t) ≡ f(t). The process X is given by X(t) = g(t, Z(t)) with $g (t, x) ≔ x^{2} - \int_{0}^{t} f^{2} (u) d u$ . Taking derivatives of g:
$g_{t} (t, x) = - f^{2} (t), g_{x} (t, x) = 2 x, g_{x x} (t, x) \equiv 2.$

Applying (11.20) gives a stochastic differential with zero drift,

$\begin{array}{l} d X (t) & = & [g_{t} (t, Z (t)) + 0 \cdot g_{x} (t, Z (t)) + \frac{1}{2} f^{2} (t) g_{x x} (t, Z (t))] d t + f (t) g_{x} (t, Z (t)) d W (t) \\ = & (- f^{2} (t) + \frac{1}{2} (2 f^{2} (t))) d t + 2 f (t) Z (t) d W (t) = 2 f (t) Z (t) d W (t) . \end{array}$

In integral form, $X (t) = 2 \int_{0}^{t} f (u) Z (u) d W (u)$ since X(0) = 0. Thus, X(t) is an Itô integral. It satisfies the square-integrability condition

$E [{\int_{0}^{t} (f (u) Z (u))}^{2} d u] = \int_{0}^{t} f^{2} (u) E[Z^{2} (u)] d u < \infty$

since $E [Z^{2} (u)] = \int_{0}^{u} f^{2} (s) d s$ is a continuous function of u ≥ 0. Hence the process X is a martingale.
(b) First, find the stochastic differential of Y:
$d Y (t) = 2 V (t) d V (t) + {(d V (t))}^{2} - t d t = (W^{2} (t) - t) d t + 2 V (t) W (t) d W (t) .$

Thus, Y (t) is a sum of an Itô integral (which is a martingale) and a Riemann integral of a function of Brownian motion:

$Y (t) = 2 \int_{0}^{t} V (u) W (u) d W (u) + \int_{0}^{t} (W^{2} (u) - u) d u .$

Note that Y (0) = 0. Let us show that the Riemann integral above is not a martingale. As a first simple check, we can try to verify whether the expected value of $I (t) ≔ \int_{0}^{t} (W^{2} (u) - u)$ du is nonconstant over time:

$E[I (t)] = \int_{0}^{t} \underset{u}{\underset{︸}{(E [W^{2} (u)]}} - u) d u = 0$

for t ≥ 0. So the expectation is constant and we cannot yet conclude whether or not the process is a martingale. We hence need to necessarily calculate the conditional expectation to verify whether the process satisfies the martingale property. For t, s > 0, we have, upon using the martingale property of {W2(t) — t}t≥0:

$\begin{array}{l} E_{t} [I (t + s)] & = & I (t) + E_{t} [\int_{t}^{t + s} (W^{2} (u) - u) d u] = I (t) + \int_{t}^{t + s} E_{t} [(W^{2} (u) - u)] d u \\ = & I (t) + \int_{t}^{t + s} (W^{2} (t) - t) d u \\ = & I (t) + (W^{2} (t) - t) \int_{t}^{t + s} d u = I (t) + s (W^{2} (t) - t) \neq I (t) . \end{array}$

In conclusion, Et [Y (t + s)] ≠ Y (t) and hence the process Y is not a martingale.

11.6 Stochastic Differential Equations

An equation of the form of an Itô stochastic differential

$d X (t) = μ (t, X (t)) d t + σ (t, X (t)) d W (t) (11.24)$

where the coefficient drift µ(t, x) and volatility σ(t, x) are given (known) functions and X(t) is the unknown process is called a stochastic differential equation (SDE). Equations of this form are of great importance in financial modelling. In practice, (11.24) is subject to an initial condition X(0) = X0 where X0 is either a random variable or simply a constant X0 = x ∈ ℝ. As was mentioned in a previous section, an SDE of the type in (11.24), with constant X0, is also called a diffusion. We will study diffusions in some depth a little later in the text.

A process X is a so-called strong solution to the SDE in (11.24) if, for all t ≥ 0 (or t ∈ [0, T] if time is restricted to some finite interval [0, T]), the process satisfies

$X (t) = X (0) + \int_{0}^{t} μ (s, X (s)) d s + \int_{0}^{t} σ (s, X (s)) d W (s)$

where both integrals are assumed to exist. The randomness is completely driven by the underlying Brownian motion. So, in case σ ≡ 0 the equation is simply an ordinary first order ODE. It is important to note that a solution X(t) is an adapted process that is some representation or functional written in terms of the Brownian motion up to time t, i.e., X(t) = F(t, {W(s); 0 ≤ s ≤ t}). We have in fact already seen some cases (see Examples 11.5 and 11.6) where the solution X(t) = F(t, W(t)) is just a function of the Brownian motion at the endpoint time t. A strong solution hence also gives a path-wise representation of the process {X(t)}t≥0. In most cases strong solutions to SDEs cannot be found explicitly, although we can still compute a number of important properties of the process. An alternative and important type of solution is a so-called weak solution, which is a solution in distribution. We now turn our attention to so-called linear SDEs, as these form the simplest class of SDEs that have some applications in finance and for which a unique strong solution can be found explicitly.

11.6.1 Solutions to Linear SDEs

A linear SDE is an equation of the form

$d X (t) = (α (t) + β (t) X (t)) d t + (γ (t) + δ (t) X (t)) d W (t) (11.25)$

where the coefficients α(t), β(t), γ(t), δ(t) are given adapted processes. These are assumed to be continuous functions of time t. We note that they can simply be ordinary (nonrandom) functions of t or may also be random but not functions of the process X(t). When α(t), β(t), γ(t), δ(t) are non-random functions of time, then the process is a diffusion with linear SDE of the form dX(t) = a(t, X(t)) dt + b(t, X(t)) dW(t), with both coefficient functions being linear in the state variable: a(t, x) = α(t) + β(t)x and b(t, x) = γ(t) + δ(t)x. The stochastic equations considered in Examples 11.5 and 11.6 are simple linear SDEs. The nice thing about an SDE of the form (11.25) is that we have explicit solutions, as we now derive.

Equation (11.25) is readily solved by first considering the simpler case when α(t) ≡ γ(t) ≡ 0. Denoting the simpler process by U, the SDE in (11.25) takes the form

$d U (t) = β (t) U (t) d t + δ (t) U (t) d W (t) . (11.26)$

This SDE is now solved by considering the logarithm of the process, Y(t) ≔ ln U(t), and applying Itô's formula (see Example 11.6):

$d Y (t) = d ln U (t) = \frac{d U (t)}{U (t)} - \frac{1}{2} {(\frac{d U (t)}{U (t)})}^{2} = (β (t) - \frac{1}{2} δ^{2} (t)) d t + δ (t) d W (t) .$

Putting this SDE in integral form and using U(t) = eY(t) gives

$U (t) = U (0) \exp [\int_{0}^{t} (β (s) - \frac{1}{2} δ^{2} (s)) d s + \int_{0}^{t} δ (s) d W (s)] . (11.27)$

This solution is compactly written as a product: $U (t) = U (0) e^{\int_{0}^{t} β (s) d s} ℰ_{t} (δ \cdot W)$ , where we denote the stochastic exponential of an adapted process {δ(s), 0 ≤ s ≤ t}, w.r.t. BM on the time interval [0, t], by

$ε_{t} (δ \cdot W) ≔ exp [- \frac{1}{2} \int_{0}^{t} δ^{2} (s) d s + \int_{0}^{t} δ (s) d W (s)] . (11.28)$

Note that, by setting β(t) ≡ 0 in (11.26), the solution to the SDE

$d U (t) = δ (t) U (t) d W (t) subject to U (0) = 1$

is the stochastic exponential in (11.28). This type of process plays an important role in derivative pricing theory so we shall revisit it in Section 11.8, as well as its multidimensional version in Section 11.10. Note that when the coefficients β(t) and δ(t) are nonrandom functions of time, and U (0) is taken as a positive constant, the Itô integral $\int_{0}^{t} δ (s) d W (s) ~ N o r m (0, \int_{0}^{t} δ^{2} (s) d s)$ , i.e., is a Gaussian process, and $In \frac{U (t)}{U (0)} ~ N o r m (μ_{t}, υ_{t})$ with mean $μ_{t} = \int_{0}^{t} (β (s) - \frac{1}{2} δ^{2} (s)) d s$ and variance $ν_{t} = \int_{0}^{t} δ^{2} (s) d s$ . That is, U(t) is a lognormal random variable and hence {U(t)}t≥0 is a GBM process. In particular, for the case of constant coefficients we recover a GBM process as in Example 11.6. In more complicated general cases where δ(t) and β(t) are random variables (for example, functionals of BM up to time t) then the exponent in (11.27) is not a normal random variable and hence the process {U(t)}t≥0 is not a GBM.

Finally, the solution X(t) for the general linear SDE in (11.25) is now readily derived based on the solution to (11.26). The trick is to write it as a product, X(t) = U(t)V(t) where U(t) is given by (11.27) and hence satisfies the SDE in (11.26), and where V(t) satisfies the SDE:

$d V (t) = [\frac{α (t) - γ (t) δ (t)}{U (t)}] d t + \frac{γ (t)}{U (t)} d W (t) (11.29)$

with initial conditions chosen as U(0) = 1 and V(0) = X(0). By the Itô product formula (11.15) X(t) = U(t)V(t) satisfies dX(t) = U(t)dV(t) + V(t)dU(t) + dU(t) dV(t). Using (11.26) and (11.29) and by the usual rules we have that X(t) satisfies the SDE in (11.25) with initial condition U(0)V(0) = X(0), i.e., X(t) solves (11.25) with arbitrary initial condition X(0). An explicit representation for V(t) is obtained simply from the integral form of (11.29) with V(0) = X(0):

$V (t) = X (0) + \int_{0}^{t} \frac{α (s) - γ (s) δ (s)}{U (s)} d s + \int_{0}^{t} \frac{γ (s)}{U (s)} d W (s) . (11.30)$

Hence, the solution to the general linear SDE (11.25) is given by X(t) = U(t)V(t) where U(t) and V(t) are respectively given by (11.27) and (11.30) with U(0) = 1. That is,

$X (t) = U (t) (X (0) + \int_{0}^{t} [α (s) - γ (s) δ (s)] U^{- 1} (s) d s + \int_{0}^{t} γ (s) U^{- 1} (s) d W (s)) (11.31)$

where $U (t) = e^{\int_{0}^{t} (β (s) - \frac{1}{2} δ^{2} (s)) d s + \int_{0}^{t} δ (s) d W (s)} = e^{\int_{0}^{t} β (s) d s} ℰ_{t} (δ \cdot W)$ and $U^{- 1} (s) \equiv 1 / U (s) = e^{- \int_{0}^{s} β (μ) d u} ℰ_{s} (δ \cdot W), 0 \leq s \leq t$ .

Example 11.8

Solve the SDE

$d X (t) = (α - β X (t)) d t + σ d W (t)$

for all t ≥ 0, subject to X(0) = x with constants x, α, β, σ ∈ ℝ.

Solution. Note that the SDE is of the form in (11.25) with constant coefficients α(t) = α, β(t) = −β, γ(t) = σ, δ(t) ≡ 0. The expression in (11.28) simplifies to ∈t(δ · W) = ∈t(0) = 1 and U(t) = e−βt, U−1(s) = eβs. Substituting into (11.31) gives the solution

$\begin{array}{l} X (t) & = & e^{- β t} (x + α \int_{0}^{t} e^{β s} d s + σ \int_{0}^{t} e^{β s} d W (s)) \\ = & e^{- β t} x + \frac{α}{β} (1 - e^{- β t}) + σ e^{- β t} \int_{0}^{t} e^{- β s} d W (s) . (11.32) \end{array}$

By letting $X (t) ≔ f (t, Y (t)), Y (t) = \int_{0}^{t} e^{β s} d W (s)$ , where $f (t, y) ≔ e^{- β t} x + \frac{α}{β} (1 - e^{- β t}) + σ e^{- β t} y$ , and applying an Itô formula, the reader should verify that X(t) satisfies the above SDE. We note that another simple alternative way to arrive at this solution (without the use of (11.25)) is to define $\tilde{X} (t) : = e^{β t} X (t)$ , which has the effect of eliminating the state variable dependence in the SDE. Indeed, by applying an Itô formula we have $d \tilde{X} (t) = α e^{β t} d t + σ e^{β t} d W (t)$ , with coefficients independent of $\tilde{X} (t)$ . Integrating, with $\tilde{X} (0) = X (0) = x$ , gives the solution to $\tilde{X} (t)$ :

$\tilde{X} (t) = x + \frac{α}{β} (e^{β t} - 1) + σ \int_{0}^{t} e^{β s} d W (s) . (11.33)$

The solution in (11.32) follows by $X (t) = e^{- β t} \tilde{X} (t)$ .

In Example 11.8, the solution represented in (11.32) is a Gaussian process involving an Itô integral that is a normal random variable, i.e., X(t) = µt + σe−βtY(t) where $μ_{t} = E [X (t)] = e^{- β t} x + \frac{α}{β} (1 - e^{- β t})$ and $Y (t) ≔ \int_{0}^{t} e^{β s} d W (s)$ is a zero-mean normal random variable:

$Y (t) ~ N o r m (0, Var(Y (t)) = N o r m (0, \int_{0}^{t} e^{2 β s} d s) = N o r m (0, \frac{1}{2 β} (e^{2 β t} - 1)) .$

Hence, $X (t) ~ N o r m (μ_{t}, σ^{2} e^{- 2 β t} Var (Y (t))) = N o r m (e^{- β t} x + \frac{α}{β} (1 - e^{- β t}), \frac{σ^{2}}{2 β} (1 - e^{- 2 β t}))$ . Applying the formulae in (11.8) to (11.32) also gives us the covariance function cX(t, v) ≔ Cov(X(t), X(v)), 0 ≤ t ≤ v:

$\begin{array}{l} c_{X} (t, v) & = & σ^{2} e^{- β (t + v)} Cov (\int_{0}^{t} e^{β s} d W (s), \int_{0}^{v} e^{β s} d W (s)) \\ = & σ^{2} e^{- β (t + v)} \int_{0}^{t} e^{2 β s} d s = \frac{σ^{2}}{2 β} e^{- β (t + v)} (e^{2 β t} - 1) = \frac{σ^{2}}{2 β} (e^{- β (v - t)} - e^{- β (v + t)}) . \end{array}$

We can also represent the solution as a functional of the Brownian motion up to time t by making use of the Itô product rule: $d (e^{β t} W (t)) = β e^{β t} W (t) d t + e^{β t} d W (t) \Rightarrow e^{β t} W (t) = β \int_{0}^{t} e^{β s} W (s) d s + \int_{0}^{t} e^{β s} d W (s) \Rightarrow \int_{0}^{t} e^{β s} d W (s) = e^{β t} W (t) - β \int_{0}^{t} e^{β s} W (s) d s$ . Putting this last expression into (11.32) gives X(t) = F(t, {W(s);0 ≤ s ≤ t}) with functional F of the Brownian path defined by

$F (t, {W (s); 0 \leq s \leq t}) ≔ e^{- β t} x + \frac{α}{β} (1 - e^{- β t}) + σ W (t) - σ β \int_{0}^{t} e^{- β (t - s)} W (s) d s . (11.34)$

In the chapter on interest rate modelling we shall see that the above process, referred to as the Vasicek model for α, β, σ > 0, is among the simplest one used to model the instantaneous (short) interest rate. A natural extension of this model is to allow the coefficients to be time dependent functions. The resulting linear SDE is explicitly solved in the following example.

Example 11.9

Solve the SDE

$d X (t) = (a (t) - b (t) X (t)) d t + σ (t) d W (t)$

subject to X(0) = x ∈ ℝ, where a(t), b(t), σ(t) are nonrandom continuous functions of time t ≥ 0.

Solution. The SDE is of the form in (11.25) with coefficient functions α(t) = a(t), β(t) ≡ −b(t), γ(t) ≡ σ(t), δ(t) = 0. Hence, ∈t(δ · W) = ∈t(0) = 1 and $U (t) = e^{- \int_{0}^{t} b (s) d s}, U^{- 1} (s) = e^{\int_{0}^{s} b (u) d u}$ . Substituting into (11.31) gives us the explicit solution

$\begin{array}{l} X (t) & = & e^{- \int_{0}^{t} b (u) d u} (x + \int_{0}^{t} e^{\int_{0}^{a} b (u) d u} a (s) d s + \int_{0}^{t} e^{\int_{0}^{a} b (u) d u} σ (s) d W (s)) \\ = & x e^{- \int_{0}^{t} b (u) d u} + \int_{0}^{t} e^{- \int_{s}^{t} b (u) d u} a (s) d s + {\int_{0}^{t} e}^{- \int_{s}^{t} b (u) d u} σ (s) d W (s) . (11.35) \end{array}$

The above process is Gaussian since the integrand in the Itô integral is an ordinary (nonrandom) function of time. By applying Itô isometry on the Itô integral in (11.35) we have that X(t) is normally distributed with mean and variance:

$E[X (t)] = x e^{- \int_{0}^{t} b (u) d u} + \int_{0}^{t} e^{- \int_{a}^{t} b (u) d u} a (s) d s, (11.36)$

$Var (X (t)) = E [{(\int_{0}^{t} e^{- \int_{a}^{t} b (u) d u} σ (s) d W (s))}^{2}] = \int_{0}^{t} e^{- 2 \int_{a}^{t} b (u) d u} σ^{2} (s) d s . (11.37)$

The covariance function for this process follows by using (11.8) on the first line in (11.35) after subtracting the mean (for t ≤ v):

$\begin{array}{l} c_{X} (t, v) & = & e^{- \int_{0}^{t} b (u) d u} e^{- \int_{0}^{v} b (u) d u} Cov (\int_{0}^{t} e^{\int_{0}^{s} b (u) d u} σ (s) d W (s), \int_{0}^{v} e^{\int_{0}^{s} b (u) d u} σ (s) d W (s)) \\ = & e^{- 2 \int_{0}^{t} b (u) d u} e^{- \int_{t}^{v} b (u) d u} \int_{0}^{t} e^{2 \int_{0}^{s} b (u) d u} σ^{2} (s) d s \\ = & e^{- 2 \int_{t}^{v} b (u) d u} \int_{0}^{t} e^{- 2 \int_{s}^{t} b (u) d u} σ^{2} (s) d s . \end{array}$

11.6.2 Existence and Uniqueness of a Strong Solution of an SDE

An important question when finding a strong solution to an SDE of the form given by (11.24) is whether such a solution exists and, if so, whether the solution is unique. The following theorem gives sufficient conditions for the existence and uniqueness of a strong solution to the SDE in (11.24). We omit the proof of this theorem as the technical details can be found in other more specialized textbooks on stochastic analysis. The conditions in the theorem are not necessary, but are rather mild sufficient conditions that guarantee the existence a unique strong solution, i.e., that there is a unique process {X(t)}t≥0 satisfying (11.24).

Theorem 11.4.

Assume the following conditions are satisfied:

The coefficient functions µ(t, x) and σ(t, x) are locally Lipschitz in x, uniformly in t. That is, for arbitrary positive constants T and N, there exists a constant K depending possibly only on T and N such that
$| μ (t, x) - μ (t, y) | + | σ (t, x) - σ (t, y) | < K | x - y |,$

whenever |x|, |y| ≤ N and 0 ≤ t ≤ T.
The coefficient functions µ (t, x) and σ(t, x) satisfy the linear growth condition in the variable x, i.e.,
$| μ (t, x) | + | σ (t, x) | \leq K (1 + | x |) .$
The initial value X(0) is independent of the Brownian motion up to arbitrary time T and has a finite second moment, i.e., X(0) is independent of $ℱ_{T}^{W} \equiv σ (W (t); 0 \leq t \leq T)$ and E[X2(0)] < ∞.

Then, the SDE in (11.24) has a unique strong solution {X(t)}t≥0 with continuous paths X(t, ω), t ≥ 0.

For a given SDE, the conditions in the above theorem can be readily checked. For example, the first (Lipschitz) condition holds if the coefficient functions have continuous first partial derivatives $\frac{\partial μ}{\partial x} (t, x)$ and $\frac{\partial σ}{\partial x} (t, x)$ . The second condition is satisfied when the coefficient functions µ(t, x) and σ(t, x) have at most a linear growth in x for large values of x and are also bounded for arbitrarily small values of x. In most cases the SDE is subject to a constant initial condition X(0) ∈ ℝ so that the above third condition is automatically satisfied. In the case of a general linear SDE we have already shown that the solution is given by (11.31). This includes cases when the coefficients α(t), β(t), γ(t), δ(t) are bounded nonrandom functions of time. In these cases, µ(t, x) and σ(t, x) are linear functions of x, for all t, and hence the conditions in the above theorem are indeed satisfied and so a unique strong solution exists. In Examples 11.8 and 11.9 we have solved the SDE and found the unique strong solution as given by (11.32) and (11.35), respectively. We note that unique strong solutions also exist when the coefficient functions in the SDE satisfy milder conditions than those listed in the above theorem. For instance, for a time-homogeneous SDE with (time-independent) drift and volatility coefficients µ(x) and σ(x), it can be shown that there exists a unique strong solution when µ(x) satisfies a Lipschitz condition, |µ(x) − µ(y)| < K|x − y|, and σ(x) satisfies a Holder condition, |σ(x) − σ(y)| < K|x − y|α, with order α ≥ 1/2, for some constant K.

11.7 The Markov Property, Feynman–Kac Formulae, and Transition CDFs and PDFs

We have already shown that Brownian motion is a Markov process. Generally, a process {X(t)}t≥0 has the Markov property if the probability of the event {X(t) ≤ y}, y ∈ ℝ, conditional on all the past information Fs (i.e., the information about the complete history of the process up to a prior time s) is the same as its probability conditional on only knowing the process endpoint value X(s), for all 0 ≤ s ≤ t. That is, the Markov property can be formally stated equivalently as

$ℙ (X (t) \leq y | ℱ_{s}) = ℙ (X (t) \leq y | X (s)) (11.38)$

or, for any Borel function h : ℝ → ℝ,

$E[h (X (t)) | ℱ_{s}] = E[h (X (t)) | X (s)], (11.39)$

for all 0 ≤ s ≤ t. Note that the specific choice of h(x) ≔ ?x≤y recovers (11.38). Hence, when the Markov property holds, conditioning on natural filtration Fs = σ(X(u); 0 ≤ u ≤ s) is equal to conditioning on σ(X(s)). In particular, for the case of a discrete-time process X(t0), X(t1), . . . , X(tn−1), X(tn), . . ., with times t0 < t1 < . . . < tn−1 < tn < . . ., the above property takes the form that is familiar in the theory of discrete-time Markov chains:

$ℙ (X (t_{m}) \leq y | X (t_{0}), X (t_{1}), ..., X (t_{n - 1}), X (t_{n})) = ℙ (X (t_{m}) \leq y | X (t_{n})) (11.40)$

for all m ≥ n, i.e., when conditioning on the values of the process for a set {ti}0≤i≤n of previous times, the only conditioning that is relevant is the value of the process at the most recent time tn. We note that this follows from (11.38) by setting s = tn, t = tm, i.e., Fs ≡ Ftn = σ(X(t0), X(t1), . . ., X(tn−1), X(tn)) and σ(X(s)) = σ(X(tn)), and then using the usual shorthand notation ℙ(A | σ(Y1,. . . , Yn)) ≡ ℙ(A | Y1, . . . , Yn) for expressing the probability of an event A conditional on a σ-algebra generated by a set of random variables Y1,. . . , Yn.

We are interested in computing conditional expectations involving functions of a process X = {X(t)}t≥0 ∈ ℝ that solves a given SDE as in (11.24) subject to some initial condition X(0) = x. In particular, we will need to compute expectations as in (11.39) for the case that the process X is Markov. Let us now fix some time T ≥ 0. Then, we shall denote the conditional expectation of a function h(X(T)), conditioned on the process having a given value X(t) = x ∈ ℝ at a time t ≤ T, by

$E_{t, x} [h (X (T))] ≔ E [h (X (T)) | X (t) = x] .$

So the subscript x, t is shorthand notation for conditioning on a given value X(t) = x of the process at time t. It should be clear that this conditional expectation is an ordinary (nonrandom) function of the ordinary variables x and t, i.e., Et, x[h(X(T))] = g(t, x) for any fixed T. Therefore, the Markov property in (11.39) is expressible as E[h(X(T)) | Ft] = Et, X(t)[h(X(T))] = g(t, X(t)), for all 0 ≤ t ≤ T. This expectation is now the random variable given by the function g(t, X(t)) of the random variable X(t) and evaluates to g(t, x) upon setting X(t) = x. Hence, if we know the conditional probability distribution of random variable X(T), given X(t) = x, then we could compute g(t, x). That is, assume the conditional probability density function (PDF) of X(T), given X(t), exists. Recall from the previous chapter that this PDF is the transition PDF for the process X. From Definition 10.2, we have

$E_{t, x} [h (X (T))] = \int_{ℝ} h (y) f_{X (T) | X (t)} (y | x) d y = \int_{ℝ} h (y) p (t, T; x, y) d y . (11.41)$

In some cases this integral can be computed analytically. Otherwise, we need to employ a numerical method. For example, if the expression for the PDF is known, then we can compute the integral using an appropriate numerical quadrature algorithm. Monte Carlo methods can generally be used to compute the above integral by sampling (i.e., simulating) the paths of the process at time T given their fixed value x at time t. Different simulation approaches may be applied. One approach is to use a time-stepping algorithm for simulating the paths according to the SDE. Alternatively, if the transition PDF is known, then we can sample the (path endpoint) value X(T) according to its distribution. For details on these techniques, we refer the reader to the chapter on Monte Carlo methods.

The next theorem tells us that solutions to an SDE are Markov processes.

Theorem 11.5.

Let {X(t)}t≥0 be a solution to the SDE in (11.24) with some given initial condition. Then,

$E [h (X (T)) | ℱ_{t}] = g (t, X (t)) (11.42)$

where g(t, x) = Et, X[h(X(T))], for all 0 ≤ t ≤ T and Borel function h.

A rigorous proof of this result is beyond the scope of this text. The important content of this result is that the expectation of any function of the process at a future time T ≥ t, conditional on the filtration (or path history) at time t, is given simply by its expectation conditional only on the path value at time t. In practice, we can apply this theorem by first computing Et, x[h(X(T))], and then putting X(t) in the place of variable x.

A simple nonrigorous, yet instructive, argument that leads us to the fact that the solution to an SDE has the Markov property is to let T = t + Δt, for a small time step Δt ≈ 0. Then, by the integral form of the SDE:

$X (t + Δ t) = X (t) + \int_{t}^{t + Δ t} μ (s, X (s)) d s + \int_{t}^{t + Δ t} σ (s, X (s)) d W (s) .$

This expresses the value of the process at future time t + Δt in terms of its value at any current time t plus an ordinary integral and an Itô integral. For small Δt, the integrals are well approximated by holding the integrand coefficient functions constant and evaluated at the left endpoint s = t of the time interval [t, t + Δt]. This gives us the approximation X(t + Δt) ≈ X(t) + µ(t, X(t))Δt + σ(t, X(t))ΔW(t), where ΔW (t) ≡ W(t + Δt) − W(t). Hence, the left-hand side of (11.38), for T = t + Δt, is approximated by

$ℙ (X (t + Δ t) \leq y | ℱ_{t}) \approx ℙ (X (t) + μ (t, X (t)) Δ t + σ (t, X (t)) Δ W (t) \leq y | ℱ_{t}) = g (X (t))$

where g(x) = ℙ(x + µ(t, x)Δt + σ(t, x)ΔW(t) ≤ y). Here we used the independence proposition where the Brownian increment ΔW(t) is independent of Ft and X(t) is Ft-measurable. We can now put back the conditioning on X(t) in the unconditional probability since ΔW (t) is independent of X(t), so the function g(x) is equally given by the conditional probability: g(x) = ℙ(x + µ(t, x)Δt + σ(t, x)ΔW(t) ≤ y | X(t)) ≈ ℙ(X(t + Δt) ≤ y | X(t)). Hence, we recover (approximately) the Markov property in (11.38), ℙ(X(t + Δt) ≤ y | Ft) ≈ ℙ(X(t + Δt) ≤ y | X(t)). In the above theorem this relation holds exactly.

By the Markov property of the process X, we also have the following martingale property for a process defined via an expectation of some function of X(T) conditional on X(t).

Proposition 11.6.

Let {X(t)}t≥0 satisfy the SDE in (11.24) subject to some initial condition. Let ϕ : ℝ → ℝ be a Borel function and define f(t, x) ≔ Et, X[ϕ(X(T))] for fixed T > 0, assuming Et, X[|ϕ(X(T))|] < ∞. Then, the stochastic process

$Y_{t} ≔ f (t, X (t)), 0 \leq t \leq T,$

is a martingale w.r.t. any filtration {Ft}t≥0 for Brownian motion.

Proof. Let 0 ≤ s ≤ t ≤ T. Note that, based on Theorem 11.5,

$E [ϕ (X (T)) | ℱ_{t}] = E_{t, X (t)} [ϕ (X (T))] = f (t, X (t)) = Y_{t}$

for any 0 ≤ t ≤ T. Using this relation and applying the tower property gives the martingale expectation property:

$E [Y_{t} | ℱ_{s}] = E [E [ϕ (X (T)) | ℱ_{t}] | ℱ_{s}] = E [ϕ (X (T)) | ℱ_{s}] = f (s, X (s)) = Y_{s} .$

Based on the Markov property of any solution to an SDE, we are now ready to discuss the very important connection that exists between an SDE and a PDE. In what follows we will find it very convenient to make use of the differential operator G defined by

$G_{t, x} f (t, x) ≔ \frac{1}{2} σ^{2} (t, x) \frac{\partial^{2} f}{\partial x^{2}} (t, x) + μ (t, x) \frac{\partial f}{\partial x} (t, x) . (11.43)$

This differential operator in the variables (t, x) is the so-called generator for the process X and it acts on all functions f ∈ C1, 2, i.e., having continuous partial derivatives $\frac{\partial f}{\partial t}, \frac{\partial f}{\partial x}$ and $\frac{\partial^{2} f}{\partial x^{2}} .$ Using this operator we can now rewrite the differential and integral forms of the Itô formula in (11.20) and (11.21) as

$d f (t, X (t)) = (\frac{\partial}{\partial t} + G_{t, x}) f (t, X (t)) d t + σ (t, X (t)) \frac{\partial f}{\partial x} (t, X (t)) d W (t) . (11.44)$

and

$f (t, X (t)) = f (0, X (0)) + \int_{0}^{t} (\frac{\partial}{\partial s} + G_{s, x}) f (s, X (s)) d s + \int_{0}^{t} σ (s, X (s)) \frac{\partial f}{\partial x} (s, X (s)) d W (s) . (11.45)$

Fix a time T > 0. If we assume that the Itô integral in (11.45) is a martingale, for 0 ≤ t ≤ T, whereby the square integrability condition holds, i.e., assuming

$\int_{0}^{T} E [{(σ (s, X (s)) \frac{\partial f}{\partial x} (s, X (s)))}^{2}] d s < \infty, (11.46)$

we have a useful representation of a martingale Markov process. In particular, the process {Mf(t), 0 ≤ t ≤ T} defined by

$M_{f} (t) ≔ f (t, X (t)) - \int_{0}^{t} (\frac{\partial}{\partial s} + G_{s, x}) f (s, X (s)) d s (11.47)$

is a martingale. To see this, first note that

$E [\int_{t}^{T} σ (s, X (s)) \frac{\partial f}{\partial x} (s, X (s)) d W (s) | ℱ_{t}] = 0,$

since the Itô integral is a martingale. Using the Itô formula (11.45) for times t and T:

$f (T, X (T)) = f (t, X (t)) + \int_{t}^{T} (\frac{\partial}{\partial s} + G_{s, x}) f (s, X (s)) d s + \int_{t}^{T} σ (s, X (s)) \frac{\partial f}{\partial x} (s, X (s)) d W (s) .$

Taking expectations, conditional on Ft, on both sides of this equation while using the above (zero expectation) relation gives

$E [f (T, X (T)) | ℱ_{t}] = f (t, X (t)) + E [\int_{t}^{T} (\frac{\partial}{\partial s} + G_{s, x}) f (s, X (s)) d s | ℱ_{t}] .$

Now, writing the integral appearing inside the last expectation as the difference

$\int_{0}^{T} (\frac{\partial}{\partial s} + G_{s, x}) f (s, X (s)) d s - \int_{0}^{t} (\frac{\partial}{\partial s} + G_{s, x}) f (s, X (s)) d s$

and using the fact that the [0, t]-integral is Ft-measurable and rearranging terms we obtain

$\begin{array}{l} E [f (T, X (T)) - \int_{0}^{T} (\frac{\partial}{\partial s} + G_{s, x}) f (s, X (s)) d s | ℱ_{t}] \\ = f (t, X (t)) - \int_{0}^{t} (\frac{\partial}{\partial s} + G_{s, x}) f (s, X (s)) d s . \end{array}$

By the definition in (11.47) we therefore have shown the martingale property:

$E [M_{f} (T) | ℱ_{t}] = M_{f} (t), 0 \leq t \leq T . (11.48)$

One important consequence of this martingale property is the following theorem, which shows that a solution to certain parabolic PDEs (which we will later see are closely related to the Black–Scholes PDE) can be represented as a conditional expectation.

Theorem 11.7

(Feynman–Kac). Given a fixed T > 0, let {X(t)}t≥0 satisfy the SDE in (11.24) and let ϕ : ℝ → ℝ be a Borel function. Moreover, assume the square integrability condition (11.46) holds. Let f(t, x) be a C1, 2 function solving the PDE $\frac{\partial f}{\partial t} + G_{t, x} f = 0,$ i.e.,

$\frac{\partial f}{\partial t} (t, x) + \frac{1}{2} σ^{2} (t, x) \frac{\partial^{2} f}{\partial x^{2}} (t, x) + μ (t, x) \frac{\partial f}{\partial x} (t, x) = 0, (11.49)$

for all x, 0 < t < T, subject to the condition f(T, x) = ϕ(x). Then, assuming that Et, x[|ϕ(X(T))|] < ∞, f(t, x) has the representation

$f (t, x) = E_{t, x} [ϕ (X (T))] \equiv E [ϕ (X (T)) | X (t) = x] (11.50)$

for all x, 0 ≤ t ≤ T.

We note that if ϕ(x) is continuous, then f(T−, x) ≡ limt↗T f(t, x) = f(T, x) = ϕ(x), i.e., we have continuity of the solution at t = T, for all x.

Proof. Assuming the square integrability condition (11.46), then according to the above discussion we have that the process defined in (11.47) satisfies the martingale property in (11.48). Now, let f satisfy the PDE in (11.49). This implies that the integral in (11.47) vanishes since the integrand function is identically zero, i.e., $\frac{\partial}{\partial s} f (s, x) + G_{s, x} f (s, x) = 0$ , for all s > 0 and all values of x. Hence, the process Mf(t) : = f(t, X(t)), 0 ≤ t ≤ T, is a martingale. Combining this with the Markov property of the process, and finally substituting the terminal condition for the random variable f(T, X(T)) = ϕ(X(T)), we have

$f (t, X (t)) = E [f (T, X (T)) | ℱ_{t}] = E_{t, X (t)} [f (T, X (T))] = E_{t, X (t)} [ϕ (X (T))]$

so that f(t, x) = Et, x [ϕ(X(T))] for all x, 0 ≤ t ≤ T.

This theorem hence shows that the solution at current time t, given by (11.50), is in the form of a conditional expectation of the random variable ϕ(X(T)), where ϕ is the given boundary value function and X(T) is the random variable corresponding to the endpoint value of the process at future (terminal) time T, where the process solves the SDE in (11.24) subject to it having current time-t value X(t) = x. From our previous discussion surrounding (11.41), we see that this theorem gives us a probabilistic representation of the solution to the parabolic PDE in (11.49) subject to the (terminal time) boundary value function f(T, x) = ϕ(x). Alternatively, the theorem can be used in the opposite sense; that is, it provides a PDE approach for evaluating a conditional expectation of the form in (11.50).

We now consider the simplest example of how this theorem is used in the case where the underlying Itô process is just standard Brownian motion. In particular, we solve the simple heat equation on the real line and thereby obtain a probabilistic representation of the solution as an expectation involving the endpoint value of Brownian motion.

Example 11.10

Solve the boundary value problem:

$\frac{\partial f}{\partial t} + \frac{1}{2} \frac{\partial^{2} f}{\partial x^{2}} = 0,$

for x ∈ ℝ, 0 ≤ t ≤ T, subject to f(x, T) = ϕ(x) where ϕ is an arbitrary function. Give the explicit solution for ϕ(x) = x2.

Solution. Observe that this PDE is of the same form as in (11.49) with coefficient functions σ(t, x) = 1, µ(t, x) = 0. The corresponding SDE in (11.24) is then

$d X (t) = d W (t) .$

This trivial linear SDE has the solution X(t) = x0 + W(t). As seen below, the end result does not depend on x0 since X(T) − X(t) = W(T) − W(t). Using (11.50) and the fact that W(T) − W(t) and W(t) are independent, the solution takes the equivalent forms

$\begin{array}{l} f (t, x) & = & E [ϕ (X (T)) | X (t) = x] = E [ϕ (W (T) - W (t) + X (t)) | X (t) = x] \\ = & E [ϕ (W (T) - W (t) + x)] \\ = & E [ϕ (W (T)) | W (t) = x] \\ = & \int_{- \infty}^{\infty} ϕ (y) p (t, T; x, y) d y, for t < T, \end{array}$

where $p (t, T; x, y) ≔ \frac{e^{- \frac{{(y - x)}^{2}}{2 (T - t)}}}{\sqrt{2 π (T - t)}}$ is the transition PDF of standard Brownian motion. The last line follows immediately from (10.13). Note that, for t = T, the boundary value condition is satisfied where f(T, x) = E[ϕ(W(T) − W(T) + x)] = E[ϕ(x)] = ϕ(x).

For ϕ(x) = x2, we simply have

$\begin{array}{l} f (t, x) & = & E [{(W (T) - W (t) + x)}^{2}] \\ = & E [{(W (T) - W (t))}^{2}] + 2 x E [W (T) - W (t)] + x^{2} = (T - t) + x^{2}, \end{array}$

for all 0 ≤ t ≤ T. Note that the terminal condition is satisfied, f(T, x) = x2, and this function satisfies the above PDE since $\frac{\partial f}{\partial t} + \frac{1}{2} \frac{\partial^{2} f}{\partial x^{2}} = - 1 + \frac{1}{2} (2) = 0.$

We note that the square integrability condition in Theorem 11.7 can be shown to hold in the above example. Moreover, f(t, x) is a C1, 2 function since p(t, T; x, y) is a C1, 2 function in the (t, x) variables for t < T. In the case ϕ(x) = x2, we see that this follows trivially. More generally, assuming an arbitrary ϕ function such that the above y-integral exists, we can verify that the above integral represents a solution to the PDE. Indeed, the corresponding linear differential operator in (11.43) is now $G_{t, x} f ≔ \frac{1}{2} \frac{\partial^{2} f}{\partial x^{2}},$ and reversing the order of differentiation and integration (w.r.t. the y variable) we have, for all t < T:

$(\frac{\partial}{\partial t} + G_{t, x}) f (t, x) = \int_{- \infty}^{\infty} ϕ (y) (\frac{\partial}{\partial t} + G_{t, x}) p (t, T; x, y) d y = 0,$

since (as can be verified explicitly and directly) the above PDF p = p(t, T; x, y) solves the PDE $\frac{\partial p}{\partial t} + G_{t, x} p = 0$ . For continuous ϕ(x), the solution is also continuous w.r.t. time t ∈ [0, T] where f(T−, x) = f(T, x) = ϕ(x). This is the case as the transition PDF approaches the Dirac delta function centred at zero, denoted by δ(·), as t ↗ T, i.e.,

$p (T -, T; x, y) \equiv \lim_{t ↗ T} p (t, T; x, y) = \lim_{T ↘ 0} \frac{e^{- \frac{{(y - x)}^{2}}{2 T}}}{\sqrt{2 π T}} = δ (x - y) .$

Here we have used one representation of the Dirac delta function as the limit of an infinites-imally narrow Gaussian PDF. The Dirac delta function is even, δ(x − y) = δ(y − x), and has the defining (sifting) property:

$f (T -, x) = \int_{ℝ} p (T -, T; x, y) ϕ (y) d y = \int_{ℝ} δ (x - y) ϕ (y) d y = ϕ (x),$

for any function ϕ that is continuous at x. The above delta function terminal condition is a general property of any transition PDF, as shown in Proposition 11.8 below. The above sifting property arises naturally by the Dirac measure defined in (9.29). Viewed as a distribution over ℝ, the only outcome that occurs with probability one is the single point x. The Dirac delta function can then be related to the Dirac (singular) measure δx(y) for a given point x ∈ ℝ, dδx(y) = δ(y − x)dy. Formally, the Dirac delta function δ(x) is also related to the Heaviside unit step function H(x), where H(x) equals 1 for x > 0, equals 0 for x < 0 and equals 1/2 at x = 0. In particular, its derivative is the delta function, H'(x) = δ(x). As a function of y, we write the differential dH(y − x) = H'(y − x) dy = δ(y − x) dy. Hence, when considered as a Riemann–Stieltjes integral with integrator H(y − x), a function ϕ(y) that is continuous at the point y = x will exhibit the sifting property:

$\int_{I} ϕ (y) δ (y - x) d y \equiv \int_{I} ϕ (y) d H (y - x) = ϕ (x)$

if x is in any interval I ⊂ ℝ and the integral is zero if x ∈ I. Note that when I = ℝ the integral equals ϕ(x). The reader will note that exactly the same property is satisfied if we use the unit indicator function ?{y≥x as integrator in the place of H(y − x). The two are equivalent for all y ≢ x and they are used as alternate definitions of the unit step function.

We can now use Theorem 11.7 to obtain the backward Kolmogorov PDE (in the so-called backward-time variables t, x) that is solved by any transition CDF, P(t, T; x, y) ≔ ℙ(X(T) ≤ y | X(t) = x), and hence, its corresponding transition PDF for a diffusion process with SDE in (11.24).

Proposition 11.8.

Assume the square-integrability condition in Theorem 11.7 holds. Then, a transition PDF, p = p(t, T; x, y), for the process {X(t)}t≥ with the generator in (11.43) solves the backward Kolmogorov PDE:

$(\frac{\partial}{\partial t} + G_{t, x}) p = 0, (11.51)$

where $\lim_{t ↗ T} p (t, T; x, y) \equiv p (T -, T; x, y) = δ (x - y)$ .

Proof. We begin by writing the transition probability function as a conditional expectation:

$P (t, T; x, y) = ℙ (X (T) \leq y | X (t) = x) = E [I_{{X (T) \leq y}} | X (t) = x] .$

By the above Feynman–Kac theorem, then P (for fixed T, y) solves the PDE

$(\frac{\partial}{\partial t} + G_{t, x}) P (t, T; x, y) = 0,$

with terminal condition P(T, T; x, y) = ℙ(y ≥ x) = ?{y≥x} ≡ ϕ (x). Taking partial derivatives w.r.t. y on both sides of the above PDE, and using the fact that the order of the differential operators $(\frac{\partial}{\partial t} + G_{t, x})$ and $\frac{\partial}{\partial y}$ can be reversed, gives

$(\frac{\partial}{\partial t} + G_{t, x}) \frac{\partial}{\partial y} P (t, T; x, y) = 0.$

This is exactly (11.51) since the transition PDF $p (t, T; x, y) ≔ \frac{\partial}{\partial y} P (t, T; x, y)$ . The delta function terminal condition is seen to arise as follows, since the transition function approaches the unit step function as t ↗ T,

$\begin{array}{l} d P (T -, T; x, y) & = & d H (y - x) = δ (y - x) d y \\ = & p (T -, T; x, y) d y . \end{array}$

We remark that any function p (or P) that is a solution to the backward Kolmogorov PDE and is a conditional density (or distribution) function of some Markov process, as a diffusion or Itô process, is a transition PDF (or CDF). In fact, a transition PDF p = p(t, T; x, y) is called a fundamental solution to the Kolmogorov PDE in (11.51) and its defining properties are that: (i) p is nonnegative, jointly continuous in the variables t, T; x, y, twice continuously differentiable in the spatial variables and continuously differentiable in the time variables; (ii) for any bounded Borel function ϕ then the function defined by u(t, x) : = ∫ℝ ϕ(y)p(t, T; x, y) dy is bounded and also satisfies the same Kolmogorov PDE; (iii) for continuous ϕ, limt↗T u(t, x) = u(T−, x) = ϕ(x) for all x. Property (iii) is equivalent to the Dirac delta function limit, limt↗T p(t, T; x, y) = p(T−, T; x, y) = δ(x − y). Hence, generally, if given a transition PDF p, the conditional expectation in (11.50), i.e.,

$f (t, x) = \int_{ℝ} ϕ (y) p (t, T; x, y) d y, (11.52)$

solves the backward Kolmogorov PDE in (11.49) with terminal condition f(T, x) = ϕ(x). We showed this specifically for the simple case of Brownian motion in the above example.

Example 11.11

Consider a GBM process {S(t)}t≥0 ∈ ℝ+ with SDE

$d S (t) = μ S (t) d t + σ S (t) d W (t),$

where µ, σ > 0 are constants.

(a) Provide the corresponding backward Kolmogorov PDE and obtain the transition CDF and PDF.
(b) Solve the PDE
$\frac{\partial f}{\partial t} + \frac{1}{2} σ^{2} x^{2} \frac{\partial^{2} f}{\partial x^{2}} + μ x \frac{\partial f}{\partial x} = 0,$

for x > 0, t ≤ T, subject to f(T, x) = ϕ(x) where ϕ is an arbitrary function. Give the explicit solution for ϕ(x) = x?{x>a}, with constant a > 0.

Solution.

(a) The drift and diffusion coefficient functions are time independent linear functions: µ(t, x) = µx and σ(t, x) = σx. According to (11.43), the generator Gt, x = Gx is the differential operator
$G_{x} ≔ \frac{1}{2} σ^{2} x^{2} \frac{\partial^{2}}{\partial x^{2}} + μ x \frac{\partial}{\partial x} .$

The transition CDF P = P(t, T; x, y) hence solves the PDE in (11.51):

$\frac{\partial P}{\partial t} + \frac{1}{2} σ^{2} x^{2} \frac{\partial^{2} P}{\partial x^{2}} + μ x \frac{\partial P}{\partial x} = 0,$

for all t < T, x, y > 0 with P(T, T; x, y) = ?{x≤y}. The transition CDF is given by the conditional expectation:

$P (t, T; x, y) = ℙ (S (T) \leq y | S (t) = x) = E [I_{{S (T) \leq y}} | S (t) = x] .$

We have already computed this in the previous chapter by substituting the strong solution for GBM in the form

$S (T) = S (t) e^{(μ - \frac{1}{2} σ^{2}) (T - t) + σ (W (T) - W (t))},$

giving

$\begin{array}{l} P (t, T; x, y) & = & E [I_{{\ln (S (T) / y) \leq 0}} | S (t) = x] \\ = & ℙ (\frac{W (T) - W (t)}{\sqrt{T - t}} \leq - \frac{\ln (x / y) + (μ - \frac{1}{2} σ^{2}) (T - t)}{σ \sqrt{T - t}}) \\ = & N (\frac{\ln (y / x) - (μ - \frac{1}{2} σ^{2}) (T - t)}{σ \sqrt{T - t}}) . (11.53) \end{array}$

Here we used the fact that W(T) − W(t) is independent of W(t), and hence independent of S(t), where $W (T) - W (t) \underline{\underline{d}} \sqrt{T - t} Z, Z ~$ Norm(0, 1). Differentiating the above CDF with respect to y gives the known lognormal density (see (10.27) with the drift replacement $μ \to μ - \frac{1}{2} σ^{2}) :$

$p (t, T; x, y) = \frac{1}{y σ \sqrt{2 π (T - t)}} \exp {(- \frac{[\ln (y / x) - (μ - \frac{1}{2} σ^{2}) (T - t)}{2 σ^{2} (T - t)})}^{2}, (11.54)$

for all x, y > 0, t < T, and zero otherwise. The reader can verify that the transition CDF in (11.53) has limit P(T−, T; x, y) = H(y − x).
(b) The solution f(t, x) can be obtained by the Feynman–Kac Theorem 11.7 or, alternatively, directly from (11.52). Let's solve for f(t, x) using both equivalent approaches. In the first approach, we use the above strong solution to the SDE. By (11.50), and the independence of W(T) − W(t) and S(t), we have
$\begin{array}{l} f (t, x) & = & E [ϕ (S (T)) | S (t) = x] = E [ϕ (S (t) e^{(μ - \frac{1}{2} σ^{2}) (T - t) + σ (W (T) - W (t))}) | S (t) = x] \\ = & E [ϕ (x e^{(μ - \frac{1}{2} σ 2) (T - t) + σ (W (T) - W (t))})] \\ = & E [ϕ (x e^{(μ - \frac{1}{2} σ^{2}) (T - t) + σ \sqrt{T - t Z}})] \\ = & \int_{- \infty}^{\infty} ϕ (x e^{(μ - \frac{1}{2} σ^{2}) (T - t) + σ \sqrt{T - t} z}) n (z) d z . \end{array}$

This integral, assuming it exists, represents the solution to the PDE for arbitrary function ϕ. In particular, for ϕ(y) = y?{y>a} = y?{ln(y/a)>0} we have

$\begin{array}{l} f (t, x) & = & x e^{(μ - \frac{1}{2} σ^{2}) (T - t)} E [e^{σ \sqrt{T - t} Z} I_{{\ln (x / a) + (μ - \frac{1}{2} σ^{2}) (T - t) + σ \sqrt{T - t} Z > 0}}] \\ = & x e^{(μ - \frac{1}{2} σ^{2}) (T - t)} E [e^{σ \sqrt{T - t} Z} I_{{Z > A}}] \end{array}$

with constant $A \equiv - \frac{\ln (x / a) + (μ - \frac{1}{2} σ^{2}) (T - t)}{σ \sqrt{T - t}}$ , for all t < T. For t = T, we simply have f(T, x) = x?{x>a}. This expectation is evaluated (see identity (A.1)) using $E [e^{B Z} I_{{Z > A}}] = e^{B^{2} / 2} N (B - A)$ , with constant $B \equiv σ \sqrt{T - t}$ , giving

$\begin{array}{l} f (t, x) & = & x e^{(μ - \frac{1}{2} σ^{2}) (T - t)} \cdot e^{\frac{1}{2} σ^{2} (T - t)} N (σ \sqrt{T - t} + \frac{\ln (x / a) + (μ - \frac{1}{2} σ^{2}) (T - t)}{σ \sqrt{T - t}}) \\ = & x e^{μ (T - t)} N (\frac{\ln (x / a) + (μ + \frac{1}{2} σ^{2}) (T - t)}{σ \sqrt{T - t}}) . \end{array}$

The reader can check that this expression solves the above PDE by computing the partial derivatives $\frac{\partial f}{\partial t}, \frac{\partial^{2} f}{\partial x^{2}}, \frac{\partial f}{\partial x} .$ Moreover, in the limit t ↗ T (defining τ = T − t):

$f (T -, x) = \lim_{T ↘ 0} x e^{μ T} N (\frac{\ln (x / a) + (μ - \frac{1}{2} σ^{2}) T}{σ \sqrt{T}}) = \lim_{T ↘ 0} x N (\frac{\ln (x / a)}{σ \sqrt{T}}) = x H (x - a) .$

[Note that this equals f(T, x) = ϕ(x) = x?{x>a} for all x, except at the point of discontinuity x = a of ϕ(x), i.e., ϕ(a) = 0 and f(T−, a) = aH(0) = a/2.]

In the second approach we use (11.52) and insert the above transition PDF to obtain

$\begin{array}{l} f (t, x) & = \int_{0}^{\infty} ϕ (y) p (t, T; x, y) d y = \frac{1}{σ \sqrt{2 π (T - t)}} \int_{0}^{\infty} ϕ (y) e^{- \frac{1}{2} {[\frac{\ln (y / x) - μ (T - t)}{σ \sqrt{T - t}}]}^{2} \frac{d y}{y}} \\ (let z = \frac{\ln (y / x) - μ (T - t)}{σ \sqrt{T - t}}, y = x e^{(μ - \frac{1}{2} σ^{2}) (T - t) + σ \sqrt{T - t} z}, \frac{d y}{y} = σ \sqrt{T - t} d z) \\ = \int_{0}^{\infty} ϕ (x e^{(μ - \frac{1}{2} σ^{2}) (T - t) + σ \sqrt{T - t} z}) \frac{e^{- \frac{1}{2} z^{2}}}{\sqrt{2 π}} d z . \end{array}$

As required, this produces exactly the same solution as we have above by the first approach. Of course, we should not be surprised by this fact since, by definition, p(t, T; x, y) is the conditional density of S(T) at y, given S(t) = x, and hence $f (t, x) = E [ϕ (S (T)) | S (t) = x] = \int_{0}^{\infty} ϕ (y) p (t, T; x, y) d y$ .

In the next example we obtain the transition CDF/PDF for the GBM process with time-dependent drift and diffusion coefficients.

Example 11.12

Consider a GBM process {S(t)}t≥0 ∈ ℝ+ with SDE

$d S (t) = μ (t) S (t) d t + σ (t) S (t) d W (t), (11.55)$

where µ(t), σ(t) > 0 are continuous (ordinary) functions of time t ≥ 0. State the corresponding backward Kolmogorov PDE and obtain the transition CDF and PDF.

Solution. We have a linear SDE with coefficient functions µ(t, x) = µ(t)x and σ(t, x) = σ(t)x. The corresponding generator is the differential operator

$G_{t, x} ≔ \frac{1}{2} σ^{2} (t) x^{2} \frac{\partial^{2}}{\partial x^{2}} + μ (t) x \frac{\partial}{\partial x} .$

The transition CDF P = P(t, T; x, y) solves the PDE in (11.51):

$\frac{\partial P}{\partial t} + \frac{1}{2} σ^{2} (t) x^{2} \frac{\partial^{2} P}{\partial x^{2}} + μ (t) x \frac{\partial P}{\partial x} = 0,$

for all t < T, x, y > 0 with P(T, T; x, y) = ?{x≤y}}. By the Feynman–Kac Theorem 11.7, the transition CDF is obtained by evaluating the conditional expectation

$P (t, T; x, y) = ℙ (S (T) \leq y | S (t) = x) = ℙ (X (T) \leq \ln y | X (t) = \ln x),$

where we define the process X(t) ≔ ln S(t), t ≥ 0. The SDE (11.55) has unique strong solution given by (11.31) with α(t) ≡ γ(t) ≡ 0, β(t) ≡ µ(t), δ(t) = σ(t), i.e.,

$S (t) = S (0) e^{\int_{0}^{t} μ (s) d s - \frac{1}{2} \int_{0}^{t} σ^{2} (s) d s + \int_{0}^{t} σ (s) d W (s)},$

hence

$S (T) = S (t) e^{\int_{t}^{T} μ (s) d s - \frac{1}{2} \int_{t}^{T} σ^{2} (s) d s + \int_{t}^{T} σ (s) d W (s)}$

and

$X (T) = X (t) + \int_{t}^{T} μ (s) d s - \frac{1}{2} \int_{t}^{T} σ^{2} (s) d s + \int_{t}^{T} σ (s) d W (s) .$

It is convenient to define the time-averaged drift and volatility functions:

$\bar{μ} (t, T) ≔ \frac{1}{T - t} \int_{t}^{T} μ (s) d s, \bar{σ} (t, T) ≔ \sqrt{\frac{1}{T - t} \int_{t}^{T} σ^{2} (s) d s} . (11.56)$

Since $\int_{t}^{T} σ (s) d W (s) \underline{\underline{d}} W ({\bar{σ}}^{2} (t, T) (T - t)) \underline{\underline{d}} \bar{σ} (t, T) \sqrt{T - t} Z, Z ~$ Norm(0, 1),

$X {(T)}_{=}^{d} X (t) + [\bar{μ} (t, T) - \frac{1}{2} {\bar{σ}}^{2} (t, T)] (T - t) + \bar{σ} (t, T) \sqrt{T - t} Z,$

where X(T) − X(t), and hence Z, is independent of X(t). Combining these facts into the above gives:

$\begin{array}{l} P (t, T; x, y) & = & ℙ (X (T) - X (t) \leq \ln y - X (t) | X (t) = \ln x) = ℙ (X (T) - X (t) \leq \ln (y / x)) \\ = & ℙ ([\bar{μ} (t, T) - \frac{1}{2} {\bar{σ}}^{2} (t, T)] (T - t) + \bar{σ} (t, T) \sqrt{T - t} Z \leq \ln (y / x)) \\ = & N (\frac{\ln (y / x) - [\bar{μ} (t, T) - \frac{1}{2} {\bar{σ}}^{2} (t, T)] (T - t)}{\bar{σ} (t, T) \sqrt{T - t}}) . (11.57) \end{array}$

We note that this is the form of the transition CDF for standard GBM in Example 11.11, wherein the drift and volatility coefficients in (11.53) are now replaced by the time-averaged ones: $μ \to \bar{μ} (t, T)$ and $σ \to \bar{σ} (t, T)$ . Observe, however, that the CDF in (11.57) is not a function of only T − t, i.e., the GBM process with time-dependent coefficients is an example of a time-inhomogeneous process (i.e., not time-homogeneous as in Example 11.11). Differentiating (11.53) with respect to y gives the lognormal density (analogous to 11.54):

$p (t, T; x, y) = \frac{1}{y \bar{σ} \sqrt{2 π (T - t)}} \exp (- \frac{{[\ln (y / x) - (\bar{μ} - \frac{1}{2} {\bar{σ}}^{2}) (T - t)]}^{2}}{2 {\bar{σ}}^{2} (T - t)}), (11.58)$

for all x, y > 0, t < T, and zero otherwise, where $\bar{μ} \equiv \bar{μ} (t, T), \bar{σ} \equiv \bar{σ} (t, T)$ .

Example 11.13

Consider the process {X(t)}t≥0 ∈ ℝ in Example 11.8, i.e.,

$d X (t) = (α - β X (t)) d t + σ d W (t) . (11.59)$

For α = 0, this process is specifically called the Ornstein–Uhlenbeck process or OU process for short. Derive the corresponding transition CDF and PDF.

Solution. From the analysis in Example 11.8, we have the strong solution for X(T) in terms of X(t) = x, which we can write in equivalent forms using time-changed BM:

$\begin{array}{l} X (T) & = & e^{- β (T - t)} x + \frac{α}{β} (1 - e^{- β (T - t)}) + σ \int_{t}^{T} e^{- β (T - s)} d W (s) \\ \overset{d}{=} & e^{- β (T - t)} x + \frac{α}{β} (1 - e^{- β (T - t)}) + σ W (\frac{1 - e^{- 2 β (T - t)}}{2 β}) \\ \overset{d}{=} & e^{- β (T - t)} x + \frac{α}{β} (1 - e^{- β (T - t)}) + σ e^{- β (T - t)} W (\frac{e^{2 β (T - t)} - 1}{2 β}) \\ \overset{d}{=} & e^{- β (T - t)} x + \frac{α}{β} (1 - e^{- β (T - t)}) + σ \sqrt{\frac{1 - e^{- 2 β (T - t)}}{2 β}} Z . \end{array}$

The last line displays X(T) as a normal random variable, where Z ~ Norm(0, 1) and independent of X(t). Hence, the transition CDF is a normal CDF:

$P (t, T; x, y) = ℙ (X (T) \leq y | X (t) = x) = N (\frac{y - [e^{- β (T - t)} x + \frac{α}{β} (1 - e^{- β (T - t)})]}{σ \sqrt{(1 - e^{- 2 β (T - t)}) / 2 β}}), (11.60)$

and the transition PDF is the Gaussian function

$p (t, T; x, y) = \frac{1}{σ} \sqrt{\frac{2 β}{1 - e^{- 2 β (T - t)}}} n (\frac{y - [e^{} x + \frac{α}{β} (1 - e^{- β (T - t)})]}{σ \sqrt{(1 - e^{- 2 β (T - t)}) / 2 β}}), (11.61)$

for all x, y ∈ ℝ, t < T.

Note that a transition PDF p (or CDF P) solving a given Kolmogorov PDE as in (11.51), subject to p(T−, T; x, y) = δ(x − y), is in general cases not necessarily a unique solution. This is the case even if we require p (or P) to be a PDF (or CDF). If a diffusion has one or both of its endpoints (left or right endpoint) as a regular boundary, then the behaviour of the process at the endpoint can be specified differently. An example of this is the specification of a regular reflecting boundary versus a regular killing (absorbing) boundary as in the case of Brownian motion (BM) that is either reflected or killed at an upper or lower finite boundary point. The known transition PDFs for both respective cases are of course different, yet both solve the same Kolmogorov PDE for Brownian motion and have limit p(T−, T; x, y) = δ(x − y). The key point is that the Kolmogorov PDE and terminal time condition make no mention of the boundary conditions imposed on the solution as a function of the spatial variable x. To obtain a unique fundamental solution that corresponds to a transition PDF (assuming of course that such a solution exists) one generally needs to also specify the spatial boundary conditions at both endpoints of the process. In some cases, such as for BM or drifted BM on ℝ, both endpoints ±∞ of the process are natural boundaries (not regular) and there is then a unique transition PDF on ℝ, i.e., the Gaussian PDF we have already derived. Similarly, for GBM the two endpoints of the state space (0, ∞) are natural boundaries and hence the process has a unique transition PDF on ℝ+, i.e., the known lognormal PDF.

The following result extends Theorem 11.7 and, as we shall see in later chapters, is used for pricing (single-asset) financial derivatives via a PDE based approach.

Theorem 11.9

(“Discounted” Feynman–Kac). Fix T > 0 and let {X(t)}t≥0 satisfy the SDE in (11.24). Let the same assumptions stated in Theorem (11.7) hold and assume r(t,x): [0, T] × ℝ → ℝ is a lower-bounded continuous function. Then, the function defined by the conditional expectation

$f (t, x) ≔ E_{t, x} [e^{- \int_{t}^{T} r (u, X (u)) d u} ϕ (X (T))] \equiv E [e^{-} ϕ (X (T)) | X (t) = x] (11.62)$

solves the PDE $\frac{\partial f}{\partial t} + G_{t, x} f - r (t, x) f = 0, i . e .,$

$\frac{\partial f}{\partial t} (t, x) + \frac{1}{2} σ^{2} (t, x) \frac{\partial^{2} f}{\partial x^{2}} (t, x) + μ (t, x) \frac{\partial f}{\partial x} (t, x) - r (t, x) f (t, x) = 0, (11.63)$

for all x, 0 < t < T, subject to the terminal condition f(T,x) = ϕ(x).

Proof. This result follows by first rewriting the exponential factor as

$e^{- \int_{t}^{T} r (u, X (u)) d u} = e^{- \int_{0}^{T} r (u, X (u)) d u} \cdot e^{\int_{0}^{t} r (u, X (u)) d u} .$

The process defined by $g_{t} ≔ e^{- \int_{0}^{t} r (u, X (u)) d u} f (t, X (t))$ is a martingale since gt = E[gT | Ft], where $g_{T} = e^{- \int_{0}^{T} r (u, X (u)) d u} f (T, X (T)) = e^{- \int_{0}^{T} r (u, X (u)) d u} ϕ (X (T)) :$

$\begin{array}{l} g_{t} = e^{- \int_{0}^{t} r (u, X (u)) d u} f (t, X (t)) = E_{t, X (t)} [e^{- \int_{0}^{T} r (u, X (u)) d u} ϕ (X (T))] \\ = E [e^{- \int_{0}^{T} r (u, X (u)) d u} ϕ (X (T)) | ℱ_{t}] . \end{array}$

Note that gT is FT-measurable and assumed integrable. The last step consists of computing the stochastic differential of gt via the Itô product formula. To do so, define I(t)≔ $\int_{0}^{t} r (u, X (u))$ du giving dI(t) = r(t, X(t)) dt, (dI(t))2 ≡ 0 and

$d [e^{- \int_{0}^{t} r (u, X (u)) d u}] = {de}^{- I (t)} = - e^{- I (t)} d I (t) = - e^{- \int_{0}^{t} r (u, X (u)) d u} r (t, X (t)) d t .$

Hence, using this and (11.44) within the Itô product formula gives

$\begin{array}{l} d g_{t} & = & e^{- I (t)} \cdot d f (t, X (t)) + f (t, X (t)) \cdot {de}^{- I (t)} + {de}^{- I (t)} \cdot d f (t, X (t)) \\ = & e^{- I (t)} \cdot d f (t, X (t)) + f (t, X (t)) \cdot {de}^{- I (t)} \\ = & e^{- I (t)} [d f (t, X (t)) - f (t, X (t) d I (t)] \\ = & e^{- I (t)} [(\frac{\partial}{\partial t} f (t, X (t)) + G_{t, x} f (t, X (t)) - r (t, X (t)) f (t, X (t))) d t \\ + σ (t, X (t)) \frac{\partial f}{\partial x} (t, X (t)) d W (t)] . \end{array}$

By the martingale condition, the drift coefficient (i.e., the expression multiplying dt) must vanish for all values X(t) = x and time t; namely, $(\frac{\partial}{\partial t} + G_{t, x} - r (t, x)) f (t, x)$ = 0. This is precisely the PDE in (11.63). Finally, the terminal condition follows trivially from (11.62) for $t = T : f (T, x) = E [e^{- \int_{T}^{T} r (u, X (u)) d u} ϕ (X (T)) | X (T) = x] = E [ϕ (X (T)) | X (T) = x] = ϕ (x)$ .

An important special case is when r(t,x) = r is a constant. Then, the function defined by the conditional expectation, $f (t, x) : = e^{- r (T - t)} E_{t, x} [ϕ (X (T))]$ , solves

$\frac{\partial f}{\partial t} (t, x) + \frac{1}{2} σ^{2} (t, x) \frac{\partial^{2} f}{\partial x^{2}} (t, x) + μ (t, x) \frac{\partial f}{\partial x} (t, x) - r f (t, x) = 0, (11.64)$

with terminal condition f(T,x) = ϕ(x).

11.7.1 Forward Kolmogorov PDE

Proposition 11.8 states that a transition PDF solves the Kolmogorov PDE (11.51) in the backward variables (t,x). We can also define the differential operator $\tilde{G} \equiv {\tilde{G}}_{T, y}$ acting on the forward variables (T,y):

$\tilde{G} f (T, y) ≔ \frac{1}{2} \frac{\partial^{2}}{\partial y^{2}} (σ^{2} (T, y) f (T, y)) - \frac{\partial}{\partial y} (μ (T, y) f (T, y)) . (11.65)$

This is also referred to as the differential adjoint to the generator $G$ . It can be shown that under fairly general conditions the transition PDF p = p(t,T; x,y) (considered as a function of T, y, for any fixed t, x) satisfies the so-called forward Kolmogorov or Fokker–Planck PDE

$\frac{\partial p}{\partial T} = \tilde{G} p, (11.66)$

with limT↓t p(t,T;x,y) = δ(y − x). The name forward derives from the fact that y refers to the value of the process at future time T > t.

The formal proof of (11.66), and under what conditions it holds true, requires a rather technical discussion that is beyond our scope. However, it is instructive to see how (11.66) arises from the backward PDE. Let the interval ℐ denote the state space of process X. For example, ℐ = ℝ for standard BM, ℐ = ℝ+ for GBM, ℐ = (L, ∞) for GBM killed at a lower level L > 0, etc. In our heuristic justification of (11.66) we shall now make use of the Chapman–Kolmogorov relation:

$p (t, T; x, y) = \int_{ℐ} p (t, t'; x, x') p (t', T; x', y) d x' (11.67)$

for any $t < t' < T, x, y \in ℐ$ . Equation (11.67) is an important general property that follows from the Markov property. To derive this relation, consider the joint PDF of the triplet (X(T),X(t'),X(t)) and applying conditioning gives

$f_{X (T), X (t'), X (t)} (y, x', x) = f_{X (t)} (x) \cdot f_{X (t') | X (t)} (x' | x) \cdot f_{X (T) | X (t')} (y | x')$

where $f_{X (T) | X (t'), X (t)} (y | x', x) = f_{X (T) | X (t')} (y | x')$ by the Markov property. Dividing both sides by the PDF of X(t), fx(t)(x), and using the definition of the transition PDF in (10.16), gives the joint PDF of the pair X(T), X(t') conditional on X(t) = x:

$f_{X (T), X (t') | X (t)} (y, x' | x) \equiv \frac{f_{X (T), X (t'), X (t)} (y, x', x)}{f_{X (t)} (x)} = p (t, t'; x, x') p (t', T; x', y) .$

Integrating out the x' variable gives the PDF of X(T) conditional on X(t) = x:

$f_{X (T) | X (t)} (y | x) = \int_{ℐ} f_{X (T), X (t') | X (t)} (y, x' | x) d x' = \int_{ℐ} p (t, t'; x, x') p (t', T; x', y) d x' .$

By definition, fx(T)X(t)(yx) = p(t,T;x,y) and therefore we obtain (11.67).

To arrive at (11.66) we begin by differentiating both sides of (11.67) w.r.t. t′ and note that $\frac{\partial}{\partial t'} p (t, T; x, y) \equiv 0$ , giving

$\int_{ℐ} [p (t', T; x', y) \frac{\partial}{\partial t'} p (t, t'; x, x') + p (t, t'; x, x') \frac{\partial}{\partial t'} p (t', T; x', y)] d x' \equiv 0. (11.68)$

We leave the first integral term as is, but re-express the second part of the integral by using the backward PDE, $\frac{\partial}{\partial t'} p (t', T; x', y) = - G_{t', x'} p (t', T; x', y),$ to obtain

$\int_{ℐ} p (t, t'; x, x') \frac{\partial}{\partial t'} p (t', T; x', y) d x' = - \int_{ℐ} p (t, t'; x, x') G_{t', x'} p (t', T; x', y) d x' .$

The next step consists of using the differential operator $G_{t', x^{'}}$ , applying integration by parts on the above right-hand integral and assuming that contributions from the boundaries of ℐ vanish (see Exercise 11.35) to obtain

$\int_{ℐ} p (t, t'; x, x') G_{t', x'} p (t', T; x',) d x' = \int_{ℐ} p (t', T; x' y) {\tilde{G}}_{t', x'} p (t, t'; x, x') d x' . (11.69)$

This shows that $\tilde{G}$ indeed acts as the corresponding adjoint operator to $G$ . Using this relation into the second term in the integrand of (11.68) gives

$\int_{ℐ} p (t', T; x', y) [\frac{\partial}{\partial t'} p (t, t'; x, x') - {\tilde{G}}_{t' x'} p (t, t'; x, x')] d x' \equiv 0. (11.70)$

Since this integral is identically zero for arbitrary given values $t' > t, x^{'} \in ℐ$ , then (assuming a large enough family of positive transition PDFs p(t',T;x',y) as functions of x') the integrand must be zero for all $x^{'} \in ℐ$ . This implies that the term in brackets in the integrand must equal zero, i.e., for fixed backward variables t, x we have the forward Kolmogorov PDE, $\frac{\partial p}{\partial t'} = {\tilde{G}}_{t', x'} p$ , in the forward variables t',x' for an arbitrary transition PDF p = p(t,t';x,x').

11.7.2 Transition CDF/PDF for Time-Homogeneous Diffusions

In many applications, including derivative pricing, the stochastic process is assumed to be time-homogeneous. We recall the definition of a time-homogeneous process from the previous chapter, i.e., the relation in (10.17). For a time-homogeneous diffusion process, this means that the drift and diffusion coefficient functions are only functions of the "spatial variable" and are not functions of time t: μ(x,t) = μ(x) and σ(x,t) = σ(x). The generator $G_{t, x} \equiv G_{x}$ for such a process is then of the form

$G_{x} ≔ \frac{1}{2} σ^{2} (x) \frac{\partial^{2}}{\partial x^{2}} + μ (x) \frac{\partial}{\partial x} . (11.71)$

Since the transition PDF (or CDF) satisfies a time-homogeneous PDE, it is then a function of the time difference: τ ≡ T − t, i.e., we write it as p(τx,y) and the transition CDF as P(τx,y). This time dependence on τ = T − t can also be realized from the conditional expectation definition of the transition CDF. Indeed, the defining relation in (10.17) implies

$\begin{array}{l} P (t, T; x, y) \equiv ℙ (X (T) \leq y | X (t) = x) & = & ℙ (X (t + τ) \leq y | X (t) = x) \\ = & ℙ (X (τ) \leq y | X (0) = x) \\ = & P (0, τ; x, y) \equiv P (τ; x, y), \end{array}$

and $p (τ; x, y) = \frac{\partial}{\partial y} P (τ; x, y)$ Writing p(t,T;x,y) = p(τ; x,y) and using the fact that $\frac{\partial τ}{\partial T} = 1$ and $\frac{\partial τ}{\partial t} = - 1$ gives

$\frac{\partial p (t; T; x, y)}{\partial T} = \frac{\partial p (τ; x, y)}{\partial τ} and \frac{\partial p (t, T; x, y)}{\partial t} = - \frac{\partial p (τ; x, y)}{\partial τ} .$

The backward and forward Kolmogorov PDEs are then given by

$\frac{\partial p}{\partial τ} = \frac{1}{2} σ^{2} (x) \frac{\partial^{2} p}{\partial x^{2}} + μ (x) \frac{\partial p}{\partial x} (backward) (11.72)$

$\frac{\partial p}{\partial τ} = \frac{1}{2} \frac{\partial^{2}}{\partial y^{2}} (σ^{2} (y) p) - \frac{\partial}{\partial y} (μ (y) p) (forward) (11.73)$

for a transition PDF p = p(τ;x,y) and the same PDEs for the corresponding CDF P(τ;x,y). The previous terminal condition is now an initial condition where

$\lim_{τ ↘ 0} p (T, x, y) \equiv p (0 +, x, y) = δ (x - y) and P (0, x, y) = I_{{x \leq y}} . (11.74)$

Note that, by time homogeneity, the conditional expectation in (11.50) in the above Feynman–Kac Theorem gives

$E [ϕ (X (T)) | X (t) = x] = E [ϕ (X (t + τ)) | X (t) = x] = E [ϕ (X (τ)) | X (0) = x] ≔ f (τ, x) .$

That is, (11.52) now reads

$f (τ, x) = \int_{ℝ} ϕ (y) p (τ; x, y) d y, (11.75)$

where ƒ solves the backward Kolmogorov PDE

$\frac{\partial f}{\partial τ} = \frac{1}{2} σ^{2} (x) \frac{\partial^{2} f}{\partial x^{2}} + μ (x) \frac{\partial f}{\partial x} (11.76)$

with initial condition ƒ(0,x) = ϕ(x). Note: ƒ(0+,x) ≡ ƒ(0,x) for continuous ϕ(x).

Assuming a constant discount function r(t,x) = r, we observe that the discounted expectation is also a function of variables τ, x, i.e., we have the function

$ν (τ, x) = e^{- r (T - t)} E_{t, x} [ϕ (X (T))] = e^{- r τ} f (T, x) = e^{- r τ} \int_{ℝ} ϕ (y) p (τ; x, y) d y$

satisfying the PDE

$\frac{\partial ν}{\partial τ} = \frac{1}{2} σ^{2} (x) \frac{\partial^{2} ν}{\partial x^{2}} + μ (x) \frac{\partial v}{\partial x} - r ν (11.77)$

with initial condition v(0,x) = ϕ(x). This is the time-homogeneous version of (11.64).

We have already seen several specific examples of time-homogeneous processes such as standard BM, GBM in Example 11.11, and the OU process in Example 11.13. In Example 11.11, the GBM process is time homogeneous with coefficient functions μ(x) = μx and σ(x) = σx, and having respective transition CDF and PDF:

$P (τ; x, y) = N (\frac{\ln (y / x) - (μ - \frac{1}{2} σ^{2}) τ}{σ \sqrt{τ}}) (11.78)$

and

$p (τ; x, y) = \frac{1}{y σ \sqrt{2 π τ}} \exp (- {\frac{[\ln (y / x) - (μ - \frac{1}{2} σ^{2}) τ]}{2 σ^{2} τ}}^{2}), (11.79)$

x,y > 0,τ > 0. The reader can verify by direct differentiation that both functions satisfy (11.72) and (11.73) with the appropriate initial condition in (11.74). This is also the case for the time-homogeneous OU process, where setting τ = T − t in (11.60) and (11.61) gives the transition CDF and PDF that satisfy the above time-homogeneous Kolmogorov PDEs with μ(x) = α − βx, σ(x) = σ. In contrast, for nonconstant μ(t) and (or) nonconstant σ(t), the GBM process in Example 11.11 is time inhomogeneous , i.e., the transition functions in (11.57) and (11.58) cannot be written as functions of only τ = T − t in the time variables, but rather depend on both t and T, separately, via the time-averaged quantities in (11.56).

11.8 Radon–Nikodym Derivative Process and Girsanov's Theorem

Our main goal in this section is to use and build upon the basic tools and ideas developed in Section 9.5 of Chapter 9 in order to understand how to construct a certain type of probability measure change which introduces a drift in the BM. In particular, we are interested in a measure change, say $ℙ \to \hat{ℙ},$ whereby we begin with W ≔ {W(t)}t≥0 as a standard BM under measure ℙ and then define a new process, which we denote by $\hat{W} ≔ {\hat{W} (t)}_{t \geq 0,}$ such that $\hat{W}$ is a standard BM under the new measure $\hat{ℙ}$ . We will see that the measure change from $ℙ \to \hat{ℙ}$ is constructed by using a positive random variable that is an exponential ℙ-martingale and that there is a precise relationship between the two Brownian motions W and $\hat{W}$ which will differ only by a drift component. This is the essence of Girsanov's Theorem, whose statement and proof are given later. The change of measure has many useful applications and will also allow us to compute conditional expectations of processes or random variables that are functionals of Brownian motion under two different probability measures ℙ and $\hat{ℙ}$ . These two measures will be equivalent in the sense that (as we recall from our previous discussion on equivalent probability measures) all events having zero probability under one measure also have zero probability under the other measure.

Let us begin by fixing a filtered probability space (Ω, ℱ, ℙ, ?), where ? = {Ft}t≥0 is any filtration for standard Brownian motion and recall Definition 10.1 of a (ℙ,Fopf;)-BM. This is shorthand for a standard Brownian motion W≔ {W(t)}t≥0 w.r.t. a given filtration ? and a measure ℙ. That is, W has (a.s.) continuous paths started at W(0) = 0 and, under given measure ℙ, it has normally distributed increments W(t) − W(s) ~ Norm(0,t − s) that are independent of ℱs, for all 0 ≤ s < t. From these original defining properties we then showed that W has quadratic variation [W,W](t) = t and that it is a (ℙ, ?)-martingale (shorthand for a martingale w.r.t. filtration ? and measure ℙ). Now, let's assume that we have a process that we know is a continuous martingale started at zero and with the same quadratic variation formula as standard Brownian motion. The question is whether or not this process is a standard Brownian motion. It turns out that the answer is yes, the process is a standard Brownian motion as stated in the following theorem, originally due to Lévy. In what follows, this will give us a useful way to recognize when a martingale process is in fact a standard Brownian motion. The characterization makes no assumption of the normality and independence of increments! Rather, these properties are implied. Besides the martingale property, the requirement of continuity of all paths and the fact that they must start at zero, the recognition that we have a BM follows from the assumption of the above quadratic variation formula.

[Technical Remark: We note that the proof of Theorem 11.10 below makes use of a general version of the Itô formula in (11.18). Although we do not prove it, it turns out that we have the same Itô formula as in (11.18) if W is replaced by a continuous martingale process M ≔ {M(t)}t≥0, that starts at zero and has quadratic variation [M,M](t) = t, i.e., dM(t)dM(t) = d[M,M](t) = dt:

$d f (t, M (t)) = (f_{t} (t, M (t)) + \frac{1}{2} f_{x x} (t, M (t))) d t + f_{x} (t, M (t)) d M (t) . (11.80)$

Essentially one can think of this as the Itô formula in (11.20) where X ≡ M with zero drift

μ ≡ 0 and unit diffusion function σ ≡ 1. The integral form of (11.80) is (see (11.19))

$f (t, M (t)) = f (0, M (0)) + \int_{0}^{t} (f_{u} (u, M (u)) + \frac{1}{2} f_{x x} (u, M (u))) d u + \int_{0}^{t} f_{x} (u, M (u)) d M (u) . (11.81)$

Assuming the usual square integrability condition as we did for any Itô integral w.r.t. BM, the above stochastic integral w.r.t. the increment dM(u) is defined in a similar fashion and is a martingale having zero expected value. Note that if ƒ(t,x) is a C1,2 function satisfying the PDE $f_{t} (t, x) + \frac{1}{2} f_{x x} (t, x) = 0,$ then the process defined by Y(t) ≔ ƒ(t,M(t)) is a martingale.]

Theorem 11.10

(Lévy’s characterization of standard BM). Let the process {M(t)}t≥0 be a continuous (ℙ,?)-martingale started at M(0) = 0 (a.s.) and with quadratic variation [M,M](t) = t for all t ≤ 0. Then, {M(t)}t≥0 is a standard (ℙ,?)-BM.

Proof. Since we have already assumed that M has continuous paths all starting at zero, from the definition of a (ℙ,?)-BM we have left to show that M(t) − M(s) ~ Norm(0, t − s) and that these increments are independent of ℱs for all 0 ≤ s < t. For this purpose, consider the function $f (t, x) = e^{- \frac{1}{2} θ^{2} t + θ x}$ with arbitrary real parameter θ. Since ƒ(t,x) satisfies the PDE $f_{t} (t, x) + \frac{1}{2} f_{x x} (t, x) = 0,$ from the above discussion we have that the process $f (t, M (t)) = e^{- \frac{1}{2} θ^{2} t + θ M (t)}, t \geq 0$ , is a martingale. In fact, we recognize this as an example of an exponential martingale. Taking the expectation of the process at time t, conditional on filtration ℱs, s ≤ t, and using the martingale property gives the conditional moment-generating function (m.g.f.) of M(t) − M(s) as a function of θ:

$E [e^{- \frac{1}{2} θ^{2} t + θ M (t)} | ℱ_{s}] = e^{- \frac{1}{2} θ^{2} s + θ M (s)} \Rightarrow E [e^{θ (M (t) - M (s))} | ℱ_{s}] = e^{\frac{1}{2} θ^{2} (t - s)} .$

This is equivalent to the m.g.f. of M(t) − M(s), which is the m.g.f. of a Norm(0,t − s) random variable (by the tower property):

$E [e^{θ (M (t) - M (s))}] = E [E [e^{θ (M (t) - M (s))} | ℱ_{s}]] = e^{\frac{1}{2} θ^{2} (t - s)} .$

Hence, as function of θ, the m.g.f. and the m.g.f. conditional on ℱs are the same and correspond to that of a Norm(0, t − s) random variable, i.e., M(t) − M(s) ~ Norm(0,t − s) and M(t) − M(s) is independent of ℱs, for all s ≤ t.

[Remark: In what follows, we will only distinguish between different probability measures while fixing a filtration ? for BM. Hence, we shall also write ℙ-BM to mean standard (ℙ,?)-BM, i.e., standard BM w.r.t. filtration ? and under measure ℙ. Equivalently, we shall also say that W is a BM under the measure ℙ. We shall also sometimes loosely say BM (or Brownian motion) where we clearly really mean standard BM. Also, we simply say ℙ-martingale to mean a (ℙ,?)-martingale and $\hat{ℙ}$ -martingale to mean a $(\hat{ℙ}, F)$ -martingale when ? is fixed.]

In what follows we let the probability measure $\hat{ℙ} \equiv {\hat{ℙ}}^{(ϱ)}$ be defined by (9.109) of Section 9.5, i.e., $\hat{ℙ} (A) ≔ \int_{A} ϱ (ω) d ℙ (ω), A \in ℱ,$ with Radon–Nikodym random variable $ϱ \equiv \frac{d \hat{ℙ}}{d ℙ}$ assumed positive (almost surely) with unit expectation under measure ℙ, E[ϱ] = 1. We recall how ϱ is used in (9.110) and (9.111) for computing the expectation of any integrable random variable under measures $\hat{ℙ}$ and ℙ, respectively. Shortly we shall explicitly specify this random variable and, in fact, its precise specification is a key ingredient in Girsanov’s Theorem. However, for the moment we can keep our assumptions on ϱ as is (which are as general as possible). In preparation for our main result, we will need to define and discuss some basic properties of a so-called Radon–Nikodym derivative process of $\hat{ℙ}$ w.r.t. ℙ. In a previous chapter on discrete-time financial models, we defined a similar process but in a discrete-time stochastic setting. In continuous time we shall fix some terminal time T > 0 and define the Radon–Nikodym derivative process {ϱt}0≤t≤T (of measure $\hat{ℙ} \equiv {\hat{ℙ}}^{(ϱ)}$ w.r.t. measure ℙ for a given filtration ?) by

$ϱ t ≔ E [ϱ | ℱ_{t}], 0 \leq t \leq T . (11.82)$

We remark that it is customary to also use the following more explicit equivalent notations for the random variable ϱt:

$ϱ t \equiv {(\frac{d \hat{ℙ}}{d ℙ})}_{t} \overset{or}{\equiv} {(\frac{d {\hat{ℙ}}^{(ϱ)}}{d ℙ})}_{t} \overset{or}{\equiv} {(\frac{d {\hat{ℙ}}^{(ϱ)}}{d ℙ})}_{ℱ t} .$

Hence (11.82) is also written as ${(\frac{d \hat{ℙ}}{d ℙ})}_{t} ≔ E [\frac{d \hat{ℙ}}{d ℙ} | ℱ_{t}]$ . These notations really spell out the definition in (11.82) and also visually remind us of the "direction of the measure change," e.g., $ℙ \to \hat{ℙ}$ . In what follows we shall try to keep our notation less cumbersome as long as there is no ambiguity.

Clearly ϱt is ℱt-measurable and integrable, E[|ϱt|] ≤ E[|ϱ|] = E[ϱ] = 1 < ∞, for all $t \in [0, T]$ . By the tower property and the definition in (11.82), we immediately we see that the process {ϱt}0≤t≤T is a ℙ-martingale (recall the Doob-Léevy martingale):

$E [ϱ_{t} | ℱ_{s}] = E [E [ϱ | ℱ_{t}] | ℱ_{s}] = E [ϱ | ℱ_{s}] = ϱ_{s}, 0 \leq s \leq t \leq T .$

By definition, the process also starts with unit value: ϱ0 = E[ϱ| ℱ0] ≡ E[ϱ] = 1. Hence, by the martingale property, the process has unit expectation, E[ϱt] = ϱ0 = 1, for all $t \in [0, T]$ .

The next proposition gives a useful formula for computing the $\hat{ℙ}$ -measure expectation of an ℱt-measurable random variable X, conditional on information up to a time s prior to time t, as a ℙ-measure conditional expectation of X · (ϱt/ϱs). The ratio ϱt/ϱs of the Radon–Nikodym derivative process at times s and t adjusts for the change of measure in the conditional expectation.

Proposition 11.11.

Let $\hat{ℙ}$ be defined by $\hat{ℙ} (A) ≔ \int_{A} ϱ (ω) d ℙ (ω), A \in ℱ,$ with process ϱt ≔ E[ϱ| ℱt], 0 ≤ t ≤ T. Assume the random variable X is integrable w.r.t. $\hat{ℙ}$ and ℱt-measurable for a given time $t \in [0, T]$ . Then, for all 0 ≤ s ≤ t,

$\hat{E} [X | ℱ_{s}] = ϱ_{s} E [ϱ_{t} X | ℱ_{s}] . (11.83)$

Proof. This result follows as a simple application of Theorem 9.7 where we set $G \equiv ℱ_{s}$ , and $ℱ_{s} \subset ℱ_{t} \subset ℱ$ implies $G \subset ℱ$ . Then, upon using the definition in (11.82) for time s, the formula in (9.113) gives

$\hat{E} [X | ℱ_{s}] = \frac{E [ϱ X | ℱ_{s}]}{E [ϱ | ℱ_{s}]} = ϱ_{s} E [ϱ X | ℱ_{s}] .$

The last expectation on the right is now recast by reversing the tower property, by conditioning on ℱt, and using the fact that X is Ft-measurable (so it is pulled out of the inner expectation conditional on ℱt below):

$E [ϱ X | ℱ_{s}] = E [E [ϱ X | ℱ_{t}] | ℱ_{s}] = E [X E [ϱ | ℱ_{t}] | ℱ_{s}] = E [X ϱ_{t} | ℱ_{s}] .$

In the last step we used the definition E[ϱ | ℱt] = ϱt.

Note that a special case of (11.83) is when s = 0. Since $ϱ_{0} = 1, \hat{E} [X | ℱ_{0}] = \hat{E} [X]$ and E[ϱt X | ℱ0] = E[ϱt X], we have

$\hat{E} [X] = E [ϱ_{t} X] for ℱ_{t} -measurable X . (11.84)$

Consider a continuous-time stochastic process {X(t)}t≥0 adapted to the filtration ?. Since X(t) is ℱt-measurable for every t ≥ 0, we may put X = X(t) in (11.83) to obtain

$\hat{E} [X (t) | ℱ_{s}] = ϱ_{s} E [ϱ_{t} X (t) | ℱ_{s}], 0 \leq s \leq t \leq T . (11.85)$

As a consequence of this property we have the following result.

Proposition 11.12.

A continuous-time adapted stochastic process {M(t)}0≤t≤T is a $\hat{ℙ}$ -martingale if and only if {ϱtM(t)}0≤t≤T is a $\hat{ℙ}$ -martingale.

Proof. Assume {M(t)}0≤t≤T is a $\hat{ℙ}$ -martingale. Then, using (11.85) with X(t) ≡ M(t),

$M (s) = \hat{E} [M (t) | ℱ_{s}] = ϱ_{s} E [ϱ_{t} M (t) | ℱ_{s}] \Rightarrow ϱ_{s} M (s) = E [ϱ_{t} M (t) | ℱ_{s}]$

for 0 ≤ s ≤ t ≤ T, where the last relation is the ℙ-martingale property of {ϱtM(t)}0≤t≤T. The converse follows since all the above steps may be reversed. Moreover, {M(t)}0≤t≤T is adapted to ? and integrable w.r.t. $\hat{ℙ}$ if and only if {ϱtM(t)}0≤t≤T is adapted to ? and integrable w.r.t. ℙ.

We are now finally ready to state and prove Girsanov’s Theorem for the case of standard Brownian motion.

Theorem 11.13

(Girsanov’s Theorem for BM). Let {W(t)}0≤t≤T be a standard ℙ-BM w.r.t. a filtration ? = {Ft}0≤t≤T and assume the process {γ(t)}0≤t≤T is adapted to ?, for a given T > 0. Define

$ϱ_{t} ≔ \exp (- \frac{1}{2} \int_{0}^{t} γ^{2} (s) d s + \int_{0}^{t} γ (s) d W (s)), 0 \leq t \leq T, (11.86)$

and the probability measure $\hat{ℙ} \equiv {\hat{ℙ}}^{(ϱ)}$ by the Radon–Nikodym derivative $\frac{d \hat{ℙ}}{d ℙ} = {(\frac{d \hat{ℙ}}{d ℙ})}_{T} \equiv ϱ_{T}$ . Furthermore, assume the square-integrability condition holds:

$E [\int_{0}^{T} ϱ_{s} γ^{2} (s) d s] < \infty . (11.87)$

Then, the process ${\hat{W} (t)}_{0 \leq t \leq T}$ defined by

$\hat{W} (t) ≔ W (t) - \int_{0}^{t} γ (s) d s (11.88)$

is a standard $\hat{ℙ}$ -BM w.r.t. filtration ?.

Some clarifying remarks on Theorem 11.13 before its proof:

The condition in (11.87) is required to ensure that {ϱt}0≤t≤T is a ℙ-martingale with E[ϱt] = 1, i.e., this corresponds to the Itô process $\int_{0}^{t} ϱ_{s} γ (s) d W (s), 0 \leq t \leq T,$ being a martingale. An equivalent and more practically verified condition that guarantees the process {ϱt}0≤t≤T is a ℙ-martingale is the so-called Novikov condition:
$E [exp (\frac{1}{2} \int_{0}^{T} γ^{2} (s) d s)] < \infty . (11.89)$
The differential increments of the two Brownian motions are simply related: $d W (t) = d \hat{W} (t) + γ (t) d t and d \hat{W} (t) = d W (t) - γ (t) d t .$
Pay attention to the consistent and correct use of the ± signs. In this regard, we note that the Radon–Nikodym derivative random variable in (11.86) can equivalently be written as
$ϱ_{t} = \exp (- \frac{1}{2} \int_{0}^{t} θ^{2} (s) d s - \int_{0}^{t} θ (s) d W (s)) .$

Note the − sign instead of the + sign in front of the Itô integral. Then, (11.88) is replaced by $\hat{W} (t) ≔ W (t) + \int_{0}^{t} θ (s) d s, i .e ., d \hat{W} (t) = d W (t) + θ (t) d t .$ This is obtained simply by setting γ(t) = −θ(t) in the original definition where γ2(t) = θ2(t).
In general, γ is an adapted process so that ϱt is a functional of BM from time 0 to t. In particular, the Radon–Nikodym derivative process has the form of an exponential ℙ-martingale in the process γ w.r.t. the ℙ-BM, i.e., by the definition in (11.28) we have $ϱ_{t} \equiv ϱ_{t}^{(γ)} = ε_{t} (γ \cdot W)$ . Dividing the process value at any two times 0 ≤ s < t ≤ T gives
$\frac{ϱ_{t}}{ϱ_{s}} \equiv \frac{(\frac{d \hat{ℙ}}{d ℙ}) t}{(\frac{d \hat{ℙ}}{d ℙ}) s} = \frac{ℰ_{t} (γ \cdot W)}{ℰ_{s} (γ \cdot W)} = \exp (- \frac{1}{2} \int_{s}^{t} γ^{2} (u) d u + \int_{s}^{t} γ (u) d W (u)) .$
Note that ? is any filtration for BM. It can, but need not be, the natural filtration ?W generated by W.
In the simplest case we can choose a constant process, γ(t) = γ = constant, where
$ϱ_{t} \equiv ϱ_{t} = e^{- \frac{1}{2} γ^{2} t + γ W (t)} (11.90)$

and $\hat{W} (t) \equiv {\hat{W}}^{(γ)} (t) ≔ W (t) - γ t, 0 \leq t \leq T, is a \hat{ℙ}$ -BM.

Proof. First let us verify that {ϱt}0≤t≤T is a Radon–Nikodym derivative process. By the assumption in (11.87) (or the Novikov condition) we have that {ϱt}0≤t≤T is a ℙ-martingale; in fact it is an exponential ℙ-martingale. This can be seen by applying Itô's formula to the stochastic exponential in (11.86), giving

$d ϱ_{t} = ϱ_{t} γ (t) d W (t) \Rightarrow ϱ_{t} = ϱ_{0} + \int_{0}^{t} ϱ_{s} γ (s) d W (s)$

where the Itô integral is a martingale (under measure ℙ) by the condition in (11.87). Because of the ℙ-martingale property, E[ϱt] = ϱ0 = e0 = 1, 0 ≤ t ≤ T. In particular, E[ϱT] = 1 and ϱT is also nonnegative. Hence, $ϱ \equiv \frac{d \hat{ℙ}}{d ℙ} = ϱ_{T}$ is a proper Radon-Nikodym derivative and by the ℙ-martingale property the process in (11.86) satisfies the definition in (11.82), i.e., it is indeed a Radon–Nikodym derivative process.

We now show that the process $\hat{W}$ defined by (11.88) is a standard $\hat{ℙ}$ -BM by verifying all the defining properties in Theorem 11.10 with measure $\hat{ℙ}$ (filtration ? fixed):

(i) The process starts at zero, $\hat{W} (0) = W (0) = 0,$ and is continuous in time since $\hat{W} (t) ≔ W (t) - \int_{0}^{t} γ (s) d s$ where W(t) and the integral $\int_{0}^{t} γ (s)$ ds are both continuous in t ≥ 0.
(ii) $d [\hat{W}, \hat{W}] (t) = d \hat{W} (t) d \hat{W} (t) = (d W (t) - γ (t) d t) (d W (t) - γ (t) d t) = d W (t) d W (t) = d t,$ i.e., the process has quadratic variation $[\hat{W}, \hat{W}] (t) = t .$
(iii) { ${\hat{W} (t)}_{0 \leq t \leq T}$ is a $\hat{ℙ}$ -martingale. By Proposition 11.12, this follows if we can show that the process ${\hat{W} (t)}_{0 \leq t \leq T}$ is a ℙ-martingale. To show the latter, we compute the stochastic differential by Itô's product rule (using dϱt = ϱtγ(t) dW(t) and $d \hat{W} (t) = d W (t) - γ (t) d t$ and setting dW(t) dW(t) = dt, dW(t) dt = 0):
$\begin{array}{l} d (ϱ_{t} \hat{W} (t)) & = & ϱ_{t} d \hat{W} (t) + \hat{W} (t) d ϱ_{t} + d ϱ_{t} d \hat{W} (t) \\ = & ϱ_{t} [d W (t) - γ (t) d t] + ϱ_{t} γ (t) \hat{W} (t) d W (t) + ϱ_{t} γ (t) d W (t) [d W (t) - γ (t) d t] \\ = & ϱ_{t} [1 + γ (t) \hat{W} (t)] d W (t) . \end{array}$

This is a stochastic differential with a zero drift term (i.e., the coefficient in dt is zero). In integral form, where $\hat{W} (0) = 0,$ we have

$ϱ_{t} \hat{W} (t) = \int_{0}^{t} ϱ_{s} [1 + γ (s) \hat{W} (s)] d W (s), 0 \leq t \leq T .$

By the assumed boundedness of $\int_{0}^{t} γ (s) d s,$ and the fact that the BM W(t) is bounded (a.s.), $\hat{W} (t)$ is bounded (a.s.) for all 0 ≤ t ≤ T. Combining this fact with the square-integrability condition (11.87), it follows that the above Itô integral is defined as it satisfies the square-integrability condition, $E [\int_{0}^{T} ϱ_{s}^{2} {[1 + γ (s) \hat{W} (s)]}^{2} d s] < \infty,$ and is hence a ℙ-martingale, i.e., ${\hat{W} (t)}_{0 \leq t \leq T}$ is a ℙ-martingale.

11.8.1 Some Applications of Girsanov’s Theorem

Let’s begin by considering a simple example of how Girsanov’s Theorem can be applied to change probability measures so as to eliminate the drift in a drifted Brownian process.

Example 11.14

Let X(t) ≡ W(μ,σ) (t) be a drifted BM process (recall (10.23))

$X (t) ≔ μ t + σ W (t),$

where {W(t)}t≤0 is a standard ℙ-BM. Find a measure under which {X(t)}0≤t≤T, for any T > 0, is a scaled BM with zero drift.

Solution. We note that the drift μ and volatility parameter σ > 0 are constants. Hence, by using Girsanov’s Theorem we define a measure $\hat{ℙ}, \frac{d \hat{ℙ}}{d ℙ} = ϱ_{T}$ , where ϱt is given by (11.90). Now, $\hat{W} (t) ≔ W (t) - γ t$ is a standard $\hat{ℙ}$ -BM and writing X(t) in terms of $\hat{W} (t)$ gives

$X (t) = μ t + σ W (t) = μ t + σ (\hat{W} (t) + γ t) = (μ + σ γ) t + σ \hat{W} (t) .$

So the drift coefficient of X(t) is now μ + σγ, while the volatility parameter multiplying the standard $\hat{ℙ}$ -BM is still σ. Note that we can also see this in stochastic differential form:

$d X (t) = μ d t + σ d W (t) = μ d t + σ (d \hat{W} (t) + γ d t) = (μ + σ γ) d t + σ d \hat{W} (t) .$

Hence, choosing γ = −μ/σ gives zero drift, μ + σγ = 0, and the process $X (t) = σ \hat{W} (t)$ is a zero-drift scaled BM under measure $\hat{ℙ}$ . The measure change $ℙ \to \hat{ℙ}, \frac{d \hat{ℙ}}{d ℙ} = ϱ_{T},$ is defined explicitly by the Radon–Nikodym derivative process

$ϱ_{t} \equiv {(\frac{d \hat{ℙ}}{d ℙ})}_{t} = \exp (- \frac{μ^{2} t}{2 σ^{2}} - \frac{μ}{σ} W (t)), 0 \leq t \leq T . (11.91)$

For the above example, we can also find the CDF of X(t) in the $\hat{ℙ}$ -measure, denoted by ${\hat{F}}_{X (t)}$ . It is instructive to see the two ways to obtain this CDF. One way is to simply use $X (t) = σ \hat{W} (t) \underline{\underline{d}} σ \sqrt{t} \hat{Z}, \hat{Z} ~ N o r m (0, 1)$ under measure $\hat{ℙ}$ :

${\hat{F}}_{X (t)} (x) \equiv \hat{ℙ} (X (t) \leq x) = \hat{ℙ} (\hat{Z} \leq \frac{x}{σ \sqrt{t}}) = N (\frac{x}{σ \sqrt{t}}) .$

The other way is to compute a ℙ-measure expectation using ϱt and apply the identity in (11.84) since $I_{{X (t) \leq x}} = I_{{μ t + σ W (t) \leq x}}$ is an ℱt-measurable random variable:

$\begin{array}{l} {\hat{F}}_{X (t)} (x) = \hat{E} [I_{{X (t) \leq x}}] = E [ϱ_{t} I_{{X (t) \leq x}}] & = & E [e^{- \frac{1}{2} γ^{2} t + γ W (t)} I_{{μ t + σ W (t) \leq x}}] \\ = & e^{- \frac{1}{2} γ^{2} t} E [e^{γ W (t)} I_{{W (t) \leq (x - μ t) / σ}}] \\ = & e^{- \frac{1}{2} γ^{2} t} \cdot e^{\frac{1}{2} γ^{2} t} N (\frac{x - μ t}{σ \sqrt{t}} - γ \sqrt{t}) \\ = & N (\frac{x - (μ + σ γ) t}{σ \sqrt{t}}) = N (\frac{x}{σ \sqrt{t}}) \end{array}$

where μ + σγ = 0. Note that here we used the expectation identity (A.2) in the Appendix where W(t) ~ Norm(0,t) under measure ℙ.

The CDF of X(t) in the ℙ-measure was already computed in Section 10.3.1, i.e.,

$F_{X (t)} (x) \equiv ℙ (X (t) \leq x) \equiv ℙ (W^{(μ, σ)} (t) \leq x) = N (\frac{x - μ t}{σ \sqrt{t}}) .$

We therefore see from the above two expressions for the CDF of the process at time t (in the two different measures) that the measure change $ℙ \to \hat{ℙ}$ eliminates the drift μt when γ = −μ/σ. Observe that X(t) ~ Norm(0,σ2t) under the $\hat{ℙ}$ -measure:

$\hat{E} [X (t)] = σ \hat{E} [\hat{W} (t)] = 0, \hat{E} [X^{2} (t)] = σ^{2} \hat{E} [{\hat{W}}^{2} (t)] = σ^{2} t .$

In contrast, X(t) ~ Norm(μt, σ2t) under the ℙ-measure.

In previous chapters we saw how measure changes are employed in discrete-time asset price models such as the binomial model. In particular, we discussed various risk-neutral measures. By using Girsanov’s Theorem, we can now consider our first example of how to construct a risk-neutral measure for a single stock GBM price process in continuous time.

Example 11.15

(Changing the drift in GBM) Assume a non-dividend-paying stock price process with SDE

$d S (t) = S (t) [μ d t + σ d W (t)],$

where {W(t)}t≤0 is a standard BM under the physical (real-world) measure ℙ, μ is a constant physical (i.e., historical) growth rate, and σ > 0 is a constant volatility. Find the risk-neutral probability measure $\tilde{ℙ}$ defined such that the discounted stock price process ${\bar{S} (t) ≔ e^{- r t} S (t)}_{0 \leq t \leq T,}$ for any T > 0, is a $\tilde{ℙ}$ -martingale, where r is a constant interest rate.

Solution. By the strong solution of the SDE

$\begin{array}{l} S (t) = S (0) e^{(μ - σ^{2} / 2) t + σ W (t)} \Rightarrow \bar{S} (t) = e^{- r t} S (t) & = & S (0) e^{(μ - r) t} \cdot e^{- σ^{2} t / 2 + σ W (t)} \\ \equiv & S (0) e^{(μ - r) t} \cdot ℰ_{t} (σ \cdot W) . \end{array}$

We recognize ${ε_{t} (σ \cdot W) ≔ e^{- σ^{2} t / 2 + σ W (t)}}_{t \geq 0}$ as a (exponential) ℙ-martingale with unit expectation, E[∈t(σ · W)] = 1 (see Example 10.2 in Chapter 10). So we now proceed to eliminate the drift μ − r by expressing W in terms of a new BM, $\tilde{W}$ , in the new measure $\tilde{ℙ}$ . Since μ − r and σ are constants, we can accomplish this by employing a measure change as in the above example:

$ϱ_{t} \equiv {(\frac{d \tilde{ℙ}}{d ℙ})}_{t} = e^{- \frac{1}{2} γ^{2} t + γ W (t)} \equiv ℰ_{t} (γ \cdot W), (11.92)$

where $\frac{d \tilde{ℙ}}{d ℙ} = ϱ_{T}$ and $\tilde{W} (t) = W (t) - γ t$ is a standard $\tilde{ℙ}$ -BM. Substituting $W (t) = \tilde{W} (t) + γ t$ into the above exponential expression gives

$\bar{S} (t) = S (0) e^{(μ - r) t} \cdot e^{- σ^{2} t / 2 + σ (\tilde{W} (t) + γ t)} = \bar{S} (0) e^{(μ - r + σ γ) t} \cdot ℰ_{t} (σ \cdot \tilde{W}) (11.93)$

where $\bar{S} (0) = S (0) .$ Note that ${ε_{t} (σ \cdot \tilde{W}) ≔ e^{- σ^{2} t / 2 + σ \tilde{W} (t)}}_{t \geq 0}$ is a $\tilde{ℙ}$ -martingale where:

$\tilde{E} [ℰ_{t} (σ \cdot \tilde{W}) | ℱ_{u}] = ℰ_{u} (σ \cdot \tilde{W}), u \leq t .$

Clearly, by setting γ = (r − μ)/σ, we have μ − r + σγ = 0 and this gives the unique measure change for eliminating the drift in (11.93), giving the discounted stock price process as a $\tilde{ℙ}$ -martingale, i.e.,

$\bar{S} (t) = \bar{S} (0) \cdot ℰ_{t} (σ \cdot \tilde{W}), 0 \leq t \leq T, (11.94)$

where

$\tilde{E} [\bar{S} (t) | ℱ_{u}] = \bar{S} (u), 0 \leq u \leq t \leq T . (11.95)$

In summary, the risk-neutral measure is the unique measure obtained with the Radon–Nikodym derivative process and measure change defined by (11.92) with γ = (r − μ)/σ:

${(\frac{d \tilde{ℙ}}{d ℙ})}_{t} = ℰ_{t} (\frac{(r - μ)}{σ} \cdot W), 0 \leq t \leq T; \frac{d \tilde{ℙ}}{d ℙ} = ℰ_{T} (\frac{(r - μ)}{σ} \cdot W) . (11.96)$

Note that the measure $\tilde{ℙ}$ is uniquely specified by (11.96), where γ = (r − μ)/σ always exists since σ > 0. We can also see directly how to choose the above measure change by working with the SDE where the Brownian increment $d W (t) = d \tilde{W} (t) + γ d t$ is used within the original SDE:

$d S (t) = S (t) [μ d t + σ (d \tilde{W} (t) + γ d t)] = S (t) [(μ + σ γ) d t + σ d \tilde{W} (t)] . (11.97)$

Taking the stochastic differential of $\bar{S} (t) \equiv e^{- r t} S (t)$ and using the above dS(t) term:

$\begin{array}{l} d \bar{S} (t) = d (e^{- r t} S (t)) & = & e^{- r t} [d S (t) - r S (t) d t] \\ = & e^{- r t} S (t) [(μ - r + σ γ) d t + σ d \tilde{W} (t)] \\ = & \bar{S} (t) [(μ - r + σ γ) d t + σ d \tilde{W} (t)] (11.98) \end{array}$

$\Rightarrow d \bar{S} (t) = σ \bar{S} (t) d \tilde{W} (t) (11.99)$

where the last expression with zero drift is obtained by choosing γ = (r − μ)/σ, i.e., by employing the measure change defined in (11.96). Note that the SDE in (11.99) with initial condition $\bar{S} (0)$ is equivalent to (11.94), which is its unique solution. For an arbitrary choice of γ the SDE with drift in (11.98) subject to initial condition $\bar{S} (0)$ is equivalent to (11.93), which is its unique solution. Finally, note that choosing γ = (r − μ)/σ in (11.97) gives the stock price drifting at the risk-free rate within the risk-neutral measure:

$d S (t) = S (t) [r d t + σ d \tilde{W} (t)] (11.100)$

with unique solution

$S (t) = S (0) e^{r t} \cdot ℰ_{t} (σ \cdot \tilde{W}) = S (0) e^{(r - σ^{2} / 2) t + σ \tilde{W} (t)} (11.101)$

equivalent to (11.94). The $\tilde{ℙ}$ -martingale property in (11.95) is equivalently expressed as

$\tilde{E} [S (t) | ℱ_{u}] = e^{r (t - u)} S (u), 0 \leq u \leq t \leq T . (11.102)$

In Example 11.14 we used Girsanov’s Theorem to obtain a new measure $\hat{ℙ}$ , defined by the Radon–Nikodym process in (11.91), such that the process X(t) ≡ W(μ,σ)(t) is a scaled standard $\hat{ℙ}$ -BM. We now employ the same measure change and thereby compute expectations and joint probabilities of events associated with the sampled maximum or minimum of BM with drift. In particular, let’s simply set σ = 1 and consider the process defined by (10.68) in Section 10.4.3, i.e.,

$X (t) = μ t + W (t) = \tilde{W} (t) .$

The expression in (11.91), for σ = 1, gives the Radon–Nikodym derivative for the change of measure $ℙ \to \hat{ℙ}, ϱ_{t} = {(\frac{d \hat{ℙ}}{d ℙ})}_{t} = e^{- \frac{1}{2} μ^{2} t - μ W (t)} .$ Hence, the Radon–Nikodym derivative for the change of measure $\hat{ℙ} \to ℙ$ is expressed in terms of the $\hat{W} (t) = W (t) + μ t,$ as

$\frac{1}{ϱ_{t}} = {(\frac{d ℙ}{d \hat{ℙ}})}_{t} = e^{\frac{1}{2} μ^{2} t + μ W (t)} = e^{- \frac{1}{2} μ^{2} t + μ \hat{W} (t)} . (11.103)$

Let A, B be any two Borel sets in ℝ and consider the $ℱ_{t}$ -measurable indicator random variables $I_{{M^{X} (t) \in A, X (t) \in B}}$ and $I_{{m^{X} (t) \in A, X (t) \in B}}$ where the respective sampled maximum, MX(t), and minimum, mX(t), of the drifted BM process X are defined in (10.70) and (10.71). That is,

$M^{X} (t) = \sup_{0 \leq u \leq t} X (u) = \sup_{0 \leq u \leq t} \hat{W} (u) \equiv M^{\hat{W}} (t)$

and

$m^{X} (t) = \inf_{0 \leq u \leq t} X (u) = \inf_{0 \leq u \leq t} \hat{W} (u) \equiv m^{\hat{W}} (t) .$

The sampled maximum M(t) ≡ MW(t) and minimum m(t) ≡ mW(t) of the standard ℙ-BM, W, are defined in (10.33) and (10.34). Applying the change of measure while using (11.103) within (11.84) gives

$\begin{array}{l} ℙ (M^{X} (t) \in A, X (t) \in B) & \equiv & E [I_{{M^{X} (t) \in A, X (t) \in B}}] \\ = & \hat{E} [ϱ_{t} I_{{M^{X} (t) \in A, X (t) \in B}}] \\ = & e^{- \frac{1}{2} μ^{2} t} \hat{E} [e^{μ \hat{W} (t)} I_{{M^{\hat{W}} (t) \in A, \hat{W} (t) \in B}}] \\ = & e^{- \frac{1}{2} μ^{2} t} E [e^{μ W (t)} I_{{M (t) \in A, W (t) \in B}}] . (11.104) \end{array}$

In the last equation line we simply removed all "hats" since the random variables $M^{\hat{W}} (t)$ and $\hat{W} (t)$ under measure $\hat{ℙ}$ are the same as M(t) and W(t) under measure ℙ. By the same steps as in (11.104) we have

$ℙ (m^{X} (t) \in A, X (t) \in B) = e^{- \frac{1}{2} μ^{2} t} E [e^{} I_{}] . (11.105)$

Equations (11.104) and (11.105) can be used to compute the probability of any joint event involving either pair MX(t), X(t) or mX(t), X(t). For example, taking intervals $A = (- \infty, m], B = (- \infty, x]$ gives the respective joint CDFs

$\begin{array}{l} F_{M^{X} (t), X (t)} (m, x) & ≔ & ℙ (M^{X} (t) \leq m, X (t) \leq x) \\ = & e^{- \frac{1}{2} μ^{2} t} E [e^{μ W (t)} I_{{M (t) \leq m, W (t) \leq x}}] (11.106) \end{array}$

and

$\begin{array}{l} F_{m^{X} (t), X (t)} (m, x) & ≔ & ℙ (m^{X} (t) \leq m, X (t) \leq x) \\ = & e^{- \frac{1}{2} μ^{2} t} E [e^{μ W (t)} I_{m (t) \leq m, W (t) \leq x}] . (11.107) \end{array}$

Expressing the expectation in (11.106) as an integral over the joint density of M(t), W(t):

$F_{M^{X} (t), X (t)} (m, x) = e^{- \frac{1}{2} μ^{2} t} \int_{0}^{m} \int_{- \infty}^{m} e^{μ y} f_{M (t), W (t)} (w, y) d y d w . (11.108)$

Differentiating, and making use of the known joint PDF of M(t), W(t) in (10.39), gives the joint PDF of MX(t),X(t)

$\begin{array}{l} f_{M^{X} (t), X (t)} (m, x) & = & e^{- \frac{1}{2} μ^{2} t + μ x} f_{M (t), W (t)} (m, x) \\ = & \frac{2 (2 m - x)}{t \sqrt{2 π t}} e^{- \frac{1}{2} μ^{2} t + μ x - {(2 m - x)}^{2} / 2 t,} (11.109) \end{array}$

for x ≤ m,m > 0 and zero otherwise. Similarly, the joint PDF of mX(t), X(t) follows from (11.107) and the joint PDF in (10.43),

$\begin{array}{l} f_{m^{X} (t), X (t)} (m, x) & = & e^{- \frac{1}{2} μ^{2} t + μ x} f_{m (t), W (t)} (m, x) \\ = & \frac{2 (x - 2 m)}{t \sqrt{2 π t}} e^{- \frac{1}{2} μ^{2} t + μ x - {(x - 2 m)}^{2} / 2 t,} (11.110) \end{array}$

for x ≤ m,m < 0, and zero otherwise. Other applications of (11.104) and (11.105) are given in Section 10.4.3.

11.9 Brownian Martingale Representation Theorem

Before moving on to the next section on multidimensional (vector) BM we state a result that we will later see has some theoretical importance in replication (hedging) and pricing derivative contracts within a continuous-time financial model driven by a single BM. We have already learned that, given an adapted process {X(t)}0≤t≤T with $\int_{0}^{T} E [X^{2} (t)] d t < \infty,$ the Itô process ${I (t) ≔ \int_{0}^{t} X (s) d W (s)}_{0 \leq t \leq T}$ is a (ℙ,?)-martingale where {W(t)}t≥0 is a (ℙ,?)-BM. A question that one may ask is: Are all (ℙ,?)-martingales expressible as an Itô process? It turns out that this is the case if we consider martingales that are square integrable and we also restrict the filtration to be the natural filtration generated by the BM, i.e., if $F = F^{W} = {F_{t}^{W}}_{t \geq 0} ≔ {σ (W (s) : 0 \leq s \leq t)}_{t \geq 0}$ . We summarize this in the following known theorem without proof.

Theorem 11.14

(Brownian Martingale Representation Theorem). Assume {M(t)}0≤t≤T is a (ℙ,?W)-martingale and that it is square integrable, i.e.,

$E [M^{2} (t)] < \infty, f o r e v e r y t \in [0, T] .$

Then, there exists an ?W -adapted process {θ(t)}0≤t≤T such that (a.s.)

$M (t) = M (0) + \int_{0}^{t} θ (u) d W (u) . (11.111)$

This theorem tells us that if a process is a square-integrable martingale, w.r.t. a given measure ℙ and natural filtration ?W generated by the standard ℙ-BM W, then it can be expressed as a sum of its initial value and an Itô integral in the ℙ-BM. The integrand of the Itô integral is a process that is adapted to ?W. Note that the Itô integral itself is a square-integrable (ℙ,?W)-martingale and also continuous in time. So the martingale having this representation is also continuous in time (i.e., the process has no jumps).

We are now ready to state a closely related result that is a consequence of the above theorem and will later be applicable to our discussion of derivative replication in Chapter 12. Let us consider what happens when we change measures $ℙ \to \hat{ℙ}$ as defined in Girsanov’s Theorem 11.13. As we already noted, ? could be any filtration for the ℙ-BM, W. Now set ? = ?W where {γ(t)}0≤t≤T is assumed to be ?W -adapted and clearly the time integral of this process occurring in (11.88) is ?W -adapted. In particular, the $σ (\int_{0}^{t} γ (s) d s) \subset ℱ_{t}^{W}$ for every t ≥ 0. Then, by the definition in (11.88), the σ-algebra $ℱ_{t}^{\hat{W}} ≔ σ (\hat{W} (u) : 0 \leq u \leq t) = ℱ_{t}^{W}$ . Hence, if {γ(t)}0≤t≤T is chosen as an ?W -adapted process, then the natural filtration $F^{\hat{W}} = {ℱ_{t}^{\hat{W}}}_{0 \leq t \leq T}$ , generated by $\hat{W}$ in (11.88), is equal to the natural filtration ?W, generated by W, i.e., $F^{W} = F^{\hat{W}}$ . In summary, by combining these facts with Theorem 11.14 we have the result below. This states that, if the change of measure $ℙ \to \hat{ℙ}$ is defined via Girsanov’s Theorem with an ?W -adapted process, then we can always express a square-integrable $(\hat{ℙ}, F^{W})$ -martingale as its initial value plus an Itô integral in the $\hat{ℙ}$ -BM.

Proposition 11.15.

Let the measure $\hat{ℙ}$ be defined as in Girsanov’s Theorem 11.13 with the assumption that the process {γ(t)}0≤t≤T is ?W -adapted. If {M(t)}0≤t≤T is a square-integrable $(\hat{ℙ}, F^{W})$ -martingale, then there exists an adapted process, say ${\hat{θ} (t)}_{0 \leq t \leq T}$ , such that (a.s.)

$M (t) = M (0) + \int_{0}^{t} \hat{θ} (u) d \hat{W} (u) . (11.112)$

Proof. By the above argument we have $F^{W} = F^{\hat{W}}$ . Hence, {M(t)}0≤t≤T is a square-integrable $(\hat{ℙ}, F^{\hat{W}})$ -martingale where $\hat{E} [M^{2} (t)] = E [ϱ_{t} M^{2} (t)] \leq E [M^{2} (t)] < \infty$ . It now follows trivially by Theorem 11.14 that there exists an $F^{\hat{W}}$ -adapted (and hence ?W-adapted) process ${\hat{θ} (t)}_{0 \leq t \leq T}$ such that (11.112) holds (a.s.).

11.10 Stochastic Calculus for Multidimensional BM

11.10.1 The Itô Integral and Itô's Formula for Multiple Processes on Multidimensional BM

We now extend the definition of one-dimensional standard BM {W(t)}t≥ into d dimensions for any finite integer d ≥ 1. As seen below, the extension to multiple dimensions is fairly straightforward as we take each component as an independent one-dimensional standard BM. Notation needs to be introduced to precisely denote each component BM and boldface is used for a vector BM.

Definition 11.4.

A standard BM in ℝd (or standard d-dimensional BM) is a vector process

$W (t) \equiv (W_{1} (t), W_{2} (t), ..., W_{d} (t)), t \geq 0,$

where each component process {Wi(t)}t≥0, 1 ≤ i ≤ d, is an independent one-dimensional standard BM in ℝ.

Hence, each component is i.i.d. where Wi(t) ~ Norm(0, t) and Wi(t) − Wi(s) ~ Norm(0, t − s), 1 ≤ i ≤ d, 0 ≤ s ≤ t. We call this a standard vector BM since, by construction, each component is an identical and independent copy of a one-dimensional standard BM. That is, Wi(t) and Wj(t) are independent if i ≠ j. A filtration $F = {ℱ_{t}}_{t \geq 0}$ is a filtration for standard d-dimensional BM if it is a filtration for each component BM, {Wi(t)}t≥0. The natural filtration for {W(t)}t≥0, denoted by ?W, is the filtration generated by all components of the standard d-dimensional BM. Given any filtration ? for {W(t)}t≥0, we must have that {W(t)}t≥0 is ?-adapted, i.e., W(t) is $ℱ_{t}$ -measurable, and that each Brownian vector increment W(t + s) − W(t) is independent of $ℱ_{t}$ for s, t ≥ 0.

Since each component is a standard BM, then we have the usual properties such as the quadratic variation formula for each 1 ≤ i ≤ d:

$[W_{i}, W_{i}] (t) = t \Rightarrow d [W_{i}, W_{i}] (t) \equiv d W_{i} (t) d W_{i} (t) = d t . (11.113)$

Moreover, [f, Wi](t) has zero covariation for any continuously differentiable function f(t). In particular, for each 1 ≤ i ≤ d,

$[t, W_{i}] (t) = 0 \Rightarrow d W_{i} (t) d t = 0. (11.114)$

The covariation of two independent Brownian motions is zero, i.e.,

$[W_{i}, W_{j}] (t) = 0 \Rightarrow d [W_{i}, W_{j}] (t) \equiv d W_{i} (t) d W_{j} (t) = 0, for i \neq j . (11.115)$

It is simple to see how this arises by considering a time partition {0 = t0, t1, . . . , tn = t} and forming the partial sum of products of individual Brownian increments:

$Q_{n}^{i, j} (t) ≔ \sum_{k = 1}^{n} (W_{i} (t_{k}) - W_{i} (t_{k - 1})) (W_{j} (t_{k}) - W_{j} (t_{k - 1}))$

for i ≠ j. Using the fact that the increments are all mutually independent with mean zero, E[Wi(tk) − Wi(tk−1)] = 0, for every k, then $E [Q_{n}^{i, j} (t)] = 0.$ Since all n terms in the sum are mutually independent, the variance of the sum is the sum of the individual variances. Using the independence of the product terms, where E[(Wi(tk) − Wi(tk−1))2] = E[(Wj(tk) − Wj(tk−1))2] = tk − tk−1, gives

$\begin{array}{l} Var (Q_{n}^{i, j} (t)) & = & \sum_{k = 1}^{n} E [{(W_{i} (t_{k}) - W_{i} (t_{k - 1}))}^{2}] E [{(W_{j} (t_{k}) - W_{j} (t_{k - 1}))}^{2}] \\ = & \sum_{k = 1}^{n} {(t_{k} - t_{k - 1})}^{2} \\ \leq & Δ_{n} \sum_{k = 1}^{n} (t_{k} - t_{k - 1}) = Δ_{n} (t_{n} - t_{0}) = Δ_{n} t \end{array}$

where $Δ_{n} ≔ \max_{k = 1, ... n} (t_{k} - t_{k - 1})$ is the maximum time increment over the partition. Clearly, $V a r (Q_{n}^{i, j} (t))) \to 0 as Δ_{n} \to 0,$ i.e., this implies that, for all t ≥ 0, the random variable $Q_{n}^{i, j} (t)$ converges to its expected value $E [Q_{n}^{i, j} (t)] = 0 as Δ_{n} \to 0$ and hence the co-variation $[W_{i}, W_{j}] (t) ≔ \lim_{Δ_{n} \to \infty} Q_{n}^{i, j} (t)$ must be zero for i ≠ j.

For convenience we summarize the above “basic rules” for the stochastic increments as follows:

$d W_{i} (t) d W_{j} (t) = δ_{i j} d t, d W_{i} (t) d t = 0, {(d t)}^{2} = 0, (11.116)$

where δij = 1 if i = j, and 0 if i ≠ j.

As in the case of standard BM in one dimension, there is a similar useful characterization of a standard d-dimensional BM due to Lévy which we state in the following lemma. The result can be proven based on multidimensional extensions of the Itô formula.

Theorem 11.16

(Lévy's Characterization of a Standard Multidimensional BM). Consider the vector-valued process {M(t)≔ M1(t), . . . , Md(t)}t⩾0 where each component process {Mi(t)}t≥0, 1 ≤ i ≤ d, is a continuous (ℙ, ?)-martingale starting at Mi(0) = 0 (a.s.) and having quadratic variation [Mi, Mi](t) = t, for all t ≥ 0. Also, assume [Mi, Mj](t) = 0 for i ≠ j. Then, {M(t)}t≥0 is a standard d-multidimensional (ℙ, ?)-BM.

According to this result, a vector process is a standard vector BM (in a given measure and filtration) if we can verify that every component process is a martingale with continuous paths starting at zero, has the same quadratic variation as a standard BM, and all covariations among different components are zero. Basically, this means that each component is an i.i.d. standard one-dimensional BM.

Let us fix a filtration $F = {ℱ_{t}}_{t \geq 0}$ for BM in ℝd for a given integer d ≥ 1. The formulae and concepts we developed in previous sections on the Itô integral, Itô's formula for a function of an Itô process and SDEs can be generalized to a multiple (vector) BM and multiple Itô processes that are driven by the vector BM in ℝd. Let us first discuss this extension for the case of BM in ℝ2, i.e., d = 2 where W(t) = (W1(t), W2(t)). We can have any number of Itô processes that can be represented as an Itô integral w.r.t. W(t) plus a drift term which is a Riemann (or Lebesgue) integral. Consider two Itô processes X ≡ {X(t)}t≥0 and Y ≡ {Y(t)}t≥0 which form a vector process, (X(t), Y(t))t≥0. Let µX(t) and µY(t) be $ℱ$ -adapted drift coefficients of processes X and Y, respectively. The diffusion or volatility coefficient vectors are $ℱ$ -adapted vectors in ℝ2 denoted by

$σ_{X} (t) = (σ_{X, 1} (t), σ_{X, 2} (t)) and σ_{Y} (t) = (σ_{Y, 1} (t), σ_{Y, 2} (t))$

for processes X and Y, respectively. The two processes have the representations:

$X (t) = X (0) + \int_{0}^{t} μ_{X} (u) d u + \int_{0}^{t} σ_{X} (u) \cdot dW (u), (11.117)$

$Y (t) = Y (0) + \int_{0}^{t} μ_{Y} (u) d u + \int_{0}^{t} σ_{Y} (u) \cdot d W (u) . (11.118)$

In each case, the first (Riemann or Lebesgue) integral is the drift term and the second integral is a sum of two Itô integrals; one is w.r.t. the first component of the volatility vector and the first BM and the second is w.r.t. the second component of the volatility vector and the second BM. That is, we define

$\int_{0}^{t} σ_{X} (u) \cdot d W (u) ≔ \int_{0}^{t} σ_{X, 1} (u) d W_{1} (u) + \int_{0}^{t} σ_{X, 2} (u) d W_{2} (u), (11.119)$

$\int_{0}^{t} σ_{Y} (u) \cdot d W (u) ≔ \int_{0}^{t} σ_{Y} (u) d W_{1} (u) + \int_{0}^{t} σ_{Y, 2} (u) d W_{2} (u) . (11.120)$

Given a time T > 0, throughout we shall assume the square integrability condition holds for the Itô integrals on all time intervals [0, t], 0 ≤ t ≤ T, i.e., given an adapted vector process {σ(t) = (σ1(t), σ2(t))}t≥0 then we assume

$E [{(\int_{0}^{T} σ (t) \cdot d W (t))}^{2}] = \int_{0}^{T} E [∥ σ (t) ∥^{2}] d t < \infty . (11.121)$

where ${|| σ (t) ||}^{2} \equiv \sum_{i = 1}^{d} σ_{i}^{2} (t)$ is the square magnitude of the volatility vector, e.g., for d = 2 then ${|| σ (t) ||}^{2} = σ_{1}^{2} (t) + σ_{2}^{2} (t)$ . This condition is equivalent to requiring $\int_{0}^{T} E [σ_{i}^{2} (t)] d t < \infty$ , for every component i, and it guarantees the martingale property,

$E [\int_{0}^{T} σ (s) \cdot d W (s) | ℱ_{t}] = \int_{0}^{t} σ (s) \cdot d W (s), 0 \leq t \leq T . (11.122)$

Hence, the d-dimensional Itô integrals have zero expectation. The equality in (11.121) is the Itô isometry formula for vector BM, which is a special case of the covariance formula,

$Cov (\int_{0}^{t} σ (s) \cdot d W (s), \int_{0}^{t} γ (s) \cdot d W (s)) = \int_{0}^{t} E [σ (s) \cdot γ (s)] d s, (11.123)$

where σ(t) and γ(t) are $ℱ_{t}$ -adapted d-dimensional vectors. This is readily derived by writing out the two Itô integrals as sums of (one-dimensional) Itô integrals, as in (11.119), and then using the covariance relation for each pair of Itô integrals. Also, we assume that any drift coefficient µ(t) is integrable,

$E [\int_{0}^{T} | μ (t) | d t] < \infty . (11.124)$

The Itô integrals in (11.119)−(11.120) are the one-dimensional Itô integrals w.r.t. a single standard BM which is taken as either W1 or W2. The Riemann (Lebesgue) integrals in (11.117) and (11.118) are continuous functions of time and therefore have zero quadratic variation. To obtain the quadratic variation of the X process, note that the quadratic variation of each Itô integral in (11.119),

$I_{X, 1} (t) ≔ \int_{0}^{t} σ_{X, 1} (u) d W_{1} (u) and I_{X, 2} (t) ≔ \int_{0}^{t} σ_{X, 2} (u) d W_{2} (u),$

is computed according to (11.11) (where W1 and W2 individually act as W):

$[I_{X, 1}, I_{X, 1}] (t) \equiv \int_{0}^{t} σ_{X, 1}^{2} (u) d u and [I_{X, 2}, I_{X, 2}] (t) \equiv \int_{0}^{t} σ_{X, 2}^{2} (u) d u . (11.125)$

Since [W1, W2](t) = 0, i.e., dW1(t) dW2(t) = 0, the covariation of the two integrals is zero: [IX, 1, IX, 2](t) = 0. Hence, the quadratic variation of the X process is the quadratic variation of the Itô integral in (11.119), which, in turn, is the sum of the two quadratic variations in (11.125):

$\begin{array}{l} [X, X] (t) = [I_{X, 1}, I_{X, 1}] (t) + [I_{X, 2}, I_{X, 2}] (t) = \int_{0}^{t} (σ_{X, 1}^{2} (u) + σ_{X, 2}^{2} (u)) d u \\ = \int_{0}^{t} ∥ σ_{X} (u) ∥^{2} d u . (11.126) \end{array}$

Similarly, the Y process has quadratic variation

$[Y, Y] (t) = \int_{0}^{t} (σ_{Y, 1}^{2} (u) + σ_{Y, 2}^{2} (u)) d u = \int_{0}^{t} ∥ σ_{Y} (u) ∥^{2} d u . (11.127)$

The stochastic differential forms of (11.126) and (11.127) are

$d [X, X] (t) = d X (t) d X (t) = (σ_{X, 1}^{2} (t) + σ_{X, 2}^{2} (t)) d t = ∥ σ_{X} (t) ∥^{2} d t, (11.128)$

$d [Y, Y] (t) = d Y (t) d Y (t) = (σ_{Y, 1}^{2} (t) + σ_{Y, 2}^{2} (t)) d t = ∥ σ_{Y} (t) ∥^{2} d t . (11.129)$

It is easier to obtain (11.128) and (11.129) by working directly with the stochastic differential forms of (11.117) and (11.118),

$\begin{array}{l} d X (t) = μ_{X} (t) d t + σ_{X} (t) \cdot d W (t) \equiv μ_{X} (t) d t + σ_{X, 1} (t) d W_{1} (t) + σ_{X, 2} (t) d W_{2} (t), \\ d Y (t) = μ_{Y} (t) d t + σ_{Y} (t) \cdot d W (t) \equiv μ_{Y} (t) d t + σ_{Y, 1} (t) d W_{1} (t) + σ_{Y, 2} (t) d W_{2} (t), \end{array}$

and then applying the rules in (11.116). For example, by squaring the differential dX(t) and setting the terms dt dW1(t) = dt dW2(t) = 0, dW1(t) dW2(t) = 0, (dt)2 = 0, and (dW1(t))2 = (dW2(t))2 = dt, we obtain

$\begin{array}{l} d X (t) d X (t) \equiv {(d X (t))}^{2} = {(μ_{X} (t) d t + σ_{X} (t) \cdot d W (t))}^{2} \\ = σ_{X} (t) \cdot σ_{X} (t) d t = ∥ σ_{X} (t) ∥^{2} d t . \end{array}$

This recovers the result in (11.128). A similar derivation based on squaring dY(t) gives (11.129). The covariation is also simpler to compute based on this differential approach. By multiplying the two stochastic differentials and applying the simple rules in (11.116),

$\begin{array}{l} d [X, Y] (t) = d X (t) d Y (t) & = & (μ_{X} (t) d t + σ_{X} (t) \cdot d W (t)) (μ_{Y} (t) d t + σ_{Y} (t) \cdot d W (t)) \\ = & (σ_{X} (t) \cdot d W (t)) (σ_{Y} (t) \cdot d W (t)) \\ = & σ_{X} (t) \cdot σ_{Y} (t) d t . (11.130) \end{array}$

The last equation line is obtained as follows:

$\begin{array}{l} (σ_{X} (t) \cdot d W (t)) (σ_{Y} (t) \cdot d W (t)) & = & \sum_{i = 1}^{d = 2} \sum_{j = 1}^{d = 2} σ_{X, i} (t) σ_{Y, j} (t) \underset{= δ_{i j} d t}{\underset{︸}{d W_{i} (t) d W_{j} (t)}} \\ = & (\sum_{i = 1}^{d = 2} σ_{X, i} (t) σ_{Y, i} (t)) d t = σ_{X} (t) \cdot σ_{Y} (t) d t . \end{array}$

The integral form of (11.130) gives the covariation of the two Itô processes,

$[X, Y] (t) = \int_{0}^{t} σ_{X} (u) \cdot σ_{Y} (u) d u = \int_{0}^{t} (σ_{X, 1} (u) σ_{Y, 1} (u) + σ_{X, 2} (u) σ_{Y, 2} (u)) d u .$

The Itô formula in (11.20) and (11.21) for a function of one Itô process, and time t, extends further to the slightly more general case of a function of two Itô processes and time t. We simply state this important result as a lemma (without proof). The main idea, and a simple way to remember the formula in (11.131), is to Taylor expand f(t, x, y) up to terms of order dt, (dx)2, (dy)2 and then replace ordinary variables x → X(t), y → Y(t) and ordinary differentials by their respective stochastic differentials: dx → dX(t), dy → dY(t), (dx)2 → (dX(t))2 ≡ d[X, X](t), (dy)2 → (dY(t))2 ≡ d[Y, Y](t), and dx dy → dX(t) dY(t) ≡ d[X, Y](t).

Lemma 11.17

(Itô Formula for a Function of Two Processes). Assume f(t, x, y) is a C1, 2, 2 function on ℝ+ × ℝ2, i.e., having continuous derivatives $f_{t} \equiv \frac{\partial f}{\partial t}, f_{x} \equiv \frac{\partial f}{\partial x}, f_{y} \equiv \frac{\partial f}{\partial y}, f_{x x} \equiv \frac{\partial^{2} f}{\partial x^{2}}, f_{x y} \equiv \frac{\partial^{2} f}{\partial x \partial y} a n d f_{y y} \equiv \frac{\partial^{2} f}{\partial y^{2}} .$ Let the processes X and Y be Itô processes as given in (11.117) and (11.118). Then, the process defined by F(t) ≔ f(t, X(t), Y(t)), t ≥ 0, has stochastic differential dF(t) ≡ df(t, X(t), Y(t)) given by

$\begin{array}{l} d f (t, X (t), Y (t)) = & f_{t} (t, X (t), Y (t)) d t + f_{x} (t, X (t), Y (t)) d X (t) + f_{y} (t, X (t), Y (t)) d Y (t) \\ + \frac{1}{2} f_{x x} (t, X (t), Y (t)) d [X, X] (t) + \frac{1}{2} f_{y y} (t, X (t), Y (t)) d [Y, Y] (t) \\ + f_{x y} (t, X (t), Y (t)) d [X, Y] (t) . (11.131) \end{array}$

In integral form,

$f (t, X (t), Y (t)) = f (0, X (0), Y (0)) (11.132)$

$\begin{array}{l} + \int_{0}^{t} [f_{u} (u, X (u), Y (u)) + \frac{1}{2} ∥ σ_{X} (u) ∥^{2} f_{x x} (u, X (u), Y (u)) \\ + \frac{1}{2} ∥ σ_{Y} (u) ∥^{2} f_{y y} (u, X (u), Y (u)) + σ_{X} (u) \cdot σ_{Y} (u) f_{x y} (u, X (u), Y (u))] d u \\ + \int_{0}^{t} f_{x} (u, X (u), Y (u)) d X (u) + \int_{0}^{t} f_{y} (u, X (u), Y (u)) d Y (u) . (11.133) \end{array}$

It should be remarked (and we shall see later when we present the general form of the Itô formula for functions of multiple processes driven by multiple Brownian motions) that this lemma is generally valid for any number d ≥ 1 of underlying Brownian motions, although we have focused our present discussion on taking d = 2 as the base case. For d ≥ 2 the volatilities are d-dimensional vectors and the standard BM is a d-dimensional vector (standard) BM. For the case that d = 1 we simply have the vectors becoming scalars, e.g., γX(t) → σX (t), σY(t) → σ(t), and W(t) → W(t).

Observe that the first integral in (11.133) is a Riemann (or Lebesgue) integral on the time interval [0, t], whereas the second and third integrals are stochastic integrals w.r.t. the Itô processes X in (11.117) and Y in (11.118). The representation of df(t, X(t), Y(t)) in (11.131) and its corresponding integral form in (11.133) is written in terms of the stochastic differentials of X and Y. The Itô formula is also equivalently rewritten by substituting the above stochastic differentials for dX(t) and dY (t). Then, (11.131) takes the form

$\begin{array}{l} d f & = (f_{t} + μ_{X} (t) f_{x} + μ_{Y} (t) f_{y} + \frac{1}{2} {|| σ_{X} (t) ||}^{2} f_{x x} + \frac{1}{2} {|| σ_{Y} (t) ||}^{2} f_{y y} + σ_{X} (t) \cdot σ_{Y} (t) f_{x y}) d t \\ + (f_{x} σ_{X} (t) + f_{y} σ_{Y} (t)) \cdot d W (t) \\ \equiv μ_{f} (t) d t + σ_{f} (t) \cdot d W (t) \end{array} (11.134)$

where f ≡ f(t, X(t),Y (t)), fx ≡ fx(t, X(t), Y (t)), etc., is used to compact the expressions. In the second equation line we simply identified the drift μf (t) and volatility vector σf (t) for the process {f(t, X(t), Y (t))}t⩾0. We see that μf(t) and σf(t) are adapted processes defined explicitly as functions of f(t, X(t), Y (t)) and its partial derivatives, as well as functions of linear combinations of the drift and volatility vector coefficients of processes X and Y. In particular, the volatility vector σf(t) := fx σX(t) + fy σ Y (t) = (σf,1(t),σf,2(t)) has components

$σ_{f, 1} (t) = f_{x} σ_{X, 1} (t) + f_{y} σ_{Y, 1} (t), σ_{f, 2} (t) = f_{x} σ_{X, 2} (t) + f_{y} σ_{Y, 2} (t) . (11.135)$

Hence, {F(t)}t≥0 ≡ {f(t, X(t), Y (t))}t≥0 is an Itô process satisfying the stochastic integral equation

$\begin{array}{l} F (t) & = F (0) + \int_{0}^{t} μ_{f} (u) d u + \int_{0}^{t} σ_{f} (u) \cdot d W (u) \\ \equiv F (0) + \int_{0}^{t} μ_{f} (u) d u + \int_{0}^{t} σ_{f, 1} (u) d W_{1} (u) + \int_{0}^{t} σ_{f, 2} (u) d W_{2} (u) . \end{array} (11.136)$

The following example shows that the Itô Product Rule, derived previously, now follows simply by applying the Itô formula in (11.131).

Example 11.16.

Let {X(t)}t⩾0 and {Y (t)}t⩾0 be Itô processes. Obtain the stochastic differential of their product.

Solution. Defining the function f(t, x, y) := xy gives the product F(t) := f(t, X(t), Y (t)) = X(t) Y (t) as an Itô process whose stochastic differential is given according to (11.131). In this case the function is independent of t and has derivatives:

$f_{t} = 0, f_{x} = y, f_{y} = x, f_{x y} = 1, f_{x x} = f_{y y} = 0.$

Substituting these terms into (11.131) (with x = X(t), y = Y (t)) gives

$\begin{array}{l} d (X (t) Y (t)) & \equiv d f (t, X (t), Y (t)) = 0 \cdot d t + Y (t) d X (t) + X (t) d Y (t) + \frac{1}{2} \cdot 0 \cdot d X (t) d X (t) \\ + \frac{1}{2} \cdot 0 \cdot d Y (t) d Y (t) + 1 \cdot d X (t) d Y (t) \\ = Y (t) d X (t) + X (t) d Y (t) + d X (t) d Y (t) . \end{array} (11.137)$

Assuming X(t)Y (t) ≠ 0, we note that a useful way to represent this is to divide by X(t)Y (t) (i.e., factor out the product), giving the relative differential

$\frac{d F (t)}{F (t)} \equiv \frac{d (X (t) Y (t))}{X (t) Y (t)} = \frac{d X (t)}{X (t)} + \frac{d Y (t)}{Y (t)} + \frac{d X (t)}{X (t)} \frac{d Y (t)}{Y (t)} . (11.138)$

We can also write this in the form of (11.134):

$\begin{array}{l} \frac{d (X (t) Y (t))}{X (t) Y (t)} & = (\frac{μ_{X} (t)}{X (t)} + \frac{μ_{Y} (t)}{Y (t)} + \frac{σ_{X} (t)}{X (t)} \cdot \frac{σ_{Y} (t)}{Y (t)}) d t + (\frac{σ_{X} (t)}{X (t)} + \frac{σ_{Y} (t)}{Y (t)}) \cdot d W (t) \\ \equiv μ_{X Y} (t) d t + σ_{X Y} (t) \cdot d W (t) . \end{array} (11.139)$

This shows how the drift μXY (t) and volatility vector σXY (t) for the product process F = XY are related to the drifts and volatility vectors of the processes X and Y.

Another important example is the Quotient Rule for the stochastic differential of a ratio of two Itô processes. This rule is useful when pricing derivatives where we need to compute the drift and volatility of a process defined by a ratio of two asset price processes.

Example 11.17.

Let {X(t)}t⩾0 and {Y (t)}t⩾0 ≠ 0 be Itô processes. Obtain the stochastic differential of their ratio $F (t) : = \frac{X (t)}{Y (t)}$ .

Solution. Let f(t, x, y) := x/y, i.e., f(t, X(t),Y (t)) = X(t)/Y (t) is an Itô process with its stochastic differential given by (11.131). The relevant partial derivatives are

$f_{t} = 0, f_{x} = \frac{1}{y}, f_{y} = - \frac{x}{y^{2}}, f_{x y} = - \frac{1}{y^{2}}, f_{y y} = \frac{2 x}{y^{3}}, f_{x x} = 0.$

Substituting these terms into (11.131) (with x = X(t),y = Y (t)) gives

$\begin{array}{l} d (\frac{X (t)}{Y (t)}) & \equiv d F (t) \equiv d f (t, X (t), Y (t)) \\ = \frac{1}{Y (t)} d X (t) - \frac{X (t)}{Y^{2} (t)} d Y (t) + \frac{X (t)}{Y^{3} (t)} {(d Y (t))}^{2} - \frac{1}{Y^{2} (t)} d X (t) d Y (t) \\ = (\frac{d X (t)}{Y (t)} - \frac{X (t)}{Y (t)} \frac{d Y (t)}{Y (t)}) (1 - \frac{d Y (t)}{Y (t)}) . \end{array} (11.140)$

This is written in a more convenient form (for later use) by dividing through by X(t)/Y (t):

$\frac{d F (t)}{F (t)} \equiv \frac{d \frac{X (t)}{Y (t)}}{\frac{X (t)}{Y (t)}} = (\frac{d X (t)}{X (t)} - \frac{d Y (t)}{Y (t)}) (1 - \frac{d Y (t)}{Y (t)}) . (11.141)$

Substituting the expressions for dX(t) and dY (t), applying the basic rules, and combining terms in dt and dW(t) gives the form in (11.134) as

$\begin{array}{l} \frac{d \frac{X (t)}{Y (t)}}{\frac{X (t)}{Y (t)}} & = (\frac{μ_{X} (t)}{X (t)} - \frac{μ_{Y} (t)}{Y (t)} + \frac{σ_{Y} (t)}{Y (t)} \cdot (\frac{σ_{Y} (t)}{Y (t)} - \frac{σ_{X} (t)}{X (t)})) d t + (\frac{σ_{X} (t)}{X (t)} - \frac{σ_{Y} (t)}{Y (t)}) \cdot d W (t) \\ \equiv μ_{\frac{X}{Y}} (t) d t + σ_{\frac{X}{Y}} (t) \cdot d W (t) . \end{array} (11.142)$

This gives the drift $μ_{\frac{X}{Y}} (t)$ and volatility vector $σ_{\frac{X}{Y}} (t)$ for the quotient process $F = \frac{X}{Y}$ in terms of the drifts and volatility vectors of the individual processes X and Y.

The Itô product and quotient rules in (11.139) and (11.142) take on a more compact form if the processes X and Y can be represented in terms of the so-called log-drifts and log-volatility vectors (sometimes also referred to as local drift and local volatility). It will turn out to be particularly convenient when we later model the asset (e.g., stock) price processes. That is, assume processes X and Y satisfy the SDEs

$\frac{d X (t)}{X (t)} = μ_{X} (t) d t + σ_{X} (t) \cdot d W (t), (11.143)$

$\frac{d Y (t)}{Y (t)} = μ_{Y} (t) d t + σ_{Y} (t) \cdot d W (t), (11.144)$

where μX (t),μY (t), σX (t), σY (t) are ℱt-adapted log-drifts and log-volatility vectors. Note that these SDEs are quite general. The difference is that the previous coefficients are related to these “log-coefficients” by sending the previous coefficients μX (t) → μX (t)X(t), μY (t) → μY (t)Y (t), σX (t) → σX (t)X(t), σY (t) → σY (t)Y (t). The Itô formula applied to (11.143) and (11.144) still gives (11.138) and (11.140). However, now the terms occurring in (11.139) and (11.142) simplify, where we replace the previous ratios $\frac{μ_{X} (t)}{X (t)} \to μ_{X} (t), \frac{μ_{Y} (t)}{Y (t)} \to μ_{Y} (t), \frac{σ_{X} (t)}{X (t)} \to σ_{X} (t), \frac{σ_{Y} (t)}{Y (t)} \to σ_{Y} (t)$ , giving

$\begin{array}{l} \frac{d (X (t) Y (t))}{X (t) Y (t)} & = (μ_{X} (t) + μ_{Y} (t) + σ_{X} (t) \cdot σ_{Y} (t)) d t + (σ_{X} (t) + σ_{Y} (t)) \cdot d W (t) \\ \equiv μ_{X Y} (t) d t + σ_{X Y} (t) \cdot d W (t) . \end{array} (11.145)$

Here, μXY (t) and σXY (t) denote the log-drift and log-volatility vector of process XY and

$\begin{array}{l} \frac{d \frac{X (t)}{Y (t)}}{\frac{X (t)}{Y (t)}} & = (μ_{X} (t) - μ_{Y} (t) + σ_{Y} (t) \cdot (σ_{Y} (t) - σ_{X} (t))) d t + (σ_{X} (t) - σ_{Y} (t)) \cdot d W (t) \\ \equiv μ_{\frac{X}{Y}} (t) d t + σ_{\frac{X}{Y}} (t) \cdot d W (t) \end{array} (11.146)$

where $μ_{\frac{X}{Y}} (t)$ and $σ_{\frac{X}{Y}} (t)$ denote the log-drift and log-volatility vector of process $\frac{X}{Y}$ .

For d = 1, the SDEs in (11.143)–(11.146) are all of the form in (11.26) where all volatility coefficients are scalars. The processes can therefore be represented as in (11.27). Given initial values X(0) and Y(0):

$\begin{array}{r} X (t) & = X (0) \exp [\int_{0}^{t} (μ_{X} (s) - \frac{1}{2} σ_{X}^{2} (s)) d s + \int_{0}^{t} σ_{X} (s) d W (s)], \\ Y (t) & = Y (0) \exp [\int_{0}^{t} (μ_{Y} (s) - \frac{1}{2} σ_{Y}^{2} (s)) d s + \int_{0}^{t} σ_{Y} (s) d W (s)], \\ X (t) Y (t) & = X (0) Y (0) \exp [\int_{0}^{t} (μ_{X Y} (s) - \frac{1}{2} σ_{X Y}^{2} (s)) d s + \int_{0}^{t} σ_{X Y} (s) d W (s)], \\ \frac{X (t)}{Y (t)} & = \frac{X (0)}{Y (0)} \exp [\int_{0}^{t} (μ_{\frac{X}{Y}} (s) - \frac{1}{2} σ_{\frac{X}{Y}}^{2} (s)) d s + \int_{0}^{t} σ_{\frac{X}{Y}} (s) d W (s)] . \end{array}$

The reader can verify that the above third equation obtains by multiplying the expressions in the first and second equations, while the fourth equation obtains by dividing the expressions in the first and second equations. In the special case that the log-drift and log-volatility vectors are nonrandom (constants or ordinary functions of time t) the above processes are all GBM processes.

The above representations readily extend to the general vector case of d ⩾ 1. Consider the X process. Its natural logarithm has SDE:

$\begin{array}{l} d \ln X (t) & = \frac{d X (t)}{X (t)} - \frac{1}{2} {(\frac{d X (t)}{X (t)})}^{2} \\ = μ_{X} (t) d t + σ_{X} (t) \cdot d W (t) - \frac{1}{2} {(σ_{X} (t) \cdot d W (t))}^{2} \\ = (μ_{X} (t) - \frac{1}{2} {|| σ_{X} (t) ||}^{2}) d t + σ_{X} (t) \cdot d W (t) . \end{array}$

In integral form,

$\ln \frac{X (t)}{X (0)} = \int_{0}^{t} (μ_{X} (s) - \frac{1}{2} {|| σ_{X} (s) ||}^{2}) d s + \int_{0}^{t} σ_{X} (s) \cdot d W (s) .$

By exponentiating, $\frac{X (t)}{X (0)} = \exp (\ln \frac{X (t)}{X (0)})$ ,

$X (t) = X (0) \exp [\int_{0}^{t} (μ_{X} (s) - \frac{1}{2} {|| σ_{X} (s) ||}^{2}) d s + \int_{0}^{t} σ_{X} (s) \cdot d W (s)] . (11.147)$

This expresses X(t) in the general case of d-dimensional BM and reduces to the above expression in case d = 1. Similar expressions hold for the other processes. In fact, given an adapted drift μ(t) and volatility vector σ(t) (satisfying the above integrability assumptions), the SDE

$\frac{d U (t)}{U (t)} = μ (t) d t + σ (t) \cdot d W (t) (11.148)$

with initial value U(0) is equivalent to the representation

$\begin{array}{l} U (t) & = U (0) \exp [\int_{0}^{t} (μ (s) - \frac{1}{2} {|| σ (s) ||}^{2}) d s + \int_{0}^{t} σ (s) \cdot d W (s)] \\ = U (0) e^{\int_{0}^{t} μ (s) d s} \cdot ε_{t} (σ \cdot W) \end{array} (11.149)$

where the vector BM version of the stochastic exponential in (11.28) is defined by

$ε_{t} (σ \cdot W) : = \exp [- \frac{1}{2} \int_{0}^{t} {|| σ (s) ||}^{2} d s + \int_{0}^{t} σ (s) \cdot d W (s)] . (11.150)$

Hence, each process satisfying (11.143)–(11.146) has an equivalent representation as in (11.149). For the X process we see that (11.147) has precisely the form in (11.149). For processes Y, XY, and X/Y the same form obtains in the obvious manner where the corresponding drifts and volatility vectors, for the respective processes, are substituted into (11.149).

We now further extend our discussion to arbitrary dimensions d ⩾ 1. As already noted, all the formulae presented so far are valid for d ⩾ 1. Given a d-dimensional ℱt-adapted vector γ(t)=(γ1(t),...,γd(t)), the Itô integral w.r.t. d-dimensional BM is defined by the sum of one-dimensional Itô integrals,

$\int_{0}^{t} γ (s) \cdot d W (s) : = \sum_{i = 1}^{d} \int_{0}^{t} γ_{i} (s) d W_{i} (s) . (11.151)$

Throughout we shall assume that all such Itô integrals are square-integrable martingales for all times 0 ⩽ t ⩽ T, given some T > 0, where (11.121)–(11.123) hold.

Let ${X (t) \equiv (X_{1} (t), X_{2} (t), ..., X_{n} (t))}_{t ⩾ 0} \in ℝ, n ⩾ 1$ , be an n-dimensional Itô vector process where each component is a real-valued Itô process driven by a d-dimensional BM:

$\begin{array}{l} d X_{i} (t) & = μ_{i} (t) d t + \sum_{j = 1}^{d} σ_{i j} (t) d W_{j} (t) \\ \equiv μ_{i} (t) d t + σ_{i} (t) \cdot d W (t) \end{array} (11.152)$

with corresponding integral form

$\begin{array}{l} X_{i} (t) & = \int_{0}^{t} μ_{i} (s) d s + \sum_{j = 1}^{d} \int_{0}^{t} σ_{i j} (s) d W_{j} (s) \\ \equiv \int_{0}^{t} μ_{i} (s) d s + \int_{0}^{t} σ_{i} (s) \cdot d W (s) \end{array} (11.153)$

for i = 1,...,n. Each {μi(t)}t⩾0 is an integrable adapted process. The coefficients σij(t) are adapted and satisfy the square-integrability condition, $\int_{0}^{T} E [σ_{i j}^{2} (s)] d s < \infty$ . The n × d matrix of coefficients σ(t) := [σij(t)]i=1,...,n; j=1,...,d is the matrix of volatilities where the ith row gives the volatility vector σi(t) of the ith process Xi:

$σ_{i} (t) = (σ_{i 1} (t), σ_{i 2} (t), ..., σ_{i d} (t)), i = 1, ..., n .$

The jth component of σi(t) is the volatility coefficient σij (t).

Being Ft-adapted, the coefficients μi(t) and σij(t) can generally depend on the entire path of the vector process X up to time t, e.g., they can be functionals of the Brownian path {W(s) : 0 ⩽ s ⩽ t}. Generally this can be an intractable situation. However, an important case (and one that leads to some tractable models) is when these coefficients are known (defined) functions of the endpoint value of the process: μi(t) := μi(t, X(t)) and σi(t) := σi(t, X(t)). In this case, we say that the coefficients are state dependent and the vector process {X(t)}t⩾0 is a vector-valued diffusion process with each component process solving an SDE of the form

$d X_{i} (t) = μ_{i} (t, X (t)) d t + σ_{i} (t, X (t)) \cdot d W (t) . (11.154)$

We will return to this case later.

We now turn to the more general multidimensional version of the Itô formula by extending Lemma 11.17 to the case of a function of time and any n ⩾ 1 processes. In preparation, we already computed the quadratic variation and the covariation of two Itô processes driven by a vector BM (see (11.126), (11.127), and (11.130)). In particular, any component process Xi has quadratic variation

$[X_{i}, X_{i}] (t) = \int_{0}^{t} {|| σ_{i} (u) ||}^{2} d u = \sum_{j = 1}^{d} \int_{0}^{t} σ_{i j}^{2} (u) d u (11.155)$

which is written equivalently in differential form as

$d [X_{i}, X_{i}] (t) \equiv d X_{i} (t) d X_{i} (t) \equiv {(d X_{i} (t))}^{2} = {|| σ_{i} (t) ||}^{2} d t . (11.156)$

Any pair of processes Xi, Xj has covariation

$[X_{i}, X_{j}] (t) = \int_{0}^{t} σ_{i} (u) \cdot σ_{j} (u) d u, (11.157)$

or in differential form,

$d [X_{i}, X_{j}] (t) \equiv d X_{i} (t) d X_{j} (t) = σ_{i} (t) \cdot σ_{j} (t) d t . (11.158)$

It is also useful to write the above as d[Xi, Xj](t)= Cij (t)dt by defining the coefficients

$C_{i j} (t) : = σ_{i} (t) \cdot σ_{j} (t) = \sum_{k = 1}^{n} σ_{i k} (t) σ_{j k} (t) . (11.159)$

These are the elements of an n × n matrix C(t) := [Cij(t)]i,j=1,...,n, where Cij(t) are related to the instantaneous covariances between the differential increments of the two processes Xi and Xj. In terms of the matrix σ(t), the instantaneous covariances are given by Cij(t) = (σ(t) σ(t)T)ij where σ(t)T is the d × n transpose of the matrix σ(t). Given the instantaneous covariances, we also define the instantaneous correlations

$ρ_{i j} (t) : = \frac{σ_{i} (t) \cdot σ_{j} (t)}{|| σ_{i} (t) || || σ_{j} (t) ||} = \frac{C_{i j} (t)}{|| σ_{i} (t) || || σ_{j} (t) ||} . (11.160)$

[Remark: We can see, at least heuristically, that these coefficients are a measure of the instantaneous correlations between the increments of a pair of processes Xi and Xj when we divide the differential of the covariation with the square root of the product of the differentials of the variations, $\frac{d [X_{i}, X_{j}] (t)}{\sqrt{d [X_{i}, X_{i}] (t) \cdot d [X_{j}, X_{j}]} (t)} = \frac{C_{i j} (t)}{|| σ_{i} (t) || || σ_{j} (t) ||} = ρ_{i j} (t)$ .]

As in the case of Lemma 11.17, the main idea, which also gives us a simple way to remember the Itô formula for smooth functions f(t, x1,...,xn) of n variables and time t, is to apply a Taylor expansion of f up to terms of first order in the time increment dt, and up to second order in the increments dx1,..., dxn. The stochastic differential form of the Itô formula is then obtained upon replacing the ordinary variables Xi → Xi(t), and replacing all ordinary differentials by the respective stochastic differentials, dxi → dXi(t), dxi dxj → d[Xi,Xj](t), i, j = 1,..., n. We then also write the formula more explicitly by applying the basic rules in (11.116). The formal proof (which we omit) follows in the same manner as the proof for the two variable case in Lemma 11.17. Here we simply state this important result as a lemma.

Lemma 11.18

(Itô Formula for a Function of Several Processes). Let the vector-valued process {X(t) ≡ (X1(t), X2 (t),...,Xn(t))}t⩾0 satisfy the SDE in (11.152) and assume f(t, x) ≡ f(t, x1 ,...,xn) is a real-valued function on $ℝ_{+} \times ℝ^{n}$ that is continuously differentiable with respect to time t and twice continuously differentiable with respect to the n variables x1,...,xn. Then, the process defined by F(t) := f(t, X(t)) ≡ f(t, X1 (t),...,Xn(t)), t ⩾ 0, is an Itô process with stochastic differential dF (t) ≡ df(t, X(t)) given by

$\begin{array}{l} d F (t) & = \frac{\partial f}{\partial t} (t, X (t)) d t + \sum_{i = 1}^{n} \frac{\partial f}{\partial x_{i}} (t, X (t)) d X_{i} (t) \\ + \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} \frac{\partial^{2} f}{\partial x_{i} \partial x_{j}} (t, X (t)) d [X_{i}, X_{j}] (t) \end{array} (11.161)$

Note that the formula in (11.131) is recovered in the special case that n = 2 by setting X1 (t) ≡ X(t),X2 (t) ≡ Y(t). The stochastic differential in (11.161) has meaning when written as an Itô process in integral form. Using (11.158), (11.161) takes the form

This expression involves the time differential and a linear combination of stochastic differentials in the component processes. By further making use of (11.152), we obtain Itô’s formula, extending (11.134), where the stochastic differential is written in terms of the vector BM increment:

$\begin{array}{l} d f (t, X (t)) & = [\frac{\partial f}{\partial t} (t, X (t)) + \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} C_{i j} (t) \frac{\partial^{2} f}{\partial x_{i} \partial x_{j}} (t, X (t)) + \sum_{i = 1}^{n} μ_{i} (t) \frac{\partial f}{\partial x_{i}} (t, X (t))] d t \\ + \sum_{i = 1}^{n} \frac{\partial f}{\partial x_{i}} (t, X (t)) σ_{i} (t) \cdot d W (t) \\ \equiv μ_{f} (t) d t + σ_{f} (t) \cdot d W (t) . \end{array} (11.163)$

In the second equation line we identified the drift and volatility vector of the process {f(t, X(t))}t⩾0.

11.10.2 Multidimensional SDEs, Feynman–Kac Formulae, and Transition CDFs and PDFs

The main concepts, theorems, and formulae that we established in Section 11.7 for the case of a single process driven by a one-dimensional BM also carry over into the multidimensional case with appropriate assumptions in place. Here we only give a very brief account of some of the relevant results. Our main starting point is the n-dimensional diffusion process solving the system of SDEs in (11.154), i.e.,

$d X_{i} (t) = μ_{i} (t, X (t)) d t + \sum_{j = 1}^{d} σ_{i j} (t, X (t)) d W_{j} (t), i = 1, ..., n . (11.164)$

In integral form,

$X_{i} (t) = X_{i} (0) + \int_{0}^{t} μ_{i} (s, X (s)) d s + \sum_{j = 1}^{d} \int_{0}^{t} σ_{i j} (s, X (s)) d W_{j} (s), i = 1, ..., n . (11.165)$

The drift and volatility coefficients, μi(t, x) and σij(t, x), are given functions of time and variables x =(x1,...,xn).

Theorem 11.4, which provides sufficient conditions on the existence and uniqueness of a strong solution to the one-dimensional SDE (11.24), extends in a similar manner to the above system of SDEs. The absolute values for the Lipschitz condition and the linear growth condition on the coefficients are now replaced by appropriate vector and matrix norms. We denote the drift vector by μ(t, x) = (μ1 (t, x),...,μn(t, x)). The norm of a vector $v \in ℝ^{n}$ is $|| v || : = \sqrt{\sum_{i = 1}^{n} υ_{i}^{2}}$ and the norm of a matrix A with elements aij is defined similarly as $|| A || : = \sqrt{\sum_{i, j} a_{i j}^{2}}$ . If there is a constant K > 0 such that the Lipschitz condition

$|| μ (t, x) - μ (t, y) || + || σ (t, x) - σ (t, y) || ⩽ K || x - y ||$

and the linear growth condition

$|| μ (t, x) || + || σ (t, x) || ⩽ K (1 + || x ||)$

are both satisfied, for $x, y \in ℝ^{n}$ , then this ensures that there is a unique vector process {X(t)}t⩾0 solving (11.165) and that the paths of the vector process are continuous in time. As in the one-dimensional case, these conditions are not necessary, but are sufficient conditions to guarantee the existence of a unique strong solution.

The solution to (11.165) is also a vector Markov process, i.e.,

$ℙ (X (t) ⩽ y | ℱ_{s}) = ℙ (X (t) ⩽ y | X (s)) (11.166)$

for all $y \in ℝ^{n}$ or, for Borel function $h : ℝ^{n} \to ℝ$ ,

$E [h (X (t)) | ℱ_{s}] = E [h (X (t)) | X (s)], 0 ⩽ s ⩽ t . (11.167)$

The conditional expectation of h(X(T)), given the vector value $X (t) = x \in ℝ^{n}$ at a time t ⩽ T , is denoted by

$E_{t, x} [h (X (T))] : = E [h (X (T)) | X (t) = x] .$

As in the scalar case, the subscripts t, x are shorthand for conditioning on a given vector value X(t)= x at time t. The conditional expectation is a function of the ordinary variables x and t, i.e., Et,x[h(X(T))] = g(t, x) for fixed T. The Markov property is expressible as E[h(X(T)) |ℱt] = Et,X(t)[h(X(T))] = g(t, X(t)), for all 0 ⩽ t ⩽ T. Hence, if we know the conditional probability distribution of the random vector X(T), given X(t) = x, then we can compute the function g(t, x).

As in the scalar case, the conditional PDF of X(T), given X(t) is the (joint) transition PDF, p(t, T; x, y) ≡ p(t, T; x1,...,xn,y1,...,yn), for the vector process X obtained by differentiating the corresponding (joint) transition CDF, P(t, T; x, y):

$p (t, T; x, y) = \frac{\partial^{n}}{\partial y_{1 \dots} \partial y_{n}} P (t, T; x, y), (11.168)$

$\begin{array}{l} P (t, T; x, y) & : = ℙ (X (T) ⩽ y | X (t) = x) \\ \equiv ℙ (X_{1} (T) ⩽ y_{1}, ..., X_{n} (T) ⩽ y_{n} | X_{1} (t) = x_{1}, ..., X_{n} (t) = x_{n}) \\ = \int_{- \infty}^{y_{1}} \dots \int_{- \infty}^{y_{n}} p (t, T; x, z) d z . \end{array} (11.169)$

As in the one-dimensional case, the Markov and tower property lead to the multidimensional version of the Chapman–Kolmogorov relation:

$p (s, t; x, y) = \int_{ℝ^{n}} p (s, u; x, z) p (u, t; z, y) d z, s < u < t . (11.170)$

Given any Borel set $B \in ℬ (ℝ^{n})$ , the probability that the time-t vector process has value in B, given that it has value $x \in ℝ^{n}$ at some earlier time s < t, is given by integrating the transition PDF over B:

$P (X (t) \in B | X (s) = x) = \int_{B} p (s, t; x, y) d y . (11.171)$

The multidimensional analogue of (11.41) for computing a conditional expectation is an integral over $ℝ^{n}$ :

$E_{t, x} [h (X (T))] = \int_{ℝ^{n}} h (y) p (t, T; x, y) d y, t < T . (11.172)$

As in the case of a scalar diffusion on ℝ with the generator in (11.43), the generator for the above vector diffusion process {X(t)}t⩾0 on $ℝ^{n}$ is defined by the differential operator $G_{t, x}$ acting on a smooth function f = f(t, x),

$G_{t, x} f : = \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} C_{i j} (t, x) \frac{\partial^{2} f}{\partial x_{i} \partial x_{j}} + \sum_{i = 1}^{n} μ_{i} (t, x) \frac{\partial f}{\partial x_{i}} (11.173)$

where ${[C_{i j} (t, x) = \sum_{k = 1}^{d} σ_{i k} (t, x) σ_{j k} (t, x)]}_{i, j = 1, ..., n}$ is the diffusion matrix. We shall assume that this matrix is positive definite where $v^{T} Cv > 0$ for nonzero $v \in ℝ^{n}$ . The differential and integral forms of the Itô formula are now written compactly (extending (11.44) and (11.45) to the multidimensional case):

$d f (t, X (t)) = (\frac{\partial}{\partial t} + G_{t, x}) f (t, X (t)) d t + \sum_{j = 1}^{d} (\sum_{i = 1}^{n} \frac{\partial f}{\partial x_{i}} (t, X (t)) σ_{i j} (t, X (t))) d W_{j} (t) (11.174)$

and

$\begin{array}{l} f (t, X (t)) & = f (0, X (0)) + \int_{0}^{t} (\frac{\partial}{\partial s} + G_{s, x}) f (s, X (s)) d s \\ + \sum_{j = 1}^{d} \int_{0}^{t} (\sum_{i = 1}^{n} \frac{\partial f}{\partial x_{i}} (s, X (s)) σ_{i j} (s, X (s))) d W_{j} (s) . \end{array} (11.175)$

The analogues of (11.46)–(11.48) also follow if we fix a time T > 0 and assume the square-integrability condition,

$\int_{0}^{T} E [{(\sum_{i = 1}^{n} \frac{\partial f}{\partial x_{i}} (s, X (s)) σ_{i j} (s, X (s)))}^{2}] d s < \infty, j = 1, ..., n, (11.176)$

which ensures that all Itô integrals (w.r.t. each BM, Wj) in (11.175) are martingales. By using a similar argument as in the one-dimensional case, the process {Mf (t)}0⩽t⩽T defined by

$M_{f} (t) : = f (t, X (t)) - \int_{0}^{t} (\frac{\partial}{\partial s} + G_{s, x}) f (s, X (s)) d s (11.177)$

is a martingale, i.e., (11.48) holds. As a particular application of (11.177), we obtain a martingale defined by Mf (t) := f(t, X(t)) if the function f solves the PDE: $\frac{\partial f}{\partial t} + G_{t, x} f = 0$ .

The martingale property of the process defined in (11.177) allows us to extend Theorem 11.7, Theorem 11.8 and Theorem 11.9 to the multidimensional case. Here, we simply state useful versions of the multidimensional extensions. Their proofs involve some similar steps as in the one-dimensional case.

Theorem 11.19.

(Multidimensional Feynman–Kac). Let {X(t) := (X1 (t),...,Xn(t))}0⩽t⩽T solve the system of SDEs in (11.164) and let $ϕ : ℝ^{n} \to ℝ$ be a Borel function. Also, assume the square-integrability condition (11.176) holds. Suppose the function f(t, x) is a solution to the backward Kolmogorov PDE $\frac{\partial f}{\partial t} + G_{t, x} f = 0$ , i.e.,

$\frac{\partial f}{\partial t} (t, x) + \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} C_{i j} (t, x) \frac{\partial^{2} f}{\partial x_{i} \partial x_{j}} (t, x) + \sum_{i = 1}^{n} μ_{i} (t, x) \frac{\partial f}{\partial x_{i}} (t, x) = 0, (11.178)$

for all $x \in ℝ^{n}$ , t < T , subject to the terminal condition f(T, x) = ϕ(x). Then, assuming $E_{t, x} [| ϕ (X (T)) |] < \infty, f (t, x)$ has the representation

$f (t, x) = E_{t, x} [ϕ (X (T))] \equiv E [ϕ (X (T)) | X (t) = x] (11.179)$

for all $x \in ℝ^{n}$ , 0 ⩽ t ⩽ T.

The slightly more general result below includes an additional exponential discount factor via a discounting function r(t, x). Theorem 11.19 is then a particular case of this theorem by simply setting r(t, x) ≡ 0.

Theorem 11.20.

(“Discounted” Feynman–Kac). Let {X(t) := (X1 (t),...,Xn(t))}0⩽t⩽T solve the system of SDEs in (11.164) and assume the square integrability condition (11.176) holds. Let $ϕ : ℝ^{n} \to ℝ$ be a Borel function and $r (t, x) : [0, T] \times ℝ^{n} \to ℝ$ be a lower-bounded continuous function. Then, the function defined by the conditional expectation

$f (t, x) : = E_{t, x} [e^{- \int_{t}^{T} r (u, X (u)) d u} ϕ (X (T))] \equiv E [e^{- \int_{t}^{T} r (u, X (u)) d u} ϕ (X (T)) | X (t) = x] (11.180)$

solves the $P D E \frac{\partial f}{\partial t} + G_{t, x} f - r (t, x) f = 0$ , i.e.,

for all x, 0 < t < T, subject to the terminal condition f(T, x) = ϕ(x).

In the special case that the discount function is a constant, r(t, x) ≡ r, then (11.180) simplifies since the discount factor is simply $e^{- r (T - t)}$ where $f (t, x) = e^{- r (T - t)} E_{t, x} [ϕ (X (T))]$ .

The multidimensional version of Proposition 11.8 also follows where both the transition PDF and CDF solve the backward Kolmogorov PDE in (11.178) in the (backward) variables (t, x). In particular, fixing a time T > 0 and a vector $y \in ℝ^{n}$ , and setting $ϕ (x) = I_{{x_{1} ⩽ y_{1}, ..., x_{n} ⩽ y_{n}}}$ in Proposition 11.19 implies that the transition CDF

$P (t, T; x, y) \equiv E [I_{{X_{1} (T) ⩽ y_{1}, ..., X_{n} (T) ⩽ y_{n}}} | X_{1} (t) = x_{1}, ..., X_{n} (t) = x_{n}]$

solves the PDE in (11.178) with terminal condition as the indicator function, $P (T, T; x, y) = ℙ (x_{1} ⩽ y_{1}, ..., x_{n} ⩽ y_{n}) = I_{{x_{1} ⩽ y_{1}, ..., x_{n} ⩽ y_{n}}}$ . The transition PDF, p = p(t, T; x, y), is obtained from the CDF by differentiating in the y variables, according to (11.168). Hence, p also solves (11.178) and the terminal condition is given by

$\lim_{t ↗ T} p (t . T; x, y) = δ (y - x)$

where δ(y − x) = δ(y1 − x1) ... δ(yn − xn) is the n-dimensional Dirac delta function as a product of univariate delta functions.

The transition PDF p(t, T; x, y) is the conditional PDF of the random vector X(T) at y, given X(t) = x. Hence, according to (11.179) the solution to the PDE problem takes the form of an integral of the product of the transition PDF and the function ϕ(y):

$f (t, x) = E [ϕ (X (T)) | X (t) = x] = \int_{ℝ^{n}} ϕ (y) p (t, T; x, y) d y . (11.182)$

That is, the transition PDF is the fundamental solution to the PDE problem stated in Theorem 11.19.

For many applications the vector diffusion process is time homogeneous where the drift and diffusion matrix are time independent, μi(t, x) ≡ μi(x) and $C_{i j} (t, x) = C_{i j} (x) = \sum_{k = 1}^{d} σ_{i k} (x) σ_{j k} (x)$ . The generator is then a differential operator only in the x variables,

$G_{t, x} = G_{x} : = \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} C_{i j} (x) \frac{\partial^{2}}{\partial x_{i} \partial x_{j}} + \sum_{i = 1}^{n} μ_{i} (x) \frac{\partial}{\partial x_{i}} .$

Defining τ := T − t, the solution in (11.179) is a function of (τ, x), i.e., f = f(τ, x), and the backward PDE in (11.178) takes the form

$\frac{\partial f}{\partial τ} = \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} C_{i j} (x) \frac{\partial^{2} f}{\partial x_{i} \partial x_{j}} + \sum_{i = 1}^{n} μ_{i} (x) \frac{\partial f}{\partial x_{i}} (11.183)$

subject to the initial condition f(0, x) = ϕ(x). Moreover, if the discount function in Theorem 11.20 is time independent, r(t, x) ≡ r(x), then the operator $G_{t, x} - r (t, x) \equiv G_{x} - r (x)$ is time independent, i.e., the PDE in (11.181) is time-homogeneous:

with the solution represented in (11.180) as a function f = f(τ, x) having initial condition f(0, x) = ϕ(x).

For the time-homogeneous case we hence have the transition CDF and PDF as functions of τ, x, y where we equivalently write P(t, T; x, y) as P(τ; x, y) and p(t, T; x, y) as p(τ; x, y).

Both P(τ; x, y) and p(τ; x, y) solve the PDE in (11.183) where p(0+; x, y) = δ(x − y) and P(0; x, y) given by the n-dimensional unit step function $I_{{y ⩾ x}} = I_{{y_{1} ⩾ x_{1}, ..., y_{n} ⩾ x_{n}}}$ .

As a first example of how Theorem 11.19 can be applied in practice, consider a simple 2-dimensional process (n = 2) driven by a 2-dimensional BM (d = 2). That is, let ${X (t) = {[X_{1} (t), X_{2} (t)]}^{T}}_{t ⩾ 0} \in ℝ^{2}$ be two scaled and drifted Brownian motions satisfying the system of SDEs:

$\begin{array}{l} d X_{1} (t) & = μ_{1} d t + σ_{1} d W_{1} (t) \equiv μ_{1} d t + σ_{1} \cdot d W (t), \\ d X_{2} (t) & = μ_{2} d t + σ_{2} ρ d W_{1} (t) + σ_{2} \sqrt{1 - ρ^{2}} d W_{2} (t) \equiv μ_{2} d t + σ_{2} \cdot d W (t), \end{array} (11.185)$

where ρ ∊ (−1, 1) is a constant correlation coefficient, σ1 = [σ1, 0] and $σ_{2} = [σ_{2} ρ, σ_{2} \sqrt{1 - ρ^{2}}]$ are volatility vectors with magnitudes $|| σ_{1} || = σ_{1}, || σ_{2} || = σ_{2}$ . Note that σ1·σ2 = ρσ1σ2. We can also represent this system of SDEs in vector-matrix notation, dX(t) = μ dt + σ dW(t):

$[\begin{matrix} d X_{1} (t) \\ d X_{2} (t) \end{matrix}] = [\begin{matrix} μ_{1} \\ μ_{2} \end{matrix}] d t + [\begin{matrix} σ_{1} & 0 \\ σ_{2} ρ & σ_{2} \sqrt{1 - ρ^{2}} \end{matrix}] [\begin{matrix} d W_{1} (t) \\ d W_{2} (t) \end{matrix}] .$

The above 2 × 2 is the σ-matrix whose rows correspond to the volatility vectors σ1 and σ2. The diffusion matrix C = σσT is then

$C = [\begin{matrix} σ_{1} & 0 \\ σ_{2} ρ & σ_{2} \sqrt{1 - ρ^{2}} \end{matrix}] [\begin{matrix} σ_{1} & σ_{2} ρ \\ 0 & σ_{2} \sqrt{1 - ρ^{2}} \end{matrix}] = [\begin{matrix} σ_{1}^{2} & ρ σ_{1} σ_{2} \\ ρ σ_{1} σ_{2} & σ_{2}^{2} \end{matrix}],$

where the 2 × 2 matrix σ is the lower Cholesky factorization of C.

The SDEs in (11.185), subject to arbitrary initial conditions X1(t) = x1, X2(t) = x2, are solved by simply integrating from time t to T:

$\begin{array}{l} X_{1} (T) = x_{1} + μ_{1} (T - t) + σ_{1} \cdot (W (T) - W (t)), \\ X_{2} (T) = x_{2} + μ_{2} (T - t) + σ_{2} \cdot (W (T) - W (t)) . \end{array} (11.186)$

We can express these random variables in terms of two standard normal random variables:

$\begin{array}{l} X_{1} (T) = x_{1} + μ_{1} τ + σ_{1} \sqrt{τ} Z_{1}, \\ X_{2} (T) = x_{2} + μ_{2} τ + σ_{2} \sqrt{τ} Z_{2} . \end{array} (11.187)$

where

$Z_{1} : = \frac{σ_{1} \cdot (W (T) - W (t))}{σ_{1} \sqrt{τ}}, Z_{2} : = \frac{σ_{2} \cdot (W (T) - W (t))}{σ_{2} \sqrt{τ}}, (11.188)$

τ := T − t. The correlation (and covariance) between these two standard normals equals ρ. This follows from the fact that Wi(T) − Wi(t), i = 1, 2, are i.i.d. Norm(0,τ):

$\begin{array}{l} Cov (Z_{1}, Z_{2}) & = \frac{1}{σ_{1} σ_{2} τ} Cov (σ_{1} \cdot (W (T) - W (t)), σ_{2} \cdot (W (T) - W (t))) \\ = \frac{1}{σ_{1} σ_{2} τ} \sum_{i = 1}^{2} \sum_{j = 1}^{2} σ_{1 i} σ_{2 j} Cov (W_{i} (T) - W_{i} (t), W_{j} (T) - W_{j} (t)) \\ = \frac{1}{σ_{1} σ_{2} τ} \sum_{i = 1}^{2} (σ_{1 i} σ_{2 i}) τ = \frac{σ_{1} \cdot σ_{2}}{σ_{1} σ_{2}} = ρ . \end{array}$

Hence, by (11.187), the matrix of correlations, ρij, of the component processes is the 2 × 2 correlation matrix given by ρ11 = ρ22 = 1, ρ12 = ρ21 = ρ, i.e.,

$ρ_{12} : = Corr (X_{1} (T), X_{2} (T)) = \frac{Cov ((X_{1} (T), X_{2} (T))}{\sqrt{Var (X_{1} (T)) Var (X_{2} (T))}} = Cov (Z_{1}, Z_{2}) = ρ .$

The covariance matrix of X(T) is given by Cov(Xi(T), Xj(T)) = Cijτ = (σi · σj)τ = ρijσiσj τ; i, j = 1, 2. The time-scaled solution vector $\frac{1}{\sqrt{τ}} X (T)$ is a bivariate normal,

${[\frac{X_{1} (T)}{\sqrt{τ}}, \frac{X_{2} (T)}{\sqrt{τ}}]}^{Τ} ~ N o r m_{2} ({[\frac{x_{1} + μ_{1} τ}{\sqrt{τ}}, \frac{x_{2} + μ_{2} τ}{\sqrt{τ}}]}^{Τ}, C) .$

The time-homogeneous transition CDF for the vector process is obtained by computing a joint conditional probability while using (11.186), or (11.187), and the fact that W(T) − W(t) is independent of X(t), i.e., the pair Z1, Z2 is independent of the pair X1(t), X2(t):

$\begin{array}{l} P (τ; x_{1}, x_{2}, y_{1}, y_{2}) & : = ℙ (X_{1} (T) ⩽ y_{1}, X_{2} (T) ⩽ y_{2} | X_{1} (t) = x_{1}, X_{2} (t) = x_{2}) \\ = ℙ (x_{1} + μ_{1} τ + σ_{1} \sqrt{τ} Z_{1} ⩽ y_{1}, x_{2} + μ_{2} τ + σ_{2} \sqrt{τ} Z_{2} ⩽ y_{2}) \\ = ℙ (Z_{1} ⩽ \frac{y_{1} - x_{1} - μ_{1} τ}{σ_{1} \sqrt{τ}}, Z_{2} ⩽ \frac{y_{2} - x_{2} - μ_{2} τ}{σ_{2} \sqrt{τ}}) \\ = N_{2} (\frac{y_{1} - x_{1} - μ_{1} τ}{σ_{1} \sqrt{τ}}, \frac{y_{2} - x_{2} - μ_{2} τ}{σ_{2} \sqrt{τ}}; ρ) . \end{array} (11.189)$

This is a bivariate normal CDF and differentiating (using the chain rule) gives the transition PDF as a bivariate normal density,

$\begin{array}{l} p (τ; x_{1}, x_{2}, y_{1}, y_{2}) & \equiv \frac{\partial^{2}}{\partial y_{1} \partial y_{2}} P (τ; x_{1}, x_{2}, y_{1}, y_{2}) \\ = \frac{1}{σ_{1} σ_{2} τ} n_{2} (\frac{y_{1} - x_{1} - μ_{1} τ}{σ_{1} \sqrt{τ}}, \frac{y_{2} - x_{2} - μ_{2} τ}{σ_{2} \sqrt{τ}}; ρ) \\ = \frac{1}{2 π τ σ_{1} σ_{2} \sqrt{1 - ρ^{2}}} \exp (- \frac{z_{1}^{2} + z_{2}^{2} - 2 ρ z_{1} z_{2}}{2 (1 - ρ^{2})}), \end{array} (11.190)$

where $z_{1} = \frac{y_{1} - x_{1} - μ_{1} τ}{σ_{1} \sqrt{τ}}$ , $z_{2} = \frac{y_{2} - x_{2} - μ_{2} τ}{σ_{2} \sqrt{τ}}$ . According to Theorem 11.19, both transition CDF and PDF, P and p, solve the time-homogeneous PDE in (11.183) in the variables τ, x1, x2, for fixed arbitrary real values of y1, y2. Using the above explicit constant expressions for $C_{11} = σ_{1}^{2}$ , $C_{22} = σ_{2}^{2}$ , C12 = C21 = ρσ1σ2, and the constant drift coefficients μ1 and μ2, p and P solve the backward PDE, i.e.,

$\frac{\partial p}{\partial τ} = \frac{1}{2} σ_{1}^{2} \frac{\partial^{2} p}{\partial x_{1}^{2}} + \frac{1}{2} σ_{2}^{2} \frac{\partial^{2} p}{\partial x_{2}^{2}} + ρ σ_{1} σ_{2} \frac{\partial^{2} p}{\partial x_{1} \partial x_{2}} + μ_{1} \frac{\partial p}{\partial x_{1}} + μ_{2} \frac{\partial p}{\partial x_{2}}$

and similarly for P. We leave it as an exercise for the reader to show that the transition CDF in (11.190) has limit $P (0 +; x_{1}, x_{2}, y_{1}, y_{2}) = I_{{y_{1} ⩾ x_{1}, y_{2} ⩾ x_{2}}}$ for all x ≠ y. The Dirac delta function initial condition for the transition PDF then follows by formally differentiating the step function to obtain p(0+; x1, x2, y1, y2) = δ(y1 − x1)δ(y2 − x2). The reader can verify by direct differentiation that the transition PDF in (11.190) satisfies the above PDE and is hence the fundamental solution. As a density in the variables y1, y2, the transition PDF should also integrate to unity over $ℝ^{2}$ . That is, the event ${(X_{1} (T), X_{2} (T)) \in ℝ^{2}}$ , conditional on X1(t) = x1, X2(t) = x2, must have unit probability. This is directly verified as follows:

$\begin{array}{l} ℙ (X_{1} (T) < \infty, X_{2} (T) < \infty | X_{1} (t) = x_{1}, X_{2} (t) = x_{2}) \\ = \lim_{y_{1} \to \infty, y_{2} \to \infty} P (τ; x_{1}, x_{2} y_{1}, y_{2}) \\ = \lim_{y_{1} \to \infty, y_{2} \to \infty} N_{2} (\frac{y_{1} - x_{1} - μ_{1} τ}{σ_{1} \sqrt{τ}}, \frac{y_{2} - x_{2} - μ_{2} τ}{σ_{2} \sqrt{τ}}; ρ) \\ = N_{2} (\infty, \infty; ρ) = 1. \end{array}$

From (11.169), this implies that the PDF integrates to unity for all $x_{1}, x_{2} \in ℝ$ ,

$\int_{- \infty}^{\infty} \int_{- \infty}^{\infty} p (τ; x_{1}, x_{2}, y_{1}, y_{2}) d y_{1} d y_{2} = 1.$

Alternatively, this is easily shown by directly integrating the bivariate density in (11.190). Note that in the special case when ρ = 0, the two processes are independent (uncorrelated) drifted and scaled BM. The joint transition PDF and CDF are simply products of the one-dimensional PDFs and CDFs of the component processes. This is consistent with the fact that the above PDE is separable in the variables x1 and x2 and hence admits a solution as a product of individual functions of x1 and x2.

Let us now consider a multidimensional GBM process. This is an important example that arises in Chapter 13 where we consider derivative pricing within a standard economic model containing multiple stocks whose price processes are correlated geometric Brownian motions. In particular, consider n ⩾ 1 strictly positive stock price processes S(t) := (S1(t),...,Sn(t)), t ⩾ 0, that are driven by a standard d ⩾ 1 dimensional BM, W(t) = (W1(t), ..., Wd(t)):

$\begin{array}{l} d S_{i} (t) & = S_{i} (t) [μ_{i} d t + \sum_{j = 1}^{d} σ_{i j} d W_{j} (t)] \\ \equiv S_{i} (t) [μ_{i} d t + σ \cdot d W (t)], i = 1, ..., n, \end{array} (11.191)$

where the log-drifts μi and log-volatilities σij are assumed to be constant parameters. It is important to stress the distinction that these are log-coefficients, although by standard convention we are still using similar symbols for the drift and volatility coefficient functions! To be precise, by identifying the SDE in (11.191) with that in (11.164) (where X(t) → S(t)) we see that the drift and volatility coefficient functions for the i-th stock price in (11.191) are state-dependent (time-independent) linear functions of the i-th variable

$μ_{i} (t, x) \equiv μ_{i} (x) = μ_{i} x_{i} and σ_{i j} (t, x) \equiv σ_{i j} (x) = σ_{i j} x_{i} (11.192)$

where on the right of each equality are the log-drift and log-volatility parameters μi and σij. [We note that the symbols μi and σij are constant parameters when denoted without arguments and they are the drift and volatility coefficient functions when denoted with arguments.] So the above SDEs are time homogeneous with linear functions μi(t, S(t)) ≡ μi(S(t)) = μiSi(t) and σij(t, S(t)) ≡ σij (S(t)) = Si(t)σij.

The log-volatility coefficient matrix $σ = {[σ_{i j}]}_{i = 1, ..., n; j = 1, ..., d}$ is an n × d constant matrix with the i-th row being the 1 × d volatility vector σi = [σi1 ..., σid] for the i-th stock price process. The system of SDEs in (11.191) has matrix-vector form:

$[\begin{matrix} \frac{d S_{1} (t)}{S_{1} (t)} \\ ⋮ \\ \frac{d S_{n} (t)}{S_{n} (t)} \end{matrix}] = [\begin{matrix} μ_{1} \\ ⋮ \\ μ_{n} \end{matrix}] d t + [\begin{matrix} σ_{11} & \dots & σ_{1 d} \\ ⋮ & ⋮ \\ σ_{n 1} & \dots & σ_{n d} \end{matrix}] [\begin{matrix} d W_{1} (t) \\ ⋮ \\ d W_{d} (t) \end{matrix}] . (11.193)$

As shown just below, the log-diffusion matrix C = σσT is proportional to the n × n matrix of covariances among the log-returns of the stocks. We assume that C is nonsingular. As usual, we define the n × n matrix of correlations, $ρ : = {[ρ_{i j}]}_{i, j = 1, ..., n}$ where Cij = σi ·σj = ρijσiσj , where the i-th volatility vector has magnitude denoted by σi > 0,

$C_{i i} = σ_{i}^{2} \equiv {|| σ_{i} ||}^{2} = σ_{i 1}^{2} + ... + σ_{i d}^{2} .$

The system in (11.191) is readily solved by considering the log-prices defined by Xi(t) := ln Si(t). In fact, we have already solved this problem. See (11.148)–(11.150), where each SDE in (11.191) is of the form in (11.148) with solution of the form in (11.149). For the sake of clarity, we repeat the same steps here by using Itô’s formula where (upon substituting the expression in (11.191)):

$\begin{array}{l} d X_{i} (t) = d \ln S_{i} (t) = \frac{d S_{i} (t)}{S_{i} (t)} - \frac{1}{2} {(\frac{d S_{i} (t)}{S_{i} (t)})}^{2} & = (μ_{i} - \frac{1}{2} {|| σ_{i} ||}^{2}) d t + σ_{i} \cdot d W (t) \\ = (μ_{i} - \frac{1}{2} σ_{i}^{2}) d t + σ_{i} \cdot d W (t) \end{array}$

with initial condition Xi(0) = ln Si(0), where Si(0), i = 1, . . . , n, are the initial stock prices. Integrating and exponentiating gives the stock prices $S_{i} (t) = e^{X_{i} (t)}$ for all t ⩾ 0:

$S_{i} (t) = S_{i} (0) e^{(μ_{i} - \frac{1}{2} σ_{i}^{2}) t + σ_{i} \cdot W (t)} = S_{i} (0) e^{μ_{i} t} ε_{t} (σ_{i} \cdot W) . (11.194)$

It is easy to verify by computing the stochastic differential of this expression, upon directly applying Itô’s formula, that each Si(t) solves (11.191). The solution in (11.194) is in fact the unique strong solution subject to the initial price vector [S1(0),...,Sn(0)]T.

The second expression in (11.194) involves an exponential ℙ-martingale,

$ε_{t} (σ_{i} \cdot W) = \exp [- \frac{1}{2} σ_{i}^{2} t + σ_{i} \cdot W (t)] = \exp [- \frac{1}{2} σ_{i}^{2} t + \sum_{j = 1}^{d} σ_{i j} W_{j} (t)] .$

To see that this is a ℙ-martingale with expectation one, note that σi · W(t) is normal with mean $E [σ_{i} \cdot W (t)] = \sum_{j = 1}^{d} σ_{i j} E [W_{j} (t)] = 0$ and variance

$Var (σ_{i} \cdot W (t)) = Var (\sum_{j = 1}^{d} σ_{i j} W_{j} (t)) = \sum_{j = 1}^{d} σ_{i j}^{2} Var (W_{j} (t)) = {|| σ_{i} ||}^{2} t = σ_{i}^{2} t .$

Here we used the fact that all Wj(t) BMs are i.i.d. Norm(0, t). Hence, by the expression for the m.g.f. of a normal random variable, $E [e^{α σ_{i} \cdot W (t)}] = e^{\frac{1}{2} α^{2} α_{i}^{2} t}$ for any $α, i . e ., E [e^{σ_{i} \cdot W (t)}] = e^{\frac{1}{2} α_{i}^{2} t}$ and $E [ε_{t} (σ_{i} \cdot W)] = 1$ , for all t ⩾ 0. So the mean of the price in (11.194) is

$E [S_{i} (t)] = S_{i} (0) e^{μ_{i} t} E [ε_{t} (σ_{i} \cdot W)] = S_{i} (0) e^{μ_{i} t} . (11.195)$

As in the one-dimensional GBM stock model, the log-normal drift parameter μi is therefore the (physical) growth rate of the i-th price process in the (physical) measure ℙ.

From the strong solution in (11.194) we have

$S_{i} (T) = S_{i} (t) e^{(μ_{i} - \frac{1}{2} σ_{i}^{2}) (T - t) + σ_{i} \cdot (W (T) - W (t))} . (11.196)$

It is convenient to define the log-return random variables over a time interval τ := T − t,

$X_{i} : = \ln \frac{S_{i} (T)}{S_{i} (t)} = σ_{i} + σ_{i} \sqrt{τ} Z_{i}, (11.197)$

$α_{i} \equiv (μ_{i} - \frac{1}{2} σ_{i}^{2}) τ$ , where

$Z_{i} : = \frac{σ_{i} \cdot (W (T) - W (t))}{σ_{i} \sqrt{τ}} = \frac{1}{σ_{i} \sqrt{τ}} \sum_{j = 1}^{d} σ_{i j} (W_{j} (T) - W_{j} (t)),$

i = 1, ..., n. The Zi’s are normal random variables since they are linear combinations of Brownian increments. The vector Z = [Z1,...,Zn]T has multivariate normal distribution:

${[Z_{1}, ..., Z_{n}]}^{Τ} ~ N o r m_{n} (0, ρ) . (11.198)$

To verify this, the covariances are computed using the same steps as in the calculation of the covariance of Z1 and Z2 in (11.188):

$\begin{array}{l} Cov (Z_{i}, Z_{j}) & = \frac{1}{σ_{i} \sqrt{τ}} \frac{1}{σ_{j} \sqrt{τ}} Cov (σ_{i} \cdot (W (T) - W (t)), σ_{j} \cdot (W (T) - W (t))) \\ = \frac{1}{σ_{i} \sqrt{τ}} \frac{1}{σ_{j} \sqrt{τ}} σ_{i} \cdot σ_{j} τ = \frac{σ_{i} \cdot σ_{j}}{σ_{i} σ_{j}} = ρ_{i j} . \end{array} (11.199)$

Hence, Cov(Xi, Xj) = τσiσj Cov(Zi,Zj) = τσiσjρij = τCij. The log-returns are therefore jointly normally distributed:

${[X_{1}, ..., X_{n}]}^{Τ} ~ N o r m_{n} ({[α_{1}, ..., α_{n}]}^{Τ}, τ C) . (11.200)$

In particular, the matrix ρ is in fact the matrix of correlations of the stock price log-returns:

$Corr (X_{i}, X_{j}) = \frac{Cov (X_{i}, X_{j})}{\sqrt{Var (X_{i}) Var (X_{j})}} = \frac{τ C_{i j}}{σ_{i} σ_{j}} = ρ_{i j} . (11.201)$

The above GBM process is time homogeneous. Let x = (x1, ..., xn) and y = (y1, ..., yn) be strictly positive vectors in $ℝ_{+}^{n}$ . The (joint) transition CDF of the stock price process S(t) is given by the conditional probability below which is calculated by using independence among all log-returns ${X_{i} \equiv \frac{S_{i} (T)}{S_{i} (t)}}_{i = 1, ..., n}$ and time-t stock prices {Si(t)}i = 1, ..., n:

$\begin{array}{l} P (τ; x, y) & : = ℙ (S_{1} (T) ⩽ y_{1}, ..., S_{n} (T) ⩽ y_{n} | S_{1} (t) = x_{1}, ..., S_{n} (t) = x_{n}) \\ = ℙ (\ln \frac{S_{1} (T)}{S_{1} (t)} ⩽ \ln \frac{y_{1}}{x_{1}}, ..., \ln \frac{S_{n} (T)}{S_{n} (t)} ⩽ \ln \frac{y_{n}}{x_{n}} | S_{1} (t) = x_{1}, ..., S_{n} (t) = x_{n}) \\ = ℙ (X_{1} ⩽ \ln \frac{y_{1}}{x_{1}}, ..., X_{n} ⩽ \ln \frac{y_{n}}{x_{n}}) \\ = ℙ (Z_{1} ⩽ a_{1}, ..., Z_{n} ⩽ a_{n}) \\ = N_{n} (a_{1} ..., a_{n}; ρ) \end{array} (11.202)$

where $α_{i} \equiv \frac{\ln \frac{y_{i}}{x_{i}} - α_{i}}{σ_{i} \sqrt{τ}} = \frac{\ln \frac{y_{i}}{x_{i}} - (μ_{i} - σ_{i}^{2} / 2) τ}{σ_{i} \sqrt{τ}}$ . The function Nn(a1, ..., an; ρ) is the n-variate standard normal CDF of Z with given correlation matrix ρ:

$N_{n} (a_{1}, ..., a_{n}; ρ) = \int_{- \infty}^{a_{n}} \dots \int_{- \infty}^{a_{1}} n_{n} (z_{1}, ..., z_{n}; ρ) d z_{1} ... d z_{n} .$

The standard normal PDF of Z, $n_{n} (z_{1}, ..., z_{n}; ρ) = \frac{\partial^{n}}{\partial z_{1 \dots} \partial z_{n}} N_{n} (z_{1} ..., z_{n}; ρ)$ , is given by the n-variate Gaussian density

$n_{n} (z_{1}, ..., z_{n}; ρ) = \frac{1}{\sqrt{{(2 π)}^{n} \det ρ}} \exp (- \frac{1}{2} z \cdot ρ^{- 1} \cdot z^{Τ}), (11.203)$

$z = [z_{1}, ..., z_{n}] \in ℝ^{n}$ . Differentiating according to (11.168), and applying the chain rule, gives the (joint) transition PDF for the time-homogeneous GBM stock price process:

$\begin{array}{l} p (τ; x,y) & = \frac{\partial^{n}}{\partial y_{1} ... \partial y_{n}} P (τ; x, y) = (\prod_{i = 1}^{n} \frac{\partial a_{i}}{\partial y_{i}}) \frac{\partial^{n}}{\partial a_{1} ... \partial a_{n}} N_{n} (a_{1} ..., a_{n}; ρ) \\ = (\prod_{i = 1}^{n} \frac{1}{y_{i} σ_{i} \sqrt{τ}}) n_{n} (a_{1} ..., a_{n}; ρ) \\ = \frac{1}{y_{1} ... y_{n} σ_{1} ... σ_{n} \sqrt{{(2 π τ)}^{n} \det ρ}} \exp (- \frac{1}{2} a \cdot ρ^{- 1} \cdot a^{Τ}), \end{array} (11.204)$

a = [a1, ..., an] and ai’s defined above. This is of the form of a multivariate log-normal density. Note that this density can also be written in terms of the covariance matrix C = DρD, D = diag(σ1, ..., σn), ρ−1 = DC−1D.

By (11.192), the diffusion matrix function has elements

$C_{i j} (x) = \sum_{k = 1}^{d} σ_{i k} (x) σ_{j k} (x) = \sum_{k = 1}^{d} x_{i} σ_{i k} x_{j} σ_{j k} = x_{i} x_{j} \sum_{k = 1}^{d} σ_{i k} σ_{j k} = x_{i} x_{j} C_{i j}$

with constants Cij = σi·σj = ρijσiσj. Hence, the time-homogeneous PDE in (11.183) takes the equivalent form:

$\begin{array}{l} \frac{\partial f}{\partial τ} & = \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} ρ_{i j} σ_{i} σ_{j} x_{i} x_{j} \frac{\partial^{2} f}{\partial x_{i} \partial x_{j}} + \sum_{i = 1}^{n} μ_{i} x_{i} \frac{\partial f}{\partial x_{i}} \\ = \frac{1}{2} \sum_{i = 1}^{n} σ_{i}^{2} x_{i}^{2} \frac{\partial^{2} f}{\partial x_{i}^{2}} + \sum_{j = 1}^{n} \sum_{i < j} ρ_{i j} σ_{i} σ_{j} x_{i} x_{j} \frac{\partial^{2} f}{\partial x_{i} \partial x_{j}} + \sum_{i = 1}^{n} μ_{i} x_{i} \frac{\partial f}{\partial x_{i}} . \end{array} (11.205)$

The Feynman–Kac Theorem 11.19 assures us that the transition CDF in (11.202) and PDF in (11.204) both solve the PDE (11.205) in the variables τ > 0, $x \in ℝ_{+}^{n}$ , for fixed $y \in ℝ_{+}^{n}$ . The initial condition $P (0; x, y) = I_{{x ⩽ y}} \equiv I_{{x_{1} ⩽ y_{1}, ..., x_{n} ⩽ y_{n}}}$ follows from the basic limit properties of the multivariate normal CDF. We leave it to the reader to verify. Then, by multiple differentiation of the step functions, the initial condition p(0+; x, y) = δ(x − y) is obtained for the transition PDF.

A common case is when n = d = 2. The above formulation simplifies since we have only one correlation coefficient ρ for the log-returns of stocks 1 and 2, where ρ12 = ρ21 ≡ ρ, ρ11 = ρ22 = 1. The log-diffusion matrix of covariances has elements C11 = σ12, C22 = σ22, C12 = C21 = ρσ1σ2. The log-volatility vectors are σ1 = [σ11, σ12]= [σ1, 0] and $σ_{2} = [σ_{21}, σ_{22}] = [ρ σ_{2}, σ_{2} \sqrt{1 - ρ^{2}}]$ . In this case, the system of SDEs in (11.191) simplifies for two stock prices driven by two BMs:

$\begin{array}{l} d S_{1} (t) & = S_{1} (t) [μ_{1} d t + σ_{1} d W_{1} (t)] \\ d S_{2} (t) & = S_{2} (t) [μ_{2} d t + σ_{2} ρ d W_{1} (t) + σ_{2} \sqrt{1 - ρ^{2}} d W_{2} (t)] . \end{array} (11.206)$

The unique solution is

$\begin{array}{l} S_{1} (t) & = S_{1} (0) e^{(μ_{1} - \frac{1}{2} σ_{1}^{2}) t + σ_{1} W_{1} (t)}, \\ S_{2} (t) & = S_{2} (0) e^{(μ_{2} - \frac{1}{2} σ_{2}^{2}) t + σ_{2} (ρ W_{1} (t) + \sqrt{1 - ρ^{2}} W_{2} (t))} . \end{array}$

The transition CDF and PDF obtain as a special case of (11.202) and (11.204) for n = 2:

$P (τ; x_{1}, x_{2}, y_{1}, y_{2}) = N_{2} (\frac{\ln \frac{y_{1}}{x_{1}} - (μ_{1} - \frac{1}{2} σ_{1}^{2}) τ}{σ_{1} \sqrt{τ}}, \frac{\ln \frac{y_{2}}{x_{2}} - (μ_{2} - \frac{1}{2} σ_{2}^{2}) τ}{σ_{2} \sqrt{τ}}; ρ) (11.207)$

and

$P (τ; x_{1}, x_{2}, y_{1}, y_{2}) = \frac{1}{y_{1} y_{2} σ_{1} σ_{2} τ} n_{2} (\frac{\ln \frac{y_{1}}{x_{1}} - (μ_{1} - \frac{1}{2} σ_{1}^{2}) τ}{σ_{1} \sqrt{τ}}, \frac{\ln \frac{y_{2}}{x_{2}} - (μ_{2} - \frac{1}{2} σ_{2}^{2}) τ}{σ_{2} \sqrt{τ}}; ρ) . (11.208)$

By the Feynman–Kac Theorem 11.19, these functions solve the time-homogeneous PDE in (11.205) for n = 2:

$\frac{\partial f}{\partial τ} = \frac{1}{2} σ_{1}^{2} x_{1}^{2} \frac{\partial^{2} f}{\partial x_{1}^{2}} + \frac{1}{2} σ_{2}^{2} x_{2}^{2} \frac{\partial^{2} f}{\partial x_{2}^{2}} + ρ σ_{1} σ_{2} x_{1} x_{2} \frac{\partial^{2} f}{\partial x_{1} \partial x_{2}} + μ_{1} x_{1} \frac{\partial f}{\partial x_{1}} + μ_{2} x_{2} \frac{\partial f}{\partial x_{2}} . (11.209)$

The reader can verify that the transition CDF has the limiting form $P (0 +; x_{1}, x_{2}, y_{1}, y_{2}) = I_{{y_{1} ⩾ x_{1}, y_{2} ⩾ x_{2}}}$ , for all x ≠ y, and p(0+; x1, x2, y1, y2) = δ(y1 − x1)δ(y2 − x2).

11.10.3 Girsanov’s Theorem for Multidimensional BM

We recall Girsanov’s Theorem 11.13 where the measure change was constructed in terms of a Radon–Nikodym process which has the form of an exponential martingale involving a single standard BM in the original measure P. Based on our knowledge of multidimensional BM and Itô integrals on multidimensional BM we can now consider the multidimensional version of Girsanov’s Theorem. The main ingredients are as in Theorem 11.13 where the single BM is now a multidimensional BM. As usual, we fix some filtered probability space $(Ω, ℱ, ℙ, F)$ , where $F = {ℱ_{t}}_{t ⩾ 0}$ is any filtration for standard Brownian motion.

Theorem 11.21

(Girsanov’s Theorem for Multidimensional BM). Fix a time T > 0 and let W(t) = (W1(t), ..., Wd(t)), 0 ⩽ t ⩽ T, be a standard d-dimensional ℙ-BM with respect to a filtration $F = {ℱ_{t}}_{0 ⩽ t ⩽ T}$ . Assume the vector process γ(t) = (γ1(t),...,γd(t)), 0 ⩽ t ⩽ T, is adapted to F such that

$E [\exp (\frac{1}{2} \int_{0}^{T} {|| γ (s) ||}^{2} d s)] < \infty . (11.210)$

Define

$ϱ_{t} : = \exp (- \frac{1}{2} \int_{0}^{t} {|| γ (s) ||}^{2} d s + \int_{0}^{t} γ (s) \cdot d W (s)), 0 ⩽ t ⩽ T, (11.211)$

and the probability measure $\hat{ℙ} \equiv {\hat{ℙ}}^{(ϱ)}$ by the Radon–Nikodym derivative $\frac{d \hat{ℙ}}{d ℙ} = {(\frac{d \hat{ℙ}}{d ℙ})}_{T} \equiv ϱ_{T}$ . Then, the process $\hat{W} (t) = ({\hat{W}}_{1} (t), ..., {\hat{W}}_{d} (t))$ , 0 ⩽ t ⩽ T, defined by

$\hat{W} (t) : = W (t) - \int_{0}^{t} γ (s) d s (11.212)$

is a standard d-dimensional $\hat{ℙ}$ -BM w.r.t. filtration $F$ .

We don’t provide the proof of this result here. We leave it as an exercise where one can apply similar steps as in (i)–(iii) in the proof of Theorem 11.13. In the multidimensional case we have d-dimensional BM and Lévy’s characterization in Theorem 11.16 can be applied.

The same remarks as were stated for Theorem 11.13 also apply to Theorem 11.21, where the adapted process is now the vector γ rather than the scalar γ. Of course, this multidimensional version generalizes Theorem 11.13, which obtains in the simplest case with d = 1. The Radon–Nikodym derivative process that defines the change of measure $ℙ \to \hat{ℙ}$ , such that the new d-dimensional BM, $\hat{W}$ , is a $\hat{ℙ} - BM$ , is given by the exponential ℙ-martingale in (11.211). It can be proven that the Novikov condition in (11.210) guarantees that the process ${ϱ_{t}}_{0 ⩽ t ⩽ T}$ is indeed a proper Radon–Nikodym derivative process, i.e., it is a ℙ-martingale with constant unit expectation, $E [ϱ_{t}] = 1$ for all t ∊ [0, T]. By Itô’s formula, the stochastic exponential in (11.211) is equivalent to the stochastic differential (using $ϱ_{0} = 1$ ):

$d ϱ_{t} = ϱ_{t} γ (t) \cdot d W (t) \Rightarrow ϱ_{t} = 1 + \int_{0}^{t} ϱ_{s} γ (s) \cdot d W (s) .$

The Novikov condition assures us that the Itô integral satisfies the integrability condition as in (11.121) and is therefore a ℙ-martingale with zero expectation. By the definition in (11.150) we have $ϱ_{t} \equiv ϱ_{t}^{(γ)} = ε_{t} (γ \cdot W)$ . Dividing the process value at any two times 0 ⩽ s < t ⩽ T gives

$\frac{ϱ_{t}}{ϱ_{s}} \equiv \frac{{(\frac{d \hat{ℙ}}{d ℙ})}_{t}}{{(\frac{d \hat{ℙ}}{d ℙ})}_{s}} = \frac{ε_{t} (γ \cdot W)}{ε_{s} (γ \cdot W)} = \exp (- \frac{1}{2} \int_{s}^{t} {|| γ (u) ||}^{2} d u + \int_{s}^{t} γ (u) \cdot d W (u)) .$

In general, γ is an adapted vector process so that $ϱ_{t}$ is a functional of d-dimensional BM from time 0 to t. By choosing a constant vector, γ(t) = γ, we have the simple and important special case where the Radon–Nikodym derivative process is also a GBM expressed equivalently as

$ϱ_{t} \equiv ϱ_{t}^{(γ)} = e^{- \frac{1}{2} {|| γ ||}^{2} t + γ \cdot W (t)} = e^{\frac{1}{2} {|| γ ||}^{2} t + γ \cdot \hat{W} (t)} (11.213)$

where $\hat{W} (t) \equiv {\hat{W}}^{(γ)} (t) : = W (t) - γ t, 0 ⩽ t ⩽ T$ , is a $\hat{ℙ} - BM$ . We recall that the random variable $γ \cdot W (t) ~ N o r m (0, {|| γ ||}^{2} t)$ . Hence, from its m.g.f. (under measure ℙ) we have $E [e^{γ \cdot W (t)}] = e^{\frac{1}{2} {|| γ ||}^{2} t}$ , i.e., $E [ϱ_{t}] = 1$ for all t ⩾ 0. This also follows trivially from the fact that the process is easily verified to be a ℙ-martingale and therefore must have constant expectation under measure ℙ, i.e., $E [ϱ_{t}] = E [ϱ_{0}] = E [1] = 1$ .

The multidimensional version of Girsanov’s Theorem 11.21 has many far-reaching applications. We now give one of its applications that is particularly important for financial derivative pricing theory. Namely, we shall use Girsanov’s Theorem to find a risk-neutral measure such that all stock prices S1 (t),...,Sn(t) in the multidimensional GBM model have a common drift rate equal to a constant r. Here, r is again the fixed (continuously compounded) interest rate. This problem is equivalent to applying Girsanov’s Theorem to construct an equivalent martingale measure $\hat{ℙ} \equiv \tilde{ℙ}$ , such that all discounted stock price processes defined by ${{\bar{S}}_{i} (t) : = e^{- r t} S_{i} (t)}_{t ⩾ 0,} i = 1, ..., n$ , are $\tilde{ℙ}$ -martingales. For the case of a single stock (n = 1) driven by one BM (d = 1), we have already solved this problem in Example 11.15.

For each i-th stock, the (log-)drift μi and all components of the (log-)volatility vector σi in (11.191) are constants. It follows that the measure change that we will need to employ uses (11.213), i.e., with constant d-dimensional vector γ =[γ1,...,γd]:

$ϱ_{t} \equiv {(\frac{d \tilde{ℙ}}{d ℙ})}_{t} = e^{- \frac{1}{2} {|| γ ||}^{2} t + γ \cdot W (t)} (11.214)$

with d-dimensional $\tilde{ℙ} - BM$ given by $\tilde{W} (t) \equiv {\tilde{W}}^{(γ)} (t) : = W (t) - γ t$ . In terms of stochastic differentials, $d \tilde{W} (t) = d W (t) - γ d t$ . A quick method of arriving at the change of measure is to consider the SDE satisfied by the stock prices (with respect to the physical ℙ-BM) in (11.191) and write $d W (t) = d \tilde{W} (t) + γ d t$ :

$\begin{array}{l} d S_{i} (t) & = S_{i} (t) [μ_{i} d t + σ_{i} \cdot (d \tilde{W} (t) + γ d t)] \\ = S_{i} (t) [(μ_{i} + σ_{i} \cdot γ) d t + σ_{i} \cdot d \tilde{W} (t)], i = 1, ..., n . \end{array} (11.215)$

Hence, a risk-neutral measure exists if we can find a vector γ such that the log-drift coefficient equals r for every i = 1,..., n. That is, we have

$d S_{i} (t) = S_{i} (t) [r d t + σ_{i} \cdot d \tilde{W} (t)], i = 1, ..., n, (11.216)$

which is equivalent to ${{\bar{S}}_{i} (t)}_{t ⩾ 0,} i = 1, ..., n$ , being $\tilde{ℙ}$ -martingales, if and only if γ solves μi + σi · γ = r, for each i = 1,..., n. This is a linear system of n-equations in d unknowns γ1,...,γd. Using the components of the (log-)volatility vectors, σi = [σi1,...,σid], we then have an n × d linear system

$[\begin{array}{l} σ_{11} & \dots & σ_{1 d} \\ ⋮ & ⋮ \\ σ_{n 1} & \dots & σ_{n d} \end{array}] [\begin{matrix} γ_{1} \\ ⋮ \\ γ_{d} \end{matrix}] = [\begin{matrix} r - μ_{1} \\ ⋮ \\ r - μ_{n} \end{matrix}] . (11.217)$

In compact notation this reads σγT = b, where σ is n × d, γT is d × 1, and b := r1 − μ is n × 1, where μ = [μ1,...,μn]T, 1 = [1,..., 1]T.

Hence, the question of the existence of a risk-neutral measure is answered quite simply by applying standard linear algebra. Generally a solution vector γ exists if and only if b is spanned by the d column vectors of σ. Here we should point out that we are seeking a solution for arbitrary physical drift vector $μ \in ℝ^{n}$ . A solution vector γ exists for any $b \in ℝ^{n}$ , and hence for any $μ \in ℝ^{n}$ , if the d column vectors of σ span $ℝ^{n}$ , i.e., if rank(σ) = n. In the case when rank(σ) = n < d, we have an infinite (continuum) number of solution vectors γ and each corresponds to a (different) risk-neutral measure $\tilde{ℙ} \equiv {\tilde{ℙ}}^{(γ)}$ . This is therefore the case where the risk-neutral measure exists and is not unique. If rank(σ) = n = d, which is the case where the number of stocks equals the number of independent BMs and the n × n matrix σ has an inverse σ−1, then the risk-neutral measure $\tilde{ℙ}$ exists and is uniquely given by γT = σ−1(r1 − μ), i.e., $γ_{j} = {\sum_{i = 1}^{n} (σ^{- 1})}_{j i} (r - μ_{i})$ , j = 1,..., d. Finally, if d < n, then rank(σ) ⩽ d < n (the d column vectors of σ do not span all of $ℝ^{n}$ ) and hence a solution vector γ exists only for μ vectors such that b is in the span of the column vectors of σ. In this case, there does not exist a risk-neutral measure $\tilde{ℙ}$ for arbitrary μ.

11.10.4 Martingale Representation Theorem for Multidimensional BM

In closing this chapter, we simply state (without proof) the multidimensional version of Theorem 11.14. This is of importance when discussing hedging financial derivatives in an economy with multiple assets that are driven by a multidimensional BM. The theorem is quite similar and extends Theorem 11.14 in a rather obvious fashion whereby the Itô integrals, and hence the Itô processes, are defined with respect to a d-dimensional BM where d ⩾ 1. The theorem basically states that a square-integrable $(ℙ, F^{W})$ -martingale is expressible as its initial value plus an Itô integral in the d-dimensional BM and some $F^{W}$ -adapted vector process as integrand. In the result below, we combine Theorem 11.14 and Proposition 11.15 into one Theorem for the more general case of multidimensional BM.

Theorem 11.22

(Multidimensional Brownian Martingale Representation Theorem). Let W(t) = (W1 (t),...,Wd(t)) be a d-dimensional standard BM where $F^{W}$ denotes its natural filtration. Assume {M(t)}0⩽t ⩽ T is a square-integrable $(ℙ, F^{W})$ -martingale. Then, there exists an $F^{W}$ -adapted d-dimensional process (a.s.) θ(t)=(θ1 (t),...,θd(t)), 0 ⩽ t ⩽ T, such that

$M (t) = M (0) + \int_{0}^{t} θ (u) \cdot d W (u) \equiv M (0) + \sum_{j = 1}^{d} \int_{0}^{t} θ_{j} (u) \cdot d W_{j} (u), (11.218)$

for all t ∊ [0, T]. Moreover, let $\hat{ℙ}$ be a measure constructed using Girsanov’s Theorem 11.21 with the assumption that the d-dimensional process {γ(t)}0⩽t⩽T is $F^{W}$ -adapted. If the process ${\hat{M} (t)}_{0 ⩽ t ⩽ T}$ is a square-integrable $(\hat{ℙ}, F^{W})$ -martingale, then there exists an adapted d-dimensiona process ${\hat{θ} (t) = ({\hat{θ}}_{1} (t), ..., {\hat{θ}}_{d} (t))}_{0 ⩽ t ⩽ T}$ , such that (a.s.)

$\hat{M} (t) = \hat{M} (0) + \int_{0}^{t} \hat{θ} (u) \cdot d \hat{W} (u) . (11.219)$

Again we stress that the martingale having this representation is (a.s.) continuous in time (i.e., the process has no jumps) since it is an Itô process.

Exercises

Exercise 11.1. In each case show whether or not the stochastic integral is well-defined. Note: you do not need to compute the values of the integrals.
1. (a) $\int_{0}^{1} (W^{2} (t) + W (t) + 1) d W (t);$
2. (b) $\int_{0}^{1} {| W (t) |}^{- 1 / 2} d W (t);$
3. (c) $\int_{0}^{1} W (\frac{1}{t}) d W (t);$
4. (d) $\int_{0}^{1} {| t W (t) |}^{- 1 / 4} d W (t);$
5. (e) $\int_{0}^{1} W^{a} (t) d W (t) .$
For part (e) find all values of the parameter $a \in ℝ$ for which the integral is well-defined. [Hint for singular integrands: Recall that if t = 0 is a singular point of a function f(t), where $f (t) = O ({| t |}^{p})$ with p < 0, as t → 0, then $\int_{0}^{1} f (t) d t < \infty iff p > - 1$ .]
Exercise 11.2. In each case show whether or not the stochastic integral is well-defined. Note: you do not need to compute the values of the integrals.
1. (a) $\int_{0}^{1} W (2 t) d W (t);$
2. (b) $\int_{0}^{1} W (\frac{t}{2}) d W (t);$
3. (c) $\int_{0}^{\infty} W (\frac{1}{t}) d W (t);$
4. (d) $\int_{0}^{1} {| W (t) |}^{1 / 2} d W (t) .$
Exercise 11.3. For any α ∊ (0, 1), define the stochastic integral

$\int_{0}^{T} W (t) ⋄ d_{α} W (t) : =_{δ (P_{n}) \to 0}^{\lim} \sum_{i = 1}^{n} [α W (t_{i - 1}) + (1 - α) W (t_{i})] (W (t_{i}) - W (t_{i - 1})),$

where Pn = {0 = t0 < t1 < ... < tn = T} is a finite partition of [0, T], and δ(Pn) is the mesh size. Write $\int_{0}^{T} W (t) ⋄ d_{α} W (t)$ as a linear combination of the Itô integral $\int_{0}^{T} W (t) d W (t)$ and the Stratonovich integral $\int_{0}^{T} W (t) \circ d W (t)$ .
Exercise 11.4. Evaluate the following repeated (double) stochastic integral:

$\int_{0}^{t} (\int_{0}^{s} d W (u)) d W (s) .$
Exercise 11.5. Show that $\int_{0}^{t} W^{2} (s) d W (s) = \frac{1}{3} W^{3} (t) - \int_{0}^{t} W (s) d s$ .

[Hint: You may use an appropriate Itô formula.]
Exercise 11.6. Use the Itô isometry property to calculate the variances of the Itô integrals
1. (a) $\int_{0}^{t} {| W (s) |}^{1 / 2} d W (s)$ ,
2. (b) $\int_{0}^{t} {| W (s) + s |}^{2} d W (s)$ ,
3. (c) $\int_{0}^{t} {(W (s) + s)}^{3 / 2} d W (s)$ .
Explain why the above integrals are well-defined.
Exercise 11.7. Calculate

$_{n \to \infty, δ (P_{n}) \to 0}^{\lim} \sum_{i = 1}^{n} (2 W (t_{i - 1}) + W (t_{i})) (W (t_{i}) - W (t_{i - 1})),$

where Pn = {0 = t0 < t1 < t2 < ... < tn = T} is a partition of [0, T] with mesh size $δ (P_{n}) =_{i = 1, ..., n}^{\max} | t_{i} - t_{i - 1} |$ .
Exercise 11.8. Using Itô’s formula, show that the process defined by

$X (t) : = W^{4} (t) - 6 \int_{0}^{t} W^{2} (u) d u,$

t ⩾ 0, is a martingale w.r.t. a filtration for Brownian motion.
Exercise 11.9. Use Itô’s formula to show that for any integer k ⩾ 2,

$E [W^{k} (t)] = \frac{k (k - 1)}{2} \int_{0}^{t} E [W^{k - 2} (s)] d s,$

and use this to derive a formula for all the moments of the standard normal distribution.
Exercise 11.10. Show that M(t) := et/2 sin(W (t)), t ⩾ 0, is a martingale w.r.t. a filtration for Brownian motion.
Exercise 11.11. Use Itô’s formula to show that for any nonrandom, continuously differentiable function f(t), the following formula of integration by parts is true:

$\int_{0}^{t} f (s) d W (s) = f (t) W (t) - \int_{0}^{t} f^{'} (s) W (s) d s .$
Exercise 11.12. Use Itô’s formula to find the stochastic differentials for the following functions of Brownian motion:
1. (a) eW(t);
2. (b) Wk(t), k ⩾ 0;
3. (c) cos(tW(t));
4. (d) $e^{W^{2} (t)}$ ;
5. (e) arctan(t + W(t)).
Exercise 11.13. Using Itô’s formula, show that Y (t) := W3(t) − 3tW(t),t ⩾ 0, is a martingale w.r.t. a filtration for Brownian motion.
Exercise 11.14. Define Z(t) = exp(σW(t)). Use Itô’s formula to write down a stochastic differential for Z(t). Then, by taking the mathematical expectation, find an ordinary (deterministic) first order linear differential equation for m(t) := E[Z(t)] and solve it to show that

$E [\exp (σ W (t))] = \exp (\frac{σ^{2}}{2} t) .$
Exercise 11.15. Let $N (x)$ be the standard normal CDF and consider the process

$X (t) : = N (\frac{W (t)}{\sqrt{T - t}}), 0 ⩽ t < T .$

Express this process as an Itô process, i.e. you need to determine the explicit expressions for the adapted drift μ(t) and diffusion σ(t) and provide the explicit form for X(t) as:

$X (t) = X (0) + \int_{0}^{t} μ (s) d s + \int_{0}^{t} σ (s) d W (s) .$

Show that the process is a martingale w.r.t. any filtration for BM. Find the limiting value $X (T -) =_{t ↗ T}^{\lim} X (t)$ . What is the state space for the X process.
Exercise 11.16. Suppose that the processes X := {X(t)}t⩾0 and Y := {Y(t)}t⩾0 have the log-normal dynamics:

$\begin{array}{l} d X (t) & = X (t) (μ_{X} d t + σ_{X} d W (t)) \\ d Y (t) & = Y (t) (μ_{Y} d t + σ_{Y} d W (t)) . \end{array}$

Show that the process $Z (t) : = \frac{Y (t)}{X (t)}$ is also log-normal, with dynamics

$d Z (t) = Z (t) (μ_{Z} d t + σ_{Z} d W (t)),$

and determine the coefficients μZ and σZ in terms of those of X and Y. Solve the same problem now assuming that X and Y are governed by two correlated Brownian motions WX and WY, respectively, where Corr(WX(t),WY (t)) = ρt, i.e., dWX (t)dWY (t) = ρ dt, for a given correlation coefficient −1 ⩽ ρ ⩽ 1.
Exercise 11.17. Let a time-homogeneous diffusion X(t) have a stochastic differential with drift coefficient function μ(x) = 3x − 1 and diffusion coefficient function $σ (x) = 2 \sqrt{x}$ . Assuming that X(t) ⩾ 0, find the stochastic differential for the process $Y (t) : = \sqrt{X (t)}$ . Find the generator for Y (t).
Exercise 11.18. Let X(t) = tW2(t) and Y(t)= eW(t). Find the stochastic differential of $Z (t) : = \frac{X (t)}{Y (t)}$ . Compute the mean and variance of Z(t).
Exercise 11.19. Let X(t) be a time-homogeneous diffusion process solving an SDE with drift and diffusion coefficient functions μ(x)= cx and σ(x) = σ, respectively, where c, σ are constants and with initial condition $X (0) = x \in ℝ$ . Consider the process defined by $Y (t) : = X^{2} (t) - 2 c \int_{0}^{t} X^{2} (s) d s - σ^{2} t, t ⩾ 0$ .
1. (a) Represent the Y process as an Itô process and show that is a martingale w.r.t. any filtration for Brownian motion.
2. (b) Compute the mean and variance of Y (t) for all t ⩾ 0.
Exercise 11.20. Consider the linear SDE in (11.25) in the case where δ(t) ≡ 0 and where α(t), β(t), γ(t) are continuous nonrandom functions of time t ⩾ 0. Assume a constant initial condition X(0) = x0. Show that the process {X(t)}t⩾0 is a Gaussian process and compute its mean and covariance functions.
Exercise 11.21. Use the Itô formula to write down stochastic differentials for the following processes:
1. (a) $Y (t) = \exp (σ W (t) - \frac{1}{2} σ^{2} t)$ ,
2. (b) Z(t) = f(t)W (t) where f is a continuously differentiable function.
Exercise 11.22. A time-homogeneous diffusion process X has a stochastic differential with respective drift and diffusion coefficient functions μ(x) = 0 and σ(x) = x(1 − x). Assuming 0 < X(t) < 1, show that the process $Y (t) : = \ln (\frac{X (t)}{1 - X (t)})$ has a constant diffusion coefficient.
Exercise 11.23. Let $X (t) : = (1 - t) \int_{0}^{t} \frac{d W (s)}{1 - s}$ , where 0 ⩽ t < 1. Provide the stochastic differential equation for X(t) in the form dX(t) = a(t, X(t)) dt + b(t, X(t)) dW (t). Check your answer by solving the SDE obtained subject to the initial condition X(0) = 0.
Exercise 11.24. Solve the following linear SDEs:
1. (a) dX(t) = W (t)X(t) dt + W (t)X(t)dW (t), X(0) = 1;
2. (b) dX(t) = α(θ − X(t)) dt + σX(t)dW (t), X(0) = x ∊ ℝ;
3. (c) dX(t) = a(t)X(t) dt + σX(t)dW (t), X(0) = x ∊ ℝ.
4. (d) dX(t) = X(t) dt + X(t) dW (t), X(0) = 1.
Assume α, θ, σ are positive constants.
Exercise 11.25. Let g(y) be a given function of y, and suppose that y = f(x) is a solution of the ODE dy = g(y) dx, that is, f′(x) = g(f(x)). Show that X(t) = f(W(t)) is a solution of the SDE

$d X (t) = \frac{1}{2} g (X (t)) g^{'} (X (t)) d t + g (X (t)) d W (t) .$
Exercise 11.26. Use Exercise 11.25 to solve the following nonlinear SDEs, subject to X(0) = x0 ∊ ℝ.
1. (a) $d X (t) = \frac{σ^{2}}{4} d t + σ \sqrt{X (t)} d W (t)$ ;
2. (b) dX(t) = X3(t) dt + X2(t)dW (t);
3. (c) $d X (t) = \frac{1}{2} e^{2 X (t)} d t + e^{X (t)} d W (t)$ .
In each case, find the time interval for which the solution exists. [Hint: In each case, f(x) of Exercise 11.25 is determined by solving a first order separable ODE. For parts (b) and (c) the solution exists up to an “explosion time” when the solution becomes singular.]
Exercise 11.27. Show that for any u ∊ ℝ, the function f(t, x) = exp(ux − u2t/2) solves the backward PDE for Brownian motion. Take the first, second, and third derivatives of exp(ux − u2t/2) w.r.t. u, and set u = 0, to show that functions x, x2 − t, x3 − 3tx also solve the backward equation for Brownian motion. Deduce that W2 (t) − t and W3(t) − 3tW(t) are martingales.
Exercise 11.28. Consider the drifted BM, X(t) = x0 + μt + σW(t), and define the process Y (t) := f(X(t)), t ⩾ 0. By applying an appropriate Itô formula, obtain a general expression for a twice continuously differentiable function f(x), x ∊ ℝ, such that {Y (t)}t⩾0 is a martingale w.r.t. any filtration for BM.
Exercise 11.29. Derive a system of diffusion-type SDEs for the coupled processes X(t) = cos(W(t)) and Y(t) = sin(W(t)).
Exercise 11.30. Consider the process defined by X(t) = sinh (C + t + W (t)), t ⩾ 0, where C = sinh−1 x0 with initial condition X(0) = x0. This process is a diffusion on ℝ and it satisfies an SDE of the form

$d X (t) = μ (X (t)) d t + σ (X (t)) d W (t) .$
1. (i) Find the coefficient functions μ(x) and σ(x).
2. (ii) Provide the backward Kolmogorov PDE and the terminal condition for the transition PDF p(t, T; x, y).
3. (iii) Derive analytical expressions for the transition CDF and PDF of the process X.
Exercise 11.31. Give the probabilistic representation of the solution f(t, x) of the PDE

$\frac{\partial f}{\partial t} + \frac{x^{2}}{2} \frac{\partial^{2} f}{\partial x^{2}} = 0, 0 ⩽ t ⩽ T, f (T, x) = x^{2} .$

Solve this PDE using the solution of the respective SDE.
Exercise 11.32. Consider the boundary value problem for the heat equation:

$\frac{\partial V}{\partial t} + \frac{1}{2} \frac{\partial^{2} V}{\partial x^{2}} = 0, V (1, x) = f (x)$

where f, the boundary value for time t = 1, is given, and where we are looking for a solution V = V(t, x) defined for 0 ⩽ t ⩽ 1 and x ∊ ℝ. Show that the solution is

$V (t, x) = \int_{- \infty}^{\infty} f (y) e^{- \frac{{(x - y)}^{2}}{2 (1 - t)}} \frac{d y}{\sqrt{2 π (1 - t)}} .$

Can you think of a function f for which the solution formula would not make sense?
Exercise 11.33. Consider the boundary value problem for the heat equation with a drift term:

$\frac{\partial V}{\partial t} + \frac{1}{2} \frac{\partial^{2} V}{\partial x^{2}} + a \frac{\partial V}{\partial x} = 0, V (1, x) = f (x)$

where f, the boundary value for time t = 1, is given, and a is a real constant. Derive an explicit (integral) formula for the solution V = V (t, x).
Exercise 11.34. Let f(t, x) satisfy the PDE

$\frac{\partial f}{\partial t} + \frac{1}{2} σ^{2} x^{2} \frac{\partial^{2} f}{\partial x^{2}} + μ x \frac{\partial f}{\partial x} = 0, 0 ⩽ t ⩽ T, x \in ℝ_{+},$

for fixed T > 0, with real constants σ > 0, μ. Solve for f(t, x) subject to the terminal condition $f (T, x) = I_{{K_{1} < x < K_{2}}}$ , where K2 > K1 > 0 are constants.
Exercise 11.35. To compact notation, we suppress all other variables except x′ and denote f(x′) ≡ p(t, t′; x, x′) and g(x′) ≡ p(t′, T; x′, y). Using the definition of $G_{t^{'}, x^{'}}$ , the left-hand integral in (11.69) becomes

$\int_{x} f (x^{'}) G_{t^{'}, x^{'} g} (x^{'}) d x^{'} = \frac{1}{2} \int_{x} f (x^{'}) σ^{2} (t^{'}, x^{'}) \frac{\partial^{2} g}{\partial {x^{'}}^{2}} d x^{'} + \int_{x} f (x^{'}) μ (t^{'}, x^{'}) \frac{\partial g}{\partial x^{'}} d x^{'} .$

Now apply integration by parts to both integrals (in the first integral note that $\frac{\partial^{2} g}{\partial {x^{'}}^{2}} = \frac{\partial}{\partial x^{'}} (\frac{\partial g}{\partial x^{'}})$ ). State appropriate assumptions that allow you to set the boundary terms to zero. Then, apply integration by parts again on the remaining integral containing σ2(t′, x′). Again, state appropriate assumptions that allow you to set the boundary terms to zero. In the end, obtain

$\int_{x} f (x^{'}) G_{t^{'}, x^{'}} g (x^{'}) d x^{'} = \int_{x} g (x^{'}) {\tilde{G}}_{t^{'}, x^{'}} f (x^{'}) d x^{'}$

where ${\tilde{G}}_{t^{'}, x^{'}} f = \frac{1}{2} \frac{\partial^{2}}{\partial {x^{'}}^{2}} (σ^{2} (t^{'}, x^{'}) f) - \frac{\partial}{\partial x^{'}} (μ (t^{'}, x^{'}) f)$ .
Exercise 11.36. Assume that a stock price process {S(t)}t⩾0 satisfies the SDE

$d S (t) = r S (t) + σ S (t) d \tilde{W} (t),$

with constants r,σ > 0, and where ${\tilde{W} (t)}_{t ⩾ 0}$ is a standard $\tilde{ℙ} - BM$ . By using Girsanov’s Theorem, find the explicit expression for the Radon–Nikodym derivative process

$ϱ_{t} : = {(\frac{d \hat{ℙ}}{d \tilde{ℙ}})}_{t}$

such that the process defined by $\hat{S} (t) : = \frac{e^{r t}}{S (t)}, t ⩾ 0$ , is a $\hat{ℙ}$ -martingale. Give the SDE satisfied by the stock price S(t) w.r.t. the $\hat{ℙ} - BM$ .
Exercise 11.37. Consider a stock price process {S(t)}t⩾0 that obeys the SDE

$d S (t) = μ S (t) d t + σ {(S (t))}^{1 + β} d W (t), S (0) = S > 0,$

with constant parameters σ ≠ 0, β, μ.
1. (a) Assume the process $\int_{0}^{t} {(S (u))}^{1 + β} d W (u)$ is a ℙ-martingale w.r.t. the filtration generated by the standard ℙ-Brownian motion {W(t)}t⩾0. Determine an exact expression for the mean of the process E[S(t)]. [Hint: You may write S(t) in terms of an exponential martingale by considering the process defined by X(t) := ln S(t).]
2. (b) Fix T > 0 and assume the existence of a Radon–Nikodym derivative process $ϱ_{t} : = {(\frac{d \tilde{ℙ}}{d ℙ})}_{t}$ , 0 ⩽ t ⩽ T, such that {e−rtS(t)}0⩽t⩽T is a $\tilde{ℙ}$ -martingale. Give the form of $ϱ_{t}$ as an exponential ℙ-martingale.
Exercise 11.38. Consider a one-dimensional general diffusion process {X(t)}t⩾0 having a transition PDF p(s, t; x, y), s < t, w.r.t. a given probability measure ℙ, for all x, y in the state space of the process. Assume a change of measure $ℙ \to \hat{ℙ}$ is defined by a Radon–Nikodym derivative process

$ϱ_{t} : = {(\frac{d \hat{ℙ}}{d ℙ})}_{t} = h (t, X (t))$

for all t ⩾ 0 and where h(t, x) is some given Borel function of t, x. Let $\hat{p} (s, t; x, y)$ be the transition PDF w.r.t. the measure $\hat{ℙ}$ . Show that the two transition PDFs are related by

$\hat{p} (s, t; x, y) = \frac{h (t, y)}{h (s, x)} p (s, t; x, y) .$

[Hint: Consider the definition of the transition CDF (w.r.t. the measure $\hat{ℙ}$ ):

$\hat{p} (s, t; x, y) = \hat{ℙ} (X (t) ⩽ y | X (s) = x) = \hat{E} [I_{{X (t) ⩽ y}} | X (s) = x]$

and make use of the Markov property and the change of measure for computing expectations.]

1The square integrability condition is also denoted by writing X ∈ L2 ([0, T], Ω). Throughout we assume that the integrand process X is measurable. That is, for every Borel set $B \in ℬ (ℝ)$ , the sets ${(t, ω) : X (t, ω) \in B} \in ℬ (ℝ_{+}) \times ℱ$ . By Fubini's Theorem, assuming E[X2 (t)] < ∞ for all t ⩾ 0, then this expectation is a Lebesgue-measurable function of time t and we may interchange the expectation integral with the time integral: $E [\int_{0}^{T} X^{2} (t) d t] = \int_{0}^{T} E [X^{2} (t)] d t$ .

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 11 Introduction to Continuous-Time Stochastic Calculus

Create new playlist

Sign In

Sign Up

Introduction to Continuous-Time Stochastic Calculus

11.1 The Riemann Integral of Brownian Motion

11.1.1 The Riemann Integral

11.1.2 The Integral of a Brownian Path

11.2 The Riemann–Stieltjes Integral of Brownian Motion

11.2.1 The Riemann–Stieltjes Integral

11.2.2 Integrals w.r.t. Brownian Motion

11.3 The Itô Integral and Its Basic Properties

11.3.1 The Itô Integral for Simple Processes

11.3.2 Properties of the Itô Integral

11.4 Itô Processes and Their Properties

11.4.1 Gaussian Processes Generated by Itô Integrals

11.4.2 Itô Processes

11.4.3 Quadratic (Co-) Variation

11.5 Itô's Formula for Functions of BM and Itô Processes

11.5.1 Itô's Formula for Functions of BM

11.5.2 Itô's Formula for Itô Processes

11.6 Stochastic Differential Equations

11.6.1 Solutions to Linear SDEs

11.6.2 Existence and Uniqueness of a Strong Solution of an SDE

11.7 The Markov Property, Feynman–Kac Formulae, and Transition CDFs and PDFs

11.7.1 Forward Kolmogorov PDE

11.7.2 Transition CDF/PDF for Time-Homogeneous Diffusions

11.8 Radon–Nikodym Derivative Process and Girsanov's Theorem

11.8.1 Some Applications of Girsanov’s Theorem

11.9 Brownian Martingale Representation Theorem

11.10 Stochastic Calculus for Multidimensional BM

11.10.1 The Itô Integral and Itô's Formula for Multiple Processes on Multidimensional BM

11.10.2 Multidimensional SDEs, Feynman–Kac Formulae, and Transition CDFs and PDFs

11.10.3 Girsanov’s Theorem for Multidimensional BM

11.10.4 Martingale Representation Theorem for Multidimensional BM

Exercises

Table of Contents for
Chapter 11 Introduction to Continuous-Time Stochastic Calculus