Chapter 7
In this and the following chapter, stochastic differential equations will be formally introduced. This exposition of stochastic calculus does not pretend to be complete. The presentation will be guided by intuition, and important topics and results from a practitioner's point of view will be covered at a reasonable mathematical level. General measure theory and other technicalities of a (purely) mathematical interest will be kept at a minimum, but the reader is referred to Arnold [1974], Karatzas and Shreve [1996], Ikeda and Watanabe [1989] and Øksendal [2010] for a detailed account. It should be emphasized that the material in this chapter is not only of interest in mathematical finance. To stress the broad applicability, this chapter does not contain new financial concepts or ideas. A detailed account of these is deferred to the following chapters.
As the successful application of stochastic differential equations in mathematical modelling requires quite a substantial mathematical and statistical setup, we shall now argue why we should bother to consider them.
Application of the nonparametric methods (introduced in Chapter 6) on financial time series revealed some characteristics (e.g., heteroscedasticity) which linear time series models cannot explain, because their conditional mean functions are linear and their conditional variance functions are constants. This is clearly at odds with the small scale empirical studies reported in these notes (and the adjacent exercises) and the large scale studies reported in the open literature. A large number of nonlinear time series models were introduced (in Chapter 5) to model heteroscedasticity. In particular, the GARCH-type models and their numerous extensions performed reasonably well. However, there are a number of important reasons for using differential equations augmented by some kind of randomness or stochasticity.
Stochastic differential equations entail the best of two worlds, i.e., a combination of physical knowledge (laws of motion, preservation of energy etc.) that may be used to develop a deterministic model of the system and statistical methods for parameter estimation and model validation. This allows the modeller to model causality as well as correlation, where causality may be considered as superior to the correlation functions used in traditional time series analysis. There are a number of disadvantages associated with the use of SDEs; one major disadvantage is the advanced probability theory involved. From an empirical point of view, it is by no means trivial to estimate parameters in SDEs, but we shall get back to that in later chapters.
The remainder of this chapter is organized as follows: Section 7.1 briefly considers adding stochasticity to dynamical systems. Section 7.2 informally introduces stochastic calculus while 7.3 considers stochastic integrals. Section 7.4 introduces concepts from stochastic processes and probability theory, and formally introduces Itō calculus. Finally, Section 7.5 provides a brief overview of jump processes and some convenient related mathematical tools.
Assume that we wish to model a general physical, chemical or technical system. Mathematical modelling of such systems often leads to the formulation of a system of coupled (nonlinear) differential equations, which may, in general, be written on the form
$$\frac{dX(t)}{dt} = \dot{X}(t) = f(t, X(t)), \tag{7.1}$$
where f(t, X(t)) describes the time-directed evolution of the so-called state variables X(t) ∈ ℝ n. The state variables describe the state of the system at time t in the state space.
The derivation of these equations is often based on a number of conceptual, mathematical and numerical approximations, and the validity of these is difficult to evaluate per se.
By adding a stochastic term to (7.1) to account for these approximations, random differential equations are obtained, as illustrated in the following examples.
Example 7.1 (Money market account).
Consider the simple money market account introduced in Definition 2.2 on page 27, i.e.,
$$dB(t) = r(t)B(t)\,dt, \tag{7.2}$$
$$B(0) = 1, \tag{7.3}$$
where B(t) is the value of the money account at time t, and r(t) denotes the relevant (there are many different ones!) interest rate.
It is very likely that the interest rate evolves randomly over time, i.e., we have
$$r(t) = \tilde{r}(t) + \sigma\,\text{``noise''}(t), \tag{7.4}$$
where ˜r(t) is assumed to be deterministic. If we insert this in (7.2), we get
$$dB(t) = \left(\tilde{r}(t) + \sigma\,\text{``noise''}(t)\right)B(t)\,dt, \quad B(0) = 1, \tag{7.5}$$
where σ denotes the standard deviation of the noise. The question is now: how do we formalize the concept of "noise" such that (7.5) makes sense, and how do we solve the equation?
Example 7.2 (Stock prices).
We have previously argued that the volatility of stock prices, foreign exchange rates and interest rates depends on the current level, i.e.,
$$dS(t) = \alpha S(t)\,dt + \text{``noise''}(t)\,S(t)\,dt, \quad S(0) = s, \tag{7.6}$$
which is essentially similar to (7.5).
Example 7.3 (Simple Black-Scholes).
Consider a simple financial market with two assets:
We propose the model
$$dS(t) = \alpha S(t)\,dt + \text{``noise''}(t)\,S(t)\,dt, \quad S(0) = s, \tag{7.7}$$
$$dB(t) = r(t)B(t)\,dt, \quad B(0) = 1. \tag{7.8}$$
We get the celebrated Black-Scholes model, when we choose the so-called Brownian motion for the noise process in (7.7). This model will be described in detail later.
The discussion above raises a number of questions about the mathematical and statistical nature of the added stochastic term. This chapter is devoted to answering these questions.
The point of departure in our search for a formal definition of the noise terms in the previous examples will be the random difference equation (7.9) with ΔW (t) = W(t + Δt) − W(t).
$$X(t + \Delta t) - X(t) = \mu(t, X(t))\Delta t + \sigma(t, X(t))\Delta W(t), \tag{7.9}$$
where ΔW(t) is a normally distributed random variable with zero mean and a variance that is proportional to Δt. Furthermore, ΔW(t) is assumed to be independent of all prior values of the process W(s), s ≤ t, and μ(·, ·) and σ(·, ·) are a priori known functions.
Remark 7.1 (Other driving processes).
The driving noise process W(t) in the random difference equation (7.9) need not be a normally distributed random variable. It could easily be, say, a Poisson process or a compound Poisson process, which could account for completely unpredictable phenomena, such as attacks on some currency in the foreign exchange markets or the effects of earthquakes. We will present a brief introduction to jump processes in Section 7.5.
In order to obtain a more mathematical description of (7.9), a more formal definition of the noise process W(t) is required. In particular, we need a process that generates mutually independent and identically distributed normal random variables with zero mean and a variance that is proportional to Δt, and whose definition also makes sense when we consider the limiting behaviour of (7.9) as Δt tends to 0.
One possibility is to consider a Brownian motion, named after the Scottish botanist Robert Brown, who used the process to describe the irregular movements of pollen suspended in water. This random movement, usually attributed to the buffeting of the pollen by water molecules, results in a diffusion of the pollen in the water. Brownian motion is thus a physical example of a random and continuous stochastic process.
A standard Wiener process is an abstract mathematical description of the physical process of Brownian motion. The mathematical properties defining a Wiener process, {W(t), t ≥ 0}, are given in
Definition 7.1 (The Wiener process).
A stochastic process [W (t); t ≥ 0] is said to be a Wiener process if it satisfies the following conditions:
3. The increments W(t) − W(s) for any 0 ≤ s < t are normally distributed with mean and variance, respectively,
$$E[W(t) - W(s)] = 0, \tag{7.10}$$
$$\mathrm{Var}[W(t) - W(s)] = t - s, \tag{7.11}$$
i.e., W(t) − W(s) ∈ N(0, t − s).
It follows from (7.10) that the mean of the increments is zero for any time interval, whereas (7.11) shows that the variance grows unboundedly as the length of the time interval t − s is increased.
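The moment conditions (7.10)-(7.11) are easy to check by simulation. The following sketch (using NumPy; the interval [s, t] and sample size are arbitrary choices) estimates the mean and variance of an increment W(t) − W(s):

```python
import numpy as np

rng = np.random.default_rng(0)

s, t = 0.5, 2.0            # an arbitrary time interval [s, t]
n_paths = 200_000

# By Definition 7.1, W(t) - W(s) ~ N(0, t - s): sample the increment directly.
increments = rng.normal(0.0, np.sqrt(t - s), size=n_paths)

print(increments.mean())   # close to 0, cf. (7.10)
print(increments.var())    # close to t - s = 1.5, cf. (7.11)
```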
Using this definition of the Wiener process, we can write (7.9) as
$$X(t + \Delta t) - X(t) = \mu(t, X(t))\Delta t + \sigma(t, X(t))\Delta W(t), \tag{7.12}$$
where
$$\Delta W(t) = W(t + \Delta t) - W(t). \tag{7.13}$$
Let us now try to formalize (7.9) slightly by dividing through by Δt and then letting Δt tend to 0. Formally we should obtain
$$\dot{X}(t) = \mu(t, X(t)) + \sigma(t, X(t))V(t), \quad X(0) = x, \tag{7.14}$$
where we have added an initial value x and introduced V(t) as the formal time derivative of the Wiener process.
Assuming that V(t) is a well defined process, it should now be possible to solve (7.12) for every realization or trajectory of V(t). It can be shown that the process V(t) is unfortunately not well defined as the Wiener process is nowhere differentiable, although it is continuous. For illustration consider the limit
$$\lim_{h \to 0} \frac{E[(W(t+h))^2] - E[(W(t))^2]}{h} = \lim_{h \to 0} \frac{(t+h) - t}{h} = 1.$$
Thus, in a mean square sense, the increments of the Wiener process do not vanish fast enough for a derivative to exist, and the derivative process V(t) = Ẇ(t) as defined above is not well defined.
The Wiener process is a Markov process as well as a martingale, as we shall see later. The sample paths (realizations) of the process are continuous with probability one, but they are nowhere differentiable with probability one due to the (independent) increments (see e.g. Øksendal [2010] for a rigorous proof).
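The non-differentiability can also be illustrated numerically: the difference quotient (W(t + h) − W(t))/h has variance h/h² = 1/h, which diverges as h → 0. A small sketch (the values of h and the sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Var[(W(t+h) - W(t)) / h] = Var[N(0, h)] / h^2 = 1/h, so the difference
# quotients do not settle down as h shrinks -- there is no derivative.
for h in (0.1, 0.01, 0.001):
    quotient = rng.normal(0.0, np.sqrt(h), size=n) / h
    print(h, quotient.var())   # approximately 1/h
```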
Another approach is to let Δt tend to zero in (7.12) without dividing through by Δt. Formally we get
$$dX(t) = \mu(t, X(t))\,dt + \sigma(t, X(t))\,dW(t), \quad X(0) = x, \tag{7.15}$$
and it is natural to interpret (7.15) as a shorthand notation for the following integral equation
$$X(t) = x + \int_0^t \mu(s, X(s))\,ds + \int_0^t \sigma(s, X(s))\,dW(s). \tag{7.16}$$
The ds integral may be interpreted as an ordinary Riemann integral, whereas the natural interpretation of the dW(s) integral is as a Riemann-Stieltjes integral for every trajectory of W. Unfortunately this interpretation is not reasonable, as it can be shown that the process W(t) is of unbounded variation, i.e. the dW(s) integral in (7.16) is divergent.
Strictly speaking, the notation in (7.15) does not make any sense as it describes the infinitesimal evolution of X(t), which is driven by a Wiener process with unbounded variation. We shall, however, use the notation (7.15) for convenience repeatedly in the following, but it should be remembered that it is only shorthand for (7.16).
The remaining questions are now
Although the Wiener process has some simple probabilistic properties it is by no means simple to define stochastic integration with respect to a Wiener process, because the trajectory of a Wiener process is very odd. Let us list some of its peculiar properties
Nevertheless, we intend to introduce the stochastic integral
$$I(t, \omega) = \int_0^t g(s, \omega)\,dW(s), \tag{7.17}$$
where g(t, ω) is some suitably smooth (possibly random) function, in the following scheme, which parallels the definition of the Riemann integral:
Define for each trajectory ω an approximate integral In (ω) by
$$I_n(t, \omega) = \sum_{k=0}^{n-1} g(\tau_k, \omega)\left[W(t_{k+1}, \omega) - W(t_k, \omega)\right], \tag{7.18}$$
where τk is some arbitrarily chosen time in the interval [tk, tk+1).
The objective of the following discussion is to show that it is important where in the time interval [t_k, t_{k+1}) the function g(τ_k, ω) is evaluated. Recall that all choices of τ_k ∈ [t_k, t_{k+1}) yield the same result in ordinary calculus. We shall now show that this does not hold for stochastic calculus.
As an example, let us consider the case g(t) = W(t), i.e. we wish to compute the stochastic integral
$$I(t) = \int_0^t W(s)\,dW(s), \tag{7.19}$$
where we choose to compute the integral from t₀ = 0 instead of a general t₀, because we may use W(0) = 0 to obtain a shorter formula.
As a preparation it is convenient first to consider the quadratic variation of W(t) on the interval [0, t], i.e. we commence by considering the integral
$$\int_0^t (dW(s))^2. \tag{7.20}$$
Thus we introduce the notation ΔW_k = W(t_{k+1}) − W(t_k) and define the stochastic variable
$$S_n = \sum_{k=0}^{n-1} (\Delta W_k)^2. \tag{7.21}$$
If the Wiener process were differentiable, we would expect S_n to converge to zero as n tends to infinity, because the time interval [0, t] is finite. Let us introduce the equidistant subintervals Δt_k = t_{k+1} − t_k, i.e. Δt = t/n. From Definition 7.1, it immediately follows that E[(ΔW_k)²] = Δt_k and thus
$$E[S_n] = \sum_{k=0}^{n-1} E[(\Delta W_k)^2] = \sum_{k=0}^{n-1} \Delta t_k = t.$$
The variance of Sn is found by direct calculation
$$\mathrm{Var}[S_n] = \sum_{k=0}^{n-1} \mathrm{Var}[(\Delta W_k)^2] = 2\sum_{k=0}^{n-1} (\Delta t_k)^2 = 2n\left(\frac{t}{n}\right)^2 = \frac{2t^2}{n},$$
where it is used that (ΔWk)2 ∈ Δtkχ2 (1). It is well known that a sum of N χ2(1) distributed random variables is a χ2(N) distributed variable with mean N and variance 2N. In other words, we have
$$\mathrm{Var}[S_n] = E[(S_n - E[S_n])^2] = E[(S_n - t)^2] = \frac{2t^2}{n},$$
and thus
$$\lim_{n \to \infty} E[(S_n - t)^2] = 0.$$
In this case, we say that Sn converges towards t in a mean square sense or in the space L2(dℙ × dt). This result is the foundation of the so-called Itō formula, which plays a fundamental role in stochastic calculus as the stochastic counterpart of the well-known chain rule from ordinary calculus.
The main result may be restated in differential form as
$$(dW(t))^2 = dt. \tag{7.22}$$
Formally this metatheorem does not make any sense, but it is worth noticing that it states that the square of a stochastic increment yields a purely deterministic quantity. Do, please, remember this result.
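The convergence S_n → t behind (7.22) is easy to reproduce numerically. A sketch (the horizon t and grid sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

t = 2.0
for n in (10, 100, 10_000):
    dW = rng.normal(0.0, np.sqrt(t / n), size=n)   # increments on an equidistant grid
    S_n = np.sum(dW**2)                            # the quadratic variation sum (7.21)
    print(n, S_n)   # approaches t = 2.0; the spread shrinks as Var[S_n] = 2t^2/n
```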
Let us return to the evaluation of (7.19). We proceed in a similar fashion as above by constructing sums of the form (7.21). We consider two different sums, which evaluate the W(t) part at either the left hand side of the interval [t_k, t_{k+1}), τ_k = t_k, or the right hand side, τ_k = t_{k+1}, i.e.
$$A_n = \sum_{k=0}^{n-1} W(t_k)\left(W(t_{k+1}) - W(t_k)\right) \qquad (\tau_k = t_k), \tag{7.23}$$
$$B_n = \sum_{k=0}^{n-1} W(t_{k+1})\left(W(t_{k+1}) - W(t_k)\right) \qquad (\tau_k = t_{k+1}). \tag{7.24}$$
We immediately get the identities
$$A_n + B_n = W^2(t), \tag{7.25}$$
$$B_n - A_n = \sum_{k=0}^{n-1} (\Delta W_k)^2 = S_n, \tag{7.26}$$
which hold for every n, where S_n is given by (7.21). It immediately follows that B_n − A_n → t in L² as n → ∞. We therefore get the limits A_n → A and B_n → B, where
$$A = \frac{W^2(t)}{2} - \frac{t}{2}, \tag{7.27}$$
$$B = \frac{W^2(t)}{2} + \frac{t}{2}. \tag{7.28}$$
These results show that the value of the stochastic integral (7.19) depends critically on the placement of τ_k in the interval [t_k, t_{k+1}), i.e. the integral depends on where the integrand is evaluated in the interval. Needless to say, this is not the case in ordinary calculus.
By choosing τk = tk, we get the enormously important Itō integral, which yields
$$\int_0^t W(s)\,dW(s) = \frac{W^2(t)}{2} - \frac{t}{2}. \tag{7.29}$$
By choosing τk = tk+1, we get
$$\int_0^t W(s)\,dW(s) = \frac{W^2(t)}{2} + \frac{t}{2}. \tag{7.30}$$
Note that in both cases we get an additional term t/2 (with opposite signs) compared to ordinary calculus. Finally, choosing τ_k = (t_k + t_{k+1})/2 yields the Stratonovich integral
$$\int_0^t W(s)\,dW(s) = \frac{W^2(t)}{2}, \tag{7.31}$$
which is similar to classical calculus. However, there is a consensus that the Itō integral is the only appropriate integral for financial modelling.
In this section we formally introduce the Itō stochastic integral. To this end, some concepts from probability theory will be repeated for convenience.
We assume the existence of a probability space (Ω, ℱ, ℙ), where ℱ is a σ-algebra on the sample space Ω of possible outcomes, (Ω, ℱ) is a measurable space and ℙ: ℱ → [0, 1] is some probability measure.
Definition 7.2 (Filtration).
A filtration on (Ω,ℱ) is a family {ℱ(t)}t≥0 of σ-algebras ℱ(t) ⊂ ℱ such that
ℱ(s)⊆ℱ(t) for 0≤s<t.
Generally speaking, ℱ(s) denotes the set of events (or the information set) up to time s. The natural filtration {ℱ(t)}t≥0 is increasing and right continuous, i.e. at time t, 0 ≤ s < t, more information is available (or, at least, no information is lost), ℱ(s) ⊂ ℱ(t), than at time s, and in the limit complete information is obtained, ℱ(∞) = ℱ. Application of the natural filtration {ℱ(t)}t≥0 implies that information about X(t) in (7.15) must be deduced from observations of X(t) itself, as opposed to, e.g., observations of Y(t) = f(X(t)), where f: ℝ → ℝ is some nontrivial (possibly nonlinear) function.
Example 7.4.
Consider the function Y (t) = |X(t)|. Here, the value of Y(t) is known when knowing X(t), but the converse does not hold.
Remark 7.2.
Consider a stochastic variable X(t) as a function X(t): Ω → ℝ that maps the sample space Ω into ℝ. If {ω ∈ Ω: X(t, ω) ≤ x} ∈ ℱ(t) for each x ∈ ℝ, then X(t) is said to be ℱ(t)-measurable.
Definition 7.3 (Martingale).
A stochastic process {X(t), t ≥ 0} on the probability space (Ω, ℱ, ℙ) is called a martingale with respect to a filtration {ℱ(t)}t≥0 if
Definition 7.4 (Adapted process).
The stochastic process X(t) is adapted to the filtration ℱ(t) if X(t) is an ℱ(t)-measurable random variable for each t ≥ 0.
Remark 7.3 (Adaptedness).
It is instructive to think of measurability and adaptedness in the sense that if a function g(t) is said to be ℱ(t)-measurable, then it essentially means that g(t) is known at time t.
Example 7.5.
A Wiener process W(t) that is adapted to a given filtration ℱ(t) possesses the property that
$$W(t) - W(s) \text{ is independent of } \mathcal{F}(s) \text{ for } 0 \le s < t. \tag{7.32}$$
The process W(t) is then said to be an ℱ(t)-Wiener process.
Please refer to the Appendix for a more detailed exposition of these concepts, or consult the references given in the introduction to this chapter.
Definition 7.5 (The class ℒ2).
Let ℒ²[a, b] denote the class of processes g(s, ω) satisfying the conditions:
The integral is finite:
$$\int_a^b E[(g(s, \omega))^2]\,ds < \infty. \tag{7.33}$$
For some a ≤ b we now define the stochastic integral
$$\int_a^b g(s, \omega)\,dW(s) \tag{7.34}$$
for all g ∈ ℒ²[a, b]. We shall only consider simple functions (to be defined below) and leave the generalization to the interested reader.
Assume that g is simple, i.e. there exist deterministic time instants a = t0 < t1 < ... < tn = b such that
$$g(s, \omega) = g(t_k, \omega) \quad \text{for } s \in [t_k, t_{k+1}),$$
where
$$g(t_k, \omega) \in \mathcal{F}(t_k), \quad k = 0, \ldots, n.$$
In other words g(tk, ω) is ℱ(tk)-measurable, i.e. g(tk) is known at time tk.
For a simple process g we define the stochastic integral by a sum similar to (7.23)
$$\int_a^b g(s, \omega)\,dW(s) = \sum_{k=0}^{n-1} g(t_k, \omega)\left(W(t_{k+1}) - W(t_k)\right). \tag{7.35}$$
It is inherently important that we define the incremental Wiener process in terms of the forward differences W(tk+1) − W(tk).
Theorem 7.1 (Stochastic integration rules).
Let g and h be simple processes that satisfy (7.33) and let α, β be real numbers. The following rules apply
Stochastic integrals are linear operators
$$\int_a^b (\alpha g(s) + \beta h(s))\,dW(s) = \alpha \int_a^b g(s)\,dW(s) + \beta \int_a^b h(s)\,dW(s). \tag{7.36}$$
The unconditional expectation of a stochastic integral when g ∈ ℒ²[a, b] is zero
$$E\left[\int_a^b g(s)\,dW(s)\right] = 0. \tag{7.37}$$
Stochastic integrals are measurable with respect to the filtration generated by the Wiener process, i.e.
$$\int_a^b g(s)\,dW(s) \text{ is } \mathcal{F}(b)\text{-measurable}. \tag{7.38}$$
Stochastic integrals when g ∈ ℒ²[a, b] are martingales
$$E\left[\int_a^b g(s)\,dW(s) \,\middle|\, \mathcal{F}(a)\right] = 0. \tag{7.39}$$
The Itō isometry is a convenient way of computing variances when g ∈ ℒ²[a, b]
$$E\left[\left(\int_a^b g(s)\,dW(s)\right)^2\right] = \int_a^b E[g^2(s)]\,ds \qquad \text{(Itō isometry)}. \tag{7.40}$$
It also applies to covariance
$$E\left[\left(\int_a^b g(s)\,dW(s)\right)\left(\int_a^b h(s)\,dW(s)\right)\right] = \int_a^b E[g(s)h(s)]\,ds. \tag{7.41}$$
Proof. That the Itō integral is a linear operator is trivial and is left as an exercise for the reader.
To make the notation less cumbersome, we introduce the entities
$$g_k = g(t_k), \quad \Delta W_k = W(t_{k+1}) - W(t_k), \quad \Delta t_k = t_{k+1} - t_k, \quad \mathcal{F}_k = \mathcal{F}(t_k). \tag{7.42}$$
We get
$$E\left[\int_a^b g(s)\,dW(s)\right] = \sum_{k=0}^{n-1} E[g_k \Delta W_k]. \tag{7.43}$$
If we use the fact that the process gk is adapted to the filtration ℱ(tk), we get
$$E[g_k \Delta W_k] = E\left[E[g_k \Delta W_k \mid \mathcal{F}(t_k)]\right] = E\left[g_k E[\Delta W_k \mid \mathcal{F}(t_k)]\right], \tag{7.44}$$
where we have used the standard trick (iterated expectations) of introducing a conditioning argument and taken the expectation with respect to that argument. As the Wiener process has independent increments, we get
$$E\left[g_k E[\Delta W_k \mid \mathcal{F}(t_k)]\right] = 0,$$
and we have proved (7.37).
Next we shall prove (7.40). By expanding the square of the defining sum, we get
$$E\left[\left(\int_a^b g(s)\,dW(s)\right)^2\right] = \sum_{i,j} E[g_i g_j \Delta W_i \Delta W_j],$$
where we need to consider two cases:
For i = j, we get
$$E[g_i^2 (\Delta W_i)^2] = E\left[E[g_i^2 (\Delta W_i)^2 \mid \mathcal{F}_i]\right] = E\left[g_i^2 E[(\Delta W_i)^2 \mid \mathcal{F}_i]\right] = E[g_i^2 \Delta t_i] = E[g_i^2]\Delta t_i.$$
For i ≠ j with, say i < j, we get
$$E[g_i g_j \Delta W_i \Delta W_j] = E\left[E[g_i g_j \Delta W_i \Delta W_j \mid \mathcal{F}_j]\right] = E\left[g_i g_j \Delta W_i E[\Delta W_j \mid \mathcal{F}_j]\right] = 0,$$
as the Wiener increment has the conditional mean 0.
Thus we have
$$E\left[\left(\int_a^b g(s)\,dW(s)\right)^2\right] = \sum_{i} E[g_i^2]\Delta t_i = \int_a^b E[g^2(s)]\,ds. \tag{7.45}$$
Equation (7.41) may be shown in a similar fashion. Eq. (7.38) follows immediately from the definition of the stochastic integral, and (7.39) is shown as (7.37).
Remark 7.4 (Itō isometry).
Note that (7.40) establishes an isometry between stochastic integrals and deterministic integrals. This is very useful for the calculation of variances.
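For g(s) = W(s) the isometry gives E[(∫₀ᵗ W dW)²] = ∫₀ᵗ E[W(s)²] ds = ∫₀ᵗ s ds = t²/2, which a Monte Carlo sketch confirms (grid and sample sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)

t, n, n_paths = 1.0, 500, 20_000
dW = rng.normal(0.0, np.sqrt(t / n), size=(n_paths, n))
W_left = np.cumsum(dW, axis=1) - dW   # left endpoints W(t_k), k = 0, ..., n-1

# The Ito integral of g(s) = W(s), approximated by the forward-difference sum (7.35)
integrals = np.sum(W_left * dW, axis=1)

print(integrals.mean())        # close to 0, cf. (7.37)
print((integrals**2).mean())   # close to t^2/2 = 0.5, cf. the isometry (7.40)
```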
Remark 7.5.
The rules in Theorem 7.1 may be extended to cover a larger class of functions than the simple functions considered above by considering Cauchy sequences in ℒ2 of simple functions, but we will not go into the details here.
Remark 7.6.
It is possible to extend stochastic integration to all adapted processes g which satisfy the condition
$$\mathbb{P}\left[\int_0^t g^2(s)\,ds < \infty\right] = 1.$$
For all such g it is not guaranteed that (7.37), (7.40) and (7.39) are valid, but the properties (7.38) and (7.36) still hold. These stochastic integrals are known as local martingales.
It is easy to show that the Wiener process is itself a ℙ-martingale, and it is a very important consequence of Theorem 7.1 that the martingale property is preserved under integration of ℒ²-processes.
Theorem 7.2 (Continuous trajectories).
Assume that g ∈ ℒ2[0,t] for all t ≥ 0. Define the process X by
$$X(t) = \int_0^t g(s)\,dW(s). \tag{7.46}$$
Then X(t) is a martingale with continuous trajectories.
Proof. By direct calculation we get
$$X(t) = \int_0^t g(u)\,dW(u) = \int_0^s g(u)\,dW(u) + \int_s^t g(u)\,dW(u) = X(s) + \int_s^t g(u)\,dW(u).$$
Using (7.39) we get
$$E[X(t) \mid \mathcal{F}(s)] = X(s) + E\left[\int_s^t g(u)\,dW(u) \,\middle|\, \mathcal{F}(s)\right] = X(s).$$
The continuity of the trajectories is difficult to prove, but it should be intuitively clear as the Wiener process lacks jumps.
It is possible to extend the theory on stochastic integration to discontinuous processes, Cont and Tankov [2004] being a good start. The simplest example of a discontinuous process with iid increments is the Poisson process.
Definition 7.6 (Poisson process).
A Poisson process is an integer-valued stochastic process {N (t), t ≥ 0} satisfying the following conditions:
There are obvious similarities (and differences) between the Wiener process (Definition 7.1) and the Poisson process.
Jump processes are easier to analyse if we introduce some well-known transform methods (Fourier transforms, etc.).
Definition 7.7 (Characteristic function).
The Fourier transform of a random variable or process is called the characteristic function
$$\psi_X(u) = E[e^{iuX}]. \tag{7.47}$$
Characteristic functions are incredibly useful in probability: the distribution of a sum of independent random variables is otherwise computed by convolving the densities, whereas Fourier methods offer a simpler alternative. This can be seen by computing the characteristic function of the sum
$$\psi_{X_1 + X_2}(u) = E[e^{iu(X_1 + X_2)}] = E[e^{iuX_1}]\,E[e^{iuX_2}] = \psi_{X_1}(u)\,\psi_{X_2}(u), \tag{7.48}$$
where we use the independence of the random variables to factor the expectation.
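The factorization (7.48) is easy to verify empirically; the sketch below uses exponential variables purely as an illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(5)

n, u = 200_000, 0.7
x1 = rng.exponential(1.0, size=n)   # two independent samples
x2 = rng.exponential(1.0, size=n)

def phi(x):
    """Empirical characteristic function E[exp(iuX)] at the fixed frequency u."""
    return np.exp(1j * u * x).mean()

print(phi(x1 + x2))        # approximately the product below, cf. (7.48)
print(phi(x1) * phi(x2))
```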
Example 7.6 (Gaussian).
The characteristic function for a Gaussian random variable X with mean μ and covariance Σ is given by
$$\psi_X(u) = \exp\left(iu^\top \mu - \tfrac{1}{2}u^\top \Sigma u\right).$$
Example 7.7 (Poisson).
The characteristic function for a Poisson random variable with parameter λ is given by
$$\psi_X(u) = \exp\left(\lambda\left(e^{iu} - 1\right)\right).$$
Example 7.8 (Compound Poisson process).
A compound Poisson process is defined as
$$X(t) = \sum_{n=1}^{N(t)} Y_n,$$
where N(t) is a Poisson process and {Y_n, n ∈ ℕ} are iid random variables independent of N. The convention is that no terms are included in the sum before N(t) reaches one.
The compound Poisson process is a nice model for large, unexpected, rare events such as government interventions, earthquakes, etc.
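Simulating a compound Poisson variable is straightforward: draw the number of jumps from a Poisson distribution and then sum that many jump sizes. A sketch with illustrative parameters (the Gaussian jump distribution is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(6)

lam, t = 3.0, 2.0    # jump intensity and horizon (illustrative values)

def compound_poisson(rng, lam, t, jump_sampler):
    """One draw of X(t): N(t) ~ Poisson(lam * t) jumps, summed."""
    n_jumps = rng.poisson(lam * t)
    return jump_sampler(rng, n_jumps).sum()

samples = np.array([
    compound_poisson(rng, lam, t, lambda r, k: r.normal(0.5, 1.0, size=k))
    for _ in range(50_000)
])
print(samples.mean())   # close to lam * t * E[Y] = 3.0 * 2.0 * 0.5 = 3.0
```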
Theorem 7.3.
The characteristic function for a compound Poisson process is given by
$$\psi_{X(t)}(u) = \exp\left(\lambda t\left(\psi_Y(u) - 1\right)\right),$$
where λ is the jump intensity and ψY (·) is the characteristic function for the jumps Y.
Proof. The characteristic function is computed, using iterated expectations, as
$$E[e^{iuX(t)}] = E\left[E\left[e^{iu\sum_{n=1}^{N(t)} Y_n} \,\middle|\, N(t)\right]\right] = E\left[\psi_Y(u)^{N(t)}\right].$$
Here, we recognize that this is in fact the probability generating function, g(z) = E[z^N(t)] = e^(λt(z−1)), for a Poisson random variable, evaluated at z = ψ_Y(u), which concludes the proof.
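The result can be checked by Monte Carlo: for Gaussian jumps, ψ_Y(u) = exp(iuμ − u²δ²/2), and the simulated E[e^(iuX(t))] should match e^(λt(ψ_Y(u)−1)). A sketch with arbitrarily chosen parameter values:

```python
import numpy as np

rng = np.random.default_rng(7)

lam, t, mu, delta, u = 2.0, 1.0, 0.3, 0.5, 1.2   # illustrative parameters

# Monte Carlo estimate of E[exp(iu X(t))] for X(t) = Y_1 + ... + Y_{N(t)}
counts = rng.poisson(lam * t, size=50_000)
x = np.array([rng.normal(mu, delta, size=k).sum() for k in counts])
mc = np.exp(1j * u * x).mean()

# Closed form from Theorem 7.3 with the Gaussian characteristic function psi_Y
psi_Y = np.exp(1j * u * mu - u**2 * delta**2 / 2)
exact = np.exp(lam * t * (psi_Y - 1))

print(mc)      # approximately equal to the line below
print(exact)
```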
Compound Poisson processes, as well as Wiener processes, are special cases of a more general class of processes, namely Lévy processes.
Definition 7.8 (Lévy process).
A cadlag1 process {X(t), t ≥ 0} is called a Lévy process if it satisfies the following conditions
The process is continuous in probability (stochastic continuity),
Theorem 7.4 (Lévy-Khinchin representation).
Let {X(t)} be a Lévy process with a characteristic triplet (b, Σ, ν). Then
$$E\left[e^{iu^\top X(t)}\right] = e^{t\psi(u)},$$
with the characteristic exponent
$$\psi(u) = iu^\top b - \frac{1}{2}u^\top \Sigma u + \int_{\mathbb{R}^d}\left(e^{iu^\top x} - 1 - iu^\top x\,\mathbf{1}_{\{\|x\| \le 1\}}\right)\nu(dx),$$
where u, b ∈ ℝᵈ, Σ is a non-negative definite d × d matrix and ν is a measure on ℝᵈ with ν({0}) = 0 and ∫ min(‖x‖², 1) ν(dx) < ∞.
The first two parameters in the characteristic triplet (b, Σ, ν) can be identified as the drift and diffusion in a Brownian motion with drift; cf. (7.49). The measure ν is called the Lévy measure and controls the jumps. It is defined, for a Borel set A ∈ ℬ(ℝᵈ), as
$$\nu(A) = E\left[\#\{t \in [0, 1] : \Delta X(t) \ne 0,\ \Delta X(t) \in A\}\right],$$
i.e. the expected number of jumps per unit time with jump size in A.
We will see in Section 9.6 how characteristic functions can be used to value a large class of options, under rather general models.
Definition 7.9 (Merton).
The Merton model (Merton [1976]) is a simple jump process. The log spot price is modelled as a compound Poisson process with Gaussian N(μ, δ²) jumps with intensity λ
The conditional distribution generated by the Merton model is a mixture of Gaussians. Option prices computed using the Merton model will therefore be a mixture of Black-Scholes prices.
It follows from Equation (7.49) and Equation (7.53) that the characteristic function (assuming S(0) = 1) is given by
where the second line presents the characteristic exponent.
We can easily find how to choose the parameter γ such that the discounted process becomes a martingale. Evaluating the characteristic function at u = −i yields
Doing this for the Merton model gives
implying that
transforms the discounted price process into a martingale
Definition 7.10 (Variance Gamma process).
The Variance Gamma (VG) process (Madan and Seneta [1990]) is a time-shifted Wiener process, where the time shift is controlled by a Gamma process Γ(t; 1, ν). The Variance Gamma process is then defined as
This definition is very useful for Monte Carlo simulations.
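As an illustration of such a Monte Carlo scheme, the sketch below samples VG values by first drawing the gamma clock and then a conditionally Gaussian move. The parametrization X(t) = θG(t) + σW(G(t)), with G(t) ∼ Γ(t/ν, ν) (mean t, variance νt), is one common convention and is an assumption here, since other conventions exist:

```python
import numpy as np

rng = np.random.default_rng(8)

# Assumed parametrization: X(t) = theta * G(t) + sigma * W(G(t)),
# G(t) ~ Gamma(shape=t/nu, scale=nu), so E[G(t)] = t and Var[G(t)] = nu * t.
theta, sigma, nu, t = 0.1, 0.3, 0.2, 1.0
n_paths = 100_000

G = rng.gamma(shape=t / nu, scale=nu, size=n_paths)        # the gamma clock
X = theta * G + sigma * np.sqrt(G) * rng.normal(size=n_paths)

print(X.mean())   # close to theta * t = 0.1
print(X.var())    # close to sigma^2 * t + theta^2 * nu * t = 0.092
```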
The characteristic function for a Variance Gamma process (Cont and Tankov [2004], Hirsa [2013]) is given by
Lévy processes that are defined as time-shifted Brownian motions are commonly referred to as Subordinated Brownian motions.
Definition 7.11 (NIG process).
The Normal Inverse Gaussian (NIG) process (Barndorff-Nielsen [1997]) is similar to the VG process, the difference being that the time shift process is an Inverse Gaussian (IG) process rather than a Gamma process. The corresponding characteristic function is given by
Definition 7.12 (Time-shifted Lévy processes).
The processes defined in Definitions 7.9-7.11 all have iid increments, while it is well known that real-world data typically exhibit time-varying volatility. This can be achieved by another time shift, this time using an integrated, positive process. One of the most popular time shifts uses an integrated Cox-Ingersoll-Ross (CIR) model (Cox et al. [1985]) (stochastic differential equations will be introduced in Chapter 8). The Cox-Ingersoll-Ross model is given by the stochastic differential equation
It is well known that this process is positive. Integrating this process
generates a time shift process.
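A sketch of this construction, assuming the standard CIR form dV(t) = κ(θ − V(t))dt + σ√V(t) dW(t) (the parameter names are assumptions here) and a simple Euler scheme with truncation at zero:

```python
import numpy as np

rng = np.random.default_rng(9)

# Assumed CIR dynamics: dV = kappa*(theta - V) dt + sigma*sqrt(V) dW.
kappa, theta, sigma = 2.0, 0.04, 0.2
t, n, n_paths = 1.0, 500, 20_000
dt = t / n

V = np.full(n_paths, theta)   # start in the long-run mean, so E[V(s)] = theta
Y = np.zeros(n_paths)         # the integrated process Y(t) = int_0^t V(s) ds

for _ in range(n):
    Y += V * dt               # left-point Riemann sum builds the time shift
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    V = np.maximum(V + kappa * (theta - V) * dt + sigma * np.sqrt(V) * dW, 0.0)

print(Y.mean())   # close to theta * t = 0.04 when V(0) = theta
```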
A time-shifted Variance Gamma or NIG process would then be defined as
The characteristic function can be derived (see Hirsa [2013]), arriving at
which is rather similar to Equation (7.53). Finally, the characteristic function for the integrated CIR process is given by
with
Time-shifted Lévy processes provide a very good fit to market data (Lindström et al. [2008]).
The characteristic function can also be derived for some stochastic volatility models, most notably the Heston model (Heston [1993]).
Definition 7.13.
The risk-neutral version of the Heston stochastic volatility model is given by
where the driving Wiener processes are allowed to be correlated on an infinitesimal scale dW(S)(t)dW(V)(t) = ρdt.
It can be shown that the characteristic function for the logarithmic stock price, X(t) = log(S(t)), is given by
where
Extending the Heston characteristic function to the Bates model (Bates [1996]) is rather straightforward.
Definition 7.14.
The Bates model is a Heston model, with independent jumps in the S component, formally defined as
where the driving Wiener processes once again are allowed to be correlated, dW(S)(t)dW(V)(t) = ρdt, while J(t) is a compound Poisson process with intensity λ and lognormally distributed jumps of size k such that log(1 + k) ∈ N(μ, δ²). The jumps are independent of the diffusion part, although it is still possible to derive the joint characteristic function when the jump intensity is a linear function of the state variables (Duffie et al. [2003]).
Computing the logarithm of the stock price X(t) = log(S(t)) leads to the dynamics
The discounted price process will therefore be a risk-neutral martingale if the risk-free rate in the Heston model is replaced by
The characteristic function for the Bates model is, due to the independence between the jumps and the Wiener processes, given by a multiplication of the Heston characteristic function, replacing r with , while the jump term given by
leads to the joint expression
Problem 7.1
Problem 7.2
Referring to (7.18), the important Stratonovich integrals are obtained by introducing
i.e. the integrand is evaluated at the midpoint of the interval [tk, tk+1).
Although it may be shown that Stratonovich integrals are neither Markov processes nor martingales, they are important for theoretical work because the ordinary chain rule applies for variable transformations.
Problem 7.3
Let B(t) denote a standard Brownian motion (a Wiener process) on the probability space (Ω, ℱ, ℙ) and let ℱ(t) be the natural filtration generated by B(t).
2. Show that only one of the following is a martingale
¹ Right continuous with left limits.