The Kalman filter

The Kalman filter is a mathematical model that provides an accurate and recursive computation approach to estimate the previous states and predict the future states of a process for which some variables may be unknown. R. E. Kalman introduced it in the early 60s to model dynamics systems and predict trajectory in aerospace [3:10]. Today, the Kalman filter is used to discover a relationship between two observed variables that may or may not be associated with other hidden variables. In this respect, the Kalman filter shares some similarities with the Hidden Markov models (HMM) described in Chapter 6, Regression and Regularization [3:11].

The Kalman filter is used as:

  • A predictor of the next data point from the current observation
  • A filter that weeds out noise by processing the last two observations
  • A smoother that computes trends from a history of observations

Note

Smoothing versus filtering

Smoothing is an operation that removes high-frequency fluctuations from a time series or signal. Filtering consists of selecting a range of frequencies to process the data. In this regard, smoothing is somewhat similar to low-pass filtering. The only difference is that a low-pass filter is usually implemented through linear methods.

Conceptually, the Kalman filter estimates the state of a system from noisy observations. The Kalman filter has two characteristics:

  • Recursive: A new state is predicted and corrected using the input of a previous state
  • Optimal: This is an optimal estimator because it minimizes the mean square error of the estimated parameters (against actual values)

The Kalman filter is one of the stochastic models that are used in adaptive control [3:12].

Note

Kalman and nonlinear systems

The Kalman filter estimates the internal state of a linear dynamic system. However, it can be extended to a model nonlinear-state space using linear or quadratic approximation functions. These filters are known as, you guessed it, extended Kalman filters (EKF), the theory of which is beyond the scope of this book.

The following section is dedicated to discrete Kalman filters for linear systems, as applied to financial engineering. A continuous signal can be converted to a time series using the Nyquist frequency.

The state space estimation

The Kalman filter model consists of two core elements of a dynamic system—a process that generates data and a measurement that collects data. These elements are referred to as the state space model. Mathematically speaking, the state space model consists of two equations:

  • Transition equation: This describes the dynamics of the system including the unobserved variables
  • Measurement equation: This describes the relationship between the observed and unobserved variables

The transition equation

Let's consider a system with a linear state xt of n variables and a control input vector ut. The prediction of the state at time t is computed by a linear stochastic equation:

The transition equation
  • A is the square matrix of dimension n that represents the transition from state x at t-1 to state x at t. The matrix is intrinsic to the dynamic system under consideration.
  • B is an n by n matrix that describes the control input model (external action on the system or model). It is applied to the control vector u.
  • w represents the noise generated by the system or from a probabilistic point of view, the uncertainty on the model. It is known as the process white noise.

The control input vector represents the external input (or control) to the state of the system. Most systems, including our financial example later in this chapter, have no external input to the state of the model.

Note

White and Gaussian noise

A white noise is a Gaussian noise, following a normal distribution with zero mean.

The measurement equation

The measurement of m values zt of the state of the system is defined by the following equation:

The measurement equation
  • H is a matrix m by n that models the dependency of the measurement to the state of the system.
  • v is the white noise introduced by the measuring devices. Similar to the process noise, v follows a Gaussian distribution with zero mean and a variance R, known as the measurement noise covariance.

The recursive algorithm

The set of equations for the discrete Kalman filter are implemented as recursive computation with two distinct steps:

  • The algorithm uses the transition equations to estimate the next observation
  • The estimation is created with the actual measurement for this observation

The recursion is visualized in the following diagram:

The recursive algorithm

An overview diagram of the recursive Kalman algorithm

Let's illustrate the prediction and correction phases in the context of filtering financial data, in a manner similar to the moving average and Fourier transform. The objective is to extract the trend and the transitory component of the yield of the 10-year Treasury bond. The Kalman filter is particularly suitable for the analysis of interest rates for two reasons:

  • Yields are the results of multiple factors, some of which are not directly observable
  • Yields are influenced by the policy of the Federal Reserve that can be easily modeled by the control matrix

The 10-year Treasury bond has a higher trading volume than bonds with longer maturity, making trends in interest rates a bit more reliable [3:13].

Applying the Kalman filter to clean raw data requires you to define a model that encompasses both observed and non-observed states. In the case of the trend analysis, we can safely create our model with a two-variable state: the current yield xt and the previous yield xt-1.

Note

State in dynamic systems

The term "state" refers to the state of the dynamic system under consideration. This is a different term for observation, data, or value vector. A state or observation is a set of values, one for each variable of the model.

This implementation of the Kalman filter uses the Apache Commons Math library, which defines and manipulates specific types. The first step is to define the implicit type conversion required to interface with the KalmanFilter class:

type DblMatrix = Array[Array[Double]]
type DblVector = Array[Double]
implicit def double2RealMatrix(x: DblMatrix): RealMatrix = new Array2DRowRealMatrix(x)
implicit def double2RealRow(x: DblVector): RealMatrix = new Array2DRowRealMatrix(x)
implicit def double2RealVector(x: DblVector): RealVector = new ArrayRealVector(x)

The implicit type conversion has to be defined in the scope of the client code.

The Kalman model assumes that process and measurement noise follow a Gaussian distribution, also known as white noise. For the sake of maintainability, the generation or simulation of the white noise is encapsulated in the QRNoise class with qr as the tuple of scale factors for the process noise matrix Q and the measurement noise R. The two create methods execute the user-defined noise function white:

class QRNoise(qr: XY, white: Double=> Double) {
  def q = white(qr._1) 
  def r = white(qr._2) 
  def noisyQ = Array[Double](q,q)
  def noisyR = Array[Double](r,r)
}

The easiest approach to manage the matrices and vectors used in the recursion is to define them as parameters of the main class, DKalman:

class DKalman(A:DblMatrix, B:DblMatrix, H:DblMatrix, P:DblMatrix)(implicit val qrNoise: QRNoise) extends PipeOperator[XY,XY] {

  val Q =  new DblMatrix(A.size).map(_ => Array.fill(A.size)(qrNoise.qr.1))  

  var x: RealVector = _
  var filter: KalmanFilter =_
}

The matrix used in the prediction and correction phase is defined as an argument of the DKalman class. The matrices for the covariance of the process noise Q and the measurement noise R are also initialized during the instantiation of the Kalman filter class. The key elements of the filter are now in place and it's time to implement the prediction-correction cycle portion of the Kalman algorithm.

Prediction

The prediction phase consists of estimating the x state (yield of the bond) using the transition equation. We assume that the Federal Reserve has no material effect on the interest rates, making control input matrix B null. The transition equation can be easily resolved using simple operations on matrices.

Prediction

Visualization of the transition equation of the Kalman filter

The purpose of this exercise is to evaluate the impact of the different parameters of the transition matrix A in terms of smoothing.

Note

The control input matrix B

In this example, the control matrix B is null because there is no known deterministic external action on the yield of the 10-year Treasury bond. However, the yield can be affected by unknown parameters that we represent as hidden variables. The matrix B would be used to model the decision of the Federal Reserve regarding asset purchases and federal fund rates.

The mathematics behind the Kalman filter presented as reference to its implementation in Scala use the same notation for matrices and vectors. It is absolutely not a prerequisite to understand the Kalman filter and its implementation in the next section. If you have a natural inclination toward linear algebra, the following note describes the two equations for the prediction step.

Note

The prediction step

The prediction of the state at time t+1 is computed by extrapolating the state estimate:

Prediction
  • A is the square matrix of dimension n that represents the transition from state x at t-1 to state x at time t.
  • Prediction is the predicted state of the system based on the current state and the model A
  • B is the vector of n dimension that describes the input to the state

The mean square error matrix P, which is to be minimized, is updated through the following formula:

Prediction
  • AT is the transpose of the state transition matrix.
  • Q is the process white noise described as a Gaussian distribution with a zero mean and a variance Q, known as the noise covariance.

The state transition matrix is implemented using the matrix and vector classes included in the Apache Commons Math library. The types of matrices and vectors are automatically converted into RealMatrix and RealVector classes. The implementation of the equation is as follows:

x = A.operate(x).add(qrNoise.noisyQ)

The new state is predicted (or estimated), and then used as an input to the correction step.

Correction

The second and last step of the recursive Kalman algorithm is the correction of the estimated yield of the 10-year Treasury bond with the actual yield. In this example, the white noise of the measurement is negligible. The measurement equation is simple because the state is represented by the current and previous yield, and their measurement z:

Correction

Visualization of the measurement equation of the Kalman filter

The sequence of mathematical equations of the correction phase consists of updating the estimation of the state x using the actual values z, computing the Kalman gain K, and estimating the matrix of the error covariance P.

Note

Correction step

The state of the system x is estimated from the actual measurement z through the following formula:

Correction
  • r is the residual between the predicted measurement and the actual measured values
  • K is the Kalman gain for the correction factor Kr

The Kalman gain is computed using the estimated error covariance matrix Correction:

Correction
  • HT is the matrix transpose of H

Finally, the estimate of the error covariance matrix Correction is corrected to the value Pt through the following formula:

Correction
  • Id is the identity matrix.

Kalman smoothing

It is time to put our knowledge of the transition and measurement equations to the test. The Apache Commons Library defines two classes, DefaultProcessModel and DefaultMeasurementModel, to encapsulate the components of the matrices and vectors. The historical values for the yield of the 10-year Treasury bond is loaded through the DataSource method and mapped to the smoothed series that is the output of the filter.

def |> : PartialFunction[XTSeries[XY], XTSeries[XY]] = {
   case xt: XTSeries[XY] if(xt.size> 0) => xt.map( y => {
      initialize(Array[Double](y._1, y._2)) //1
      val nState = newState  //2
      (nState(0), nState(1)) }) //3
   …

The data transformation for the Kalman filter initializes the process and measurement model for each data point (line 1), updates the state using the transition and correction equations iteratively (line 2), and returns the filtered series (line 3).

Tip

Exception handling

The code to catch and process exceptions thrown by the Apache Commons Math library is omitted as the standard practice in the book. As far as the execution of the Kalman filter is concerned, the following exceptions have to be handled:

  • NonSquareMatrixException
  • DimensionMismatchException
  • MatrixDimensionMismatchException

The model is a 2-step lag smoothing algorithm using a single smoothing factor α with a state, St:

St = {xt+1, xt} with xt+1 = α.xt + (1- α).xt-1 and xt = xt

Following the Scala standard to return errors to the client code, the exceptions thrown by the Commons Math API are caught and processed through the Option monad. The iterative prediction and correction of the smoothed yields is implemented by the newState method. The method iterates through four steps:

  1. Filter an estimate of the state x at time t.
  2. The new state is computed using the transition equation.
  3. The measured value z of the state is computed using the measurement equation.
  4. The original estimate x is corrected with the measured value.

The newState method is defined as follows:

val PROCESS_NOISE_Q = 0.03
val PROCESS_NOISE_R = 0.1
val MEASUREMENT_NOISE = 0.4

def newState: DblVector = {
  Range(0, maxIters) foreach( _ => { 
    filter.predict  //1
    val w = qrNoise.create(PROCESS_NOISE_Q, PROCESS_NOISE_R)
    x = A.operate(x).add(qrNoise.noisyQ) //2
    val v = qrNoise.create(MEASUREMENT_NOISE)
    val z = H.operate(x).add(qrNoise.noisyR) //3
    filter.correct(z) // 4
  })
  filter.getStateEstimation
}

The PROCESS_NOISE factor (with respect to MEASUREMENT_NOISE) used in the creation of the process noise w and measurement noise v are somewhat arbitrary. Their purpose is to simulate the white noise for the model. The newState method returns the filtered state as a DblVector instance for this particular state.

Tip

The exit condition

In the code snippet for the newState method, the iteration for specific data points exits when the maximum number of iterations is reached. A more elaborate implementation consists of either evaluating the matrix P at each iteration or estimation converged within a predefined range.

Experimentation

The objective is to smoothen the yield of the 10-year Treasury bond and quantify the impact of the elements of the state-transition matrix A on the smoothing process. The state equation updates the values of the state [xt, xt-1] using the previous state [xt-1, xt-2], where x represents the yield at time t. This is accomplished by shifting the values of the original time series {x0, ... xn} by 1 using the drop method, X1={x1, … xn}, creating a copy of the original time series without the last element X2={x0, … xn-1} and zipping X1 and X2. The resulting sequence of pair {(xk, xk-1)} is processed by the Kalman algorithm, as shown in the following code:

implicit val qrNoise = QRNoise((0.2, 0.4), (m: Double) => m* (new Random(System.currentTimeMillis)).nextGaussian) //1
val A: DblMatrix = ((0.9, 0.0), (0.0, 0.1))
val B: DblMatrix = (0.0, 0.0) 
val H: DblMatrix  = (1.0, 1.0)
val P0: DblMatrix = ((0.4, 0.5), (0.4, 0.5))
val x0: DblVector = (175.0, 175.0)

val dKalman = new DKalman(A, B, H, P0) //2
val output = "output/chap3/kalman.csv"
val zt_1 = zSeries.drop(1)
val zt = zSeries.take(zSeries.size-1)
val filtered = dKalman |> XTSeries[(Double, Double)](zt_1.zip(zt)) //3
DataSink[Double](output) |> filtered.map(_._1) //4

The process and measurement noise qrNoise is implicitly initialized with the respective factors, 0.2 and 0.4 (line 1). The Kalman filter is initialized with the prediction-correction equation matrices A, B, H, and P0, and the initial state x0 (line 2). A time series {(xi, xi-p)}i is generated by zipping two copies of the historical 10 Treasury bond yield series, with the second one being shifted by p data. The Kalman filter is applied to the time series of tuples and the result is dumped into an output file using a DataSink instance (line 4)

The test is performed over a period of one year, and the results are plotted using a basis point or 100th of a percentage. The quality of the output is evaluated using two different values for the state transition matrix A: [0, 8, 0.2, 1.0, 0.0] and [0,5, 0.5, 1.0, 0.0].

Tip

Modeling state transition and noise

The state transition and the noise related to the process have to be selected carefully. The resolution of the state equations relies on the QR decomposition, which requires a non-negative definite matrix. The implementation in the Apache common library throws a NonPositiveDefiniteMatrixException if the principle is violated.

The smoothed yield is plotted along the raw data as follows:

Experimentation

The output of the Kalman filter for the 10-year Treasury bond historical prices

Clearly, the yield time series has been smoothed. However, the amplitude of the underlying trend is significantly higher than any of the noise or the spikes. Consequently, the Kalman filter has a limited impact. Let's analyze the data for a shorter period during which the noise is the strongest, between the 190th and the 275th trading days.

Experimentation

The output of the Kalman filter for the 10-year Treasury bond prices 0.8-0.2

The high frequency noise has been significantly reduced without cancelling the actual spikes. The distribution 0.8-0.2 takes into consideration the previous state and favors the predicted value. Contrarily, a run with a state transition matrix A [0.2, 0.8, 0.0, 1.0] that favors the latest measurement will preserve the noise, as seen in the following graph:

Experimentation

The output of the Kalman filter for the 10-year Treasury bond price 0.2-0.8

The Kalman filter is a very useful and powerful tool in understanding the distribution of the noise between the process and observation. Contrary to the low or pass-band filters based on the fast Fourier transform, the Kalman filter does not require computation of the frequencies spectrum or assume the range of frequencies of the noise.

However, the linear Kalman filter has its limitations:

  • The noise generated by both the process and the measurement has to be Gaussian. Processes with non-Gaussian noise can be modeled with techniques such as a Gaussian Sum filter or adaptive Gaussian mixture [3:14].
  • It requires that the underlying process is linear. Researchers have been able to formulate extensions to the Kalman filter, known as the extended Kalman filter (EKF) to filter signals from non-linear dynamic systems, at the cost of significant computational complexity.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset