Chapter 3
Univariate Data Analysis

Univariate data analysis studies univariate financial time series while ignoring the time series properties of the data. Univariate data analysis also covers cross-sectional data. For example, the returns of a collection of stocks at a fixed time point form a cross-sectional univariate data set.

A univariate series of observations can be described using statistics such as the sample mean, median, variance, quantiles, and expected shortfall. These are covered in Section 3.1.

Graphical methods are explained in Section 3.2. Univariate graphical tools include tail plots, regression plots of the tails, histograms, and kernel density estimators. We often use tail plots to visualize the tails of a distribution and kernel density estimates to visualize its central part. The kernel density estimator is not only a visualization tool but also an estimation tool.

Section 3.3 defines univariate parametric models such as the normal, log-normal, and Student distributions. These parametric models are alternatives to the use of the kernel density estimator.

For a univariate financial time series it is of interest to study the tail properties of the distribution. This is done in Section 3.4. Typically the distribution of a financial time series has heavier tails than the normal distribution. The estimation of the tails is done using the concept of the excess distribution. The excess distribution is modeled with exponential, Pareto, gamma, generalized Pareto, and Weibull distributions. The fitting of distributions can be done with a version of maximum likelihood. These results prepare us for quantile estimation, which is considered in Chapter 8.

Central limit theorems provide tools to construct confidence intervals and confidence regions. The limit theorems for maxima provide insight into the estimation of the tails of a distribution. Limit theorems are covered in Section 3.5.

Section 3.6 summarizes the univariate stylized facts.

3.1 Univariate Statistics

We define mean, median, and mode to characterize the center of a distribution. The spread of a distribution can be measured by variance, other centered moments, lower and upper partial moments, lower and upper conditional moments, quantiles (value-at-risk), expected shortfall, shortfall, and absolute shortfall.

We define both population and sample versions of the statistics. In addition, we define both unconditional and conditional versions of the statistics.

3.1.1 The Center of a Distribution

The center of a distribution can be defined using the mean, the median, or the mode. The center of a distribution is an unknown quantity that has to be estimated using the sample mean, the sample median, or the sample mode. The conditional versions of these quantities take into account the available information. For example, if we know that it is winter, then the expected temperature is lower than the expected temperature when we know that it is summer.

3.1.1.1 The Mean and the Conditional Mean

The population mean is called the expectation. The population mean can be estimated by the arithmetic mean. The conditional mean is estimated using regression analysis.

The Population Mean

The population mean (expectation) of a random variable $X$, whose distribution is continuous, is defined as

$$E X = \int_{-\infty}^{\infty} x f(x)\,dx, \qquad (3.1)$$

where $f$ is the density function of $X$.1 Let $Y$ be an explanatory random variable (random vector). The conditional expectation of $X$ given $Y = y$ can be defined by

$$E(X \mid Y = y) = \int_{-\infty}^{\infty} x f(x \mid y)\,dx,$$

where $f(\cdot \mid y)$ is the conditional density.2

The population mean of a random variable $X$, whose distribution is discrete with the possible values $x_1, x_2, \ldots$, is defined as

$$E X = \sum_{i} x_i P(X = x_i). \qquad (3.3)$$

The conditional expectation can be defined as

$$E(X \mid Y = y) = \sum_{i} x_i P(X = x_i \mid Y = y).$$
The Sample Mean

Given a sample $X_1, \ldots, X_n$ from the distribution of $X$, the mean $E X$ can be estimated with the sample mean (the arithmetic mean):

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i. \qquad (3.4)$$

Regression analysis studies the estimation of the conditional expectation. In regression analysis, we observe values $Y_1, \ldots, Y_n$ of the explanatory random variable (random vector), in addition to observing values $X_1, \ldots, X_n$ of the response variable. Besides linear regression, there exist various nonparametric methods for the estimation of the conditional expectation. For example, in kernel regression the arithmetic mean in (3.4) is replaced by a weighted mean

$$\hat{f}(y) = \sum_{i=1}^{n} p_i(y) X_i,$$

where $p_i(y)$ is a weight that is large when $Y_i$ is close to $y$ and small when $Y_i$ is far away from $y$. Now $\hat{f}(y)$ is an estimate of the conditional mean $E(X \mid Y = y)$. Kernel regression and other regression methods are described in Section 6.1.2.
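As an illustration (not from the book), the weighted mean above can be sketched with a Gaussian kernel weight; the function name and the bandwidth value are our own choices:

```python
import math

def kernel_regression(ys, xs, y, h):
    """Nadaraya-Watson style estimate of E(X | Y = y).

    ys: observed values of the explanatory variable
    xs: observed values of the response variable
    h:  smoothing parameter (bandwidth) -- an assumed tuning choice
    """
    # Gaussian kernel weights: large when Y_i is close to y.
    weights = [math.exp(-0.5 * ((yi - y) / h) ** 2) for yi in ys]
    total = sum(weights)
    # Normalized weights p_i(y) sum to one.
    return sum(w / total * x for w, x in zip(weights, xs))

ys = [1.0, 2.0, 3.0, 4.0, 5.0]
xs = [1.1, 1.9, 3.2, 3.9, 5.1]
est = kernel_regression(ys, xs, 3.0, h=0.5)
```

With a small bandwidth the estimate is dominated by the responses whose explanatory values lie near the evaluation point.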

The Annualized Mean

The return of a portfolio is typically estimated using the arithmetic mean, and it is expressed as the annualized mean return. Let $S_0, S_1, \ldots, S_T$ be observed stock prices, sampled at equidistant time points. Let $R_t = (S_t - S_{t-1})/S_{t-1}$, $t = 1, \ldots, T$, be the net returns. Let the sampling interval be $\Delta t$, expressed as a fraction of a year. The annualized mean return is

$$\frac{1}{\Delta t} \cdot \frac{1}{T} \sum_{t=1}^{T} R_t. \qquad (3.5)$$

For the monthly returns $1/\Delta t = 12$. For the daily returns $1/\Delta t = 250$, because there are about 250 trading days in a year. Sampling of prices and several definitions of returns are discussed in Section 2.1.2.
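A minimal sketch of the annualized mean return, under the convention stated above that the mean of the net returns is scaled by the number of sampling periods per year (the function name is ours):

```python
def annualized_mean_return(net_returns, periods_per_year):
    """Annualized mean return: the arithmetic mean of the net returns
    scaled by the number of sampling periods per year (1 / Delta t)."""
    return periods_per_year * sum(net_returns) / len(net_returns)

monthly = [0.01, -0.02, 0.015, 0.005]
ann = annualized_mean_return(monthly, 12)  # 12 periods per year for monthly data
```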

The Geometric Mean

Let $S_0, S_1, \ldots, S_T$ be the observed stock prices and let $R_t = S_t / S_{t-1}$, $t = 1, \ldots, T$, be the gross returns. The geometric mean is defined as

$$G = \left( \prod_{t=1}^{T} R_t \right)^{1/T}.$$

The logarithm of the geometric mean is equal to the arithmetic mean of the logarithmic returns:

$$\log G = \frac{1}{T} \sum_{t=1}^{T} \log R_t.$$

Note that $\prod_{t=1}^{T} R_t = S_T / S_0$ is the cumulative wealth at time $T$ when we start with wealth 1. Thus,

$$G = \left( \frac{S_T}{S_0} \right)^{1/T}.$$
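The identities above can be checked numerically; this sketch computes the geometric mean through the logarithmic returns and compares it with the cumulative-wealth form:

```python
import math

def geometric_mean(gross_returns):
    """Geometric mean of gross returns R_t = S_t / S_{t-1},
    computed via the arithmetic mean of the log returns."""
    log_mean = sum(math.log(r) for r in gross_returns) / len(gross_returns)
    return math.exp(log_mean)

prices = [100.0, 110.0, 99.0, 105.0]
gross = [b / a for a, b in zip(prices, prices[1:])]
g = geometric_mean(gross)
# The product of the gross returns is the cumulative wealth S_T / S_0,
# so the geometric mean also equals (S_T / S_0) ** (1 / T).
check = (prices[-1] / prices[0]) ** (1 / len(gross))
```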

3.1.1.2 The Median and the Conditional Median

The median can be defined in the case of a continuous distribution function of a random variable $X$ as the number $\mathrm{med}(X)$ satisfying

$$P(X \le \mathrm{med}(X)) = \frac{1}{2}.$$

Thus, the median is the point that divides the probability mass into two equal parts. Let us define the distribution function $F$ by

$$F(x) = P(X \le x), \qquad x \in \mathbf{R}.$$

When $F$ is continuous, then

$$F(\mathrm{med}(X)) = \frac{1}{2}.$$

In general, covering also the case of discrete distributions, we can define the median uniquely as the generalized inverse of the distribution function:

$$\mathrm{med}(X) = \inf\{x \in \mathbf{R} : F(x) \ge 1/2\}. \qquad (3.6)$$

The conditional median is defined using the conditional distribution function

$$F(x \mid y) = P(X \le x \mid Y = y),$$

where $Y$ is a random vector taking values in $\mathbf{R}^d$. Now we can define

$$\mathrm{med}(X \mid Y = y) = \inf\{x \in \mathbf{R} : F(x \mid y) \ge 1/2\}, \qquad (3.7)$$

where $y \in \mathbf{R}^d$.

The sample median of observations $X_1, \ldots, X_n$ can be defined as the observation that has as many smaller observations as larger observations:

$$\mathrm{med}(X_1, \ldots, X_n) = X_{(\lfloor (n+1)/2 \rfloor)}, \qquad (3.8)$$

where $X_{(1)} \le \cdots \le X_{(n)}$ is the ordered sample and $\lfloor x \rfloor$ is the largest integer smaller than or equal to $x$. The sample median is a special case of an empirical quantile. Empirical quantiles are defined in (8.21)–(8.23).
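The order-statistic definition of the sample median can be sketched as follows; note that for even sample sizes this picks a middle observation rather than interpolating between the two middle ones:

```python
def sample_median(xs):
    """Sample median via the ordered sample: the order statistic
    with index floor((n + 1) / 2), written here with a 0-based index."""
    s = sorted(xs)
    return s[(len(s) + 1) // 2 - 1]

m = sample_median([3.0, 1.0, 2.0, 5.0, 4.0])
```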

3.1.1.3 The Mode and the Conditional Mode

The mode is defined as an argument maximizing the density function of the distribution of a random variable:

$$\mathrm{mode}(X) = \operatorname*{argmax}_{x \in \mathbf{R}} f(x), \qquad (3.9)$$

where $f$ is the density function of the distribution of $X$. The density $f$ can have several local maxima, and the use of the mode seems to be interesting only in cases where the density function is unimodal (has one local maximum). The conditional mode is defined as an argument maximizing the conditional density:

$$\mathrm{mode}(X \mid Y = y) = \operatorname*{argmax}_{x \in \mathbf{R}} f(x \mid y).$$

A mode can be estimated by finding a maximizer of a density estimate:

$$\widehat{\mathrm{mode}}(X) = \operatorname*{argmax}_{x \in \mathbf{R}} \hat{f}(x),$$

where $\hat{f}$ is an estimator of the density function $f$. Histograms and kernel density estimators are defined in Section 3.2.2.

3.1.2 The Variance and Moments

Variance and higher order moments characterize the dispersion of a univariate distribution. To take into account only the left or the right tail we define upper and lower partial moments and upper and lower conditional moments.

3.1.2.1 The Variance and the Conditional Variance

The variance of random variable $X$ is defined by

$$\mathrm{Var}(X) = E(X - E X)^2. \qquad (3.10)$$

The standard deviation of $X$ is the square root of the variance of $X$. The conditional variance of random variable $X$ is equal to

$$\mathrm{Var}(X \mid Y = y) = E\left[ \left( X - E(X \mid Y = y) \right)^2 \mid Y = y \right] \qquad (3.11)$$
$$= E(X^2 \mid Y = y) - \left[ E(X \mid Y = y) \right]^2. \qquad (3.12)$$

The conditional standard deviation of $X$ is the square root of the conditional variance.

The Sample Variance

The sample variance is defined by

$$\hat{\sigma}^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2, \qquad (3.13)$$

where $X_1, \ldots, X_n$ is a sample of random variables having identical distribution with $X$, and $\bar{X}$ is the sample mean.3

The Annualized Variance

The sample variance and the standard deviation of portfolio returns are typically annualized, analogously to the annualized sample mean in (3.5). Let $S_0, S_1, \ldots, S_T$ be the observed stock prices, sampled at equidistant time points. Let $R_t = (S_t - S_{t-1})/S_{t-1}$, $t = 1, \ldots, T$, be the net returns. Let the sampling interval be $\Delta t$. The annualized sample variance of the returns is

$$\hat{\sigma}_a^2 = \frac{1}{\Delta t} \cdot \frac{1}{T-1} \sum_{t=1}^{T} (R_t - \bar{R})^2,$$

where $\bar{R}$ is the sample mean of the returns. For the monthly returns $1/\Delta t = 12$. For the daily returns $1/\Delta t = 250$, because there are about 250 trading days in a year. Sampling of prices and several definitions of returns are discussed in Section 2.1.2.

3.1.2.2 The Upper and Lower Partial Moments

The definition of the variance of random variable $X$ can be generalized to other centered moments

$$E(X - E X)^k$$

for $k = 1, 2, \ldots$. The variance is obtained when $k = 2$. The centered moments take a contribution both from the left and the right tail of the distribution. The lower partial moments take a contribution only from the left tail and the upper partial moments take a contribution only from the right tail. For example, if we are interested only in the distribution of the losses, then we use the lower partial moments of the return distribution, and if we are interested only in the distribution of the gains, then we use the upper partial moments. The upper partial moment is defined as

$$\mathrm{upm}_k(c) = E\left[ (X - c)^k I_{[c,\infty)}(X) \right], \qquad (3.14)$$

where $k = 0, 1, 2, \ldots$, $c \in \mathbf{R}$ is a target rate, and $I_A$ is the indicator function of set $A$. The lower partial moment is defined as

$$\mathrm{lpm}_k(c) = E\left[ (c - X)^k I_{(-\infty,c]}(X) \right]. \qquad (3.15)$$

When $X$ has density $f$, we can write

$$\mathrm{upm}_k(c) = \int_c^{\infty} (x - c)^k f(x)\,dx, \qquad \mathrm{lpm}_k(c) = \int_{-\infty}^{c} (c - x)^k f(x)\,dx.$$

For example, when $k = 0$, then

$$\mathrm{upm}_0(c) = P(X \ge c), \qquad \mathrm{lpm}_0(c) = P(X \le c),$$

so that the upper partial moment is equal to the probability that $X$ is greater or equal to $c$, and the lower partial moment is equal to the probability that $X$ is smaller or equal to $c$. For $k = 2$ and $c = E X$ the partial moments are called the upper and lower semivariance of $X$. For example, the lower semivariance is defined as

$$\mathrm{lpm}_2(E X) = E\left[ (E X - X)^2 I_{(-\infty, E X]}(X) \right]. \qquad (3.16)$$

The square root of the lower semivariance can be used to replace the standard deviation in the definition of the Sharpe ratio, or in the Markowitz criterion.
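A sample version of the lower semivariance in (3.16) can be sketched as follows; only the observations at or below the sample mean contribute:

```python
def lower_semivariance(xs):
    """Sample lower semivariance: the average squared shortfall
    below the sample mean (only the left tail contributes)."""
    mean = sum(xs) / len(xs)
    return sum((mean - x) ** 2 for x in xs if x <= mean) / len(xs)

xs = [0.02, -0.01, 0.03, -0.04, 0.01]
sv = lower_semivariance(xs)
```

Its square root is the downside deviation that can replace the standard deviation in a Sharpe-type ratio, as noted above.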

The sample centered moments are

$$\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^k,$$

where $\bar{X}$ is the sample mean. The sample upper and the sample lower partial moments are

$$\widehat{\mathrm{upm}}_k(c) = \frac{1}{n} \sum_{i=1}^{n} (X_i - c)^k I_{[c,\infty)}(X_i), \qquad (3.17)$$
$$\widehat{\mathrm{lpm}}_k(c) = \frac{1}{n} \sum_{i=1}^{n} (c - X_i)^k I_{(-\infty,c]}(X_i). \qquad (3.18)$$

For example, when $k = 0$ we have

$$\widehat{\mathrm{upm}}_0(c) = \frac{\#\{i : X_i \ge c\}}{n}, \qquad \widehat{\mathrm{lpm}}_0(c) = \frac{\#\{i : X_i \le c\}}{n},$$

where $\#A$ denotes the cardinality of set $A$.

3.1.2.3 The Upper and Lower Conditional Moments

The upper conditional moments are the moments conditioned on the right tail of the distribution and the lower conditional moments are the moments conditioned on the left tail of the distribution. The upper conditional moment is defined as

$$E\left[ (X - c)^k \mid X \ge c \right],$$

and the lower conditional moment is defined as

$$E\left[ (c - X)^k \mid X \le c \right], \qquad (3.19)$$

where $k = 1, 2, \ldots$ and $c \in \mathbf{R}$ is a target rate.

The sample lower conditional moment is

$$\frac{1}{\#\{i : X_i \le c\}} \sum_{i=1}^{n} (c - X_i)^k I_{(-\infty,c]}(X_i), \qquad (3.20)$$

where the sums are over the same observations as in (3.18). Note that in (3.17) and (3.18) the sample size $n$ is the denominator, but in (3.20) we divide by the number of observations in the left tail.

We can also condition on an external variable $Y$ and define conditional versions of both the upper and lower partial moments and the upper and lower conditional moments.

3.1.3 The Quantiles and the Expected Shortfalls

The quantiles are applied under the name value-at-risk in risk management to characterize the probability of a tail event. The expected shortfall is a related measure of tail risk.

3.1.3.1 The Quantiles and the Conditional Quantiles

The $p$th quantile is defined as

$$Q_p(X) = \inf\{x \in \mathbf{R} : F(x) \ge p\}, \qquad (3.21)$$

where $0 < p < 1$ and $F$ is the distribution function of $X$. The value-at-risk is defined in (8.3) as a quantile of a loss distribution. For $p = 1/2$, $Q_p(X)$ is equal to $\mathrm{med}(X)$, defined in (3.6). In the case of a continuous distribution function, we have

$$F(Q_p(X)) = p,$$

and thus it holds that

$$Q_p(X) = F^{-1}(p),$$

where $F^{-1}$ is the inverse of $F$. The $p$th conditional quantile is defined replacing the distribution function of $X$ with the conditional distribution function of $X$ given $Y = y$:

$$Q_p(X \mid Y = y) = \inf\{x \in \mathbf{R} : F(x \mid y) \ge p\}, \qquad (3.22)$$

where $0 < p < 1$ and $F(\cdot \mid y)$ is the conditional distribution function of $X$ given $Y = y$.

The empirical quantile is defined as

$$\hat{Q}_p = X_{(\lceil np \rceil)}, \qquad (3.23)$$

where $X_{(1)} \le \cdots \le X_{(n)}$ is the ordered sample and $\lceil x \rceil$ is the smallest integer $\ge x$. We give equivalent definitions of the empirical quantile in Section 8.4.1. Chapter 8 discusses various estimators of quantiles and conditional quantiles.
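The order-statistic form of the empirical quantile in (3.23) can be sketched directly:

```python
import math

def empirical_quantile(xs, p):
    """Empirical quantile (3.23): the order statistic X_(ceil(n p))."""
    s = sorted(xs)
    k = math.ceil(len(s) * p)  # smallest integer >= n p
    return s[k - 1]            # 1-based order statistic index

xs = [5.0, 1.0, 4.0, 2.0, 3.0]
q = empirical_quantile(xs, 0.5)
```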

3.1.3.2 The Expected Shortfalls

The expected shortfall is a measure of risk that aggregates all quantiles in the right tail (or in the left tail). When $X$ has a continuous distribution function, then the expected shortfall for the right tail is

$$ES_p = E\left[ X \mid X \ge Q_p(X) \right], \qquad (3.24)$$

where $0 < p < 1$. Thus, the $p$th expected shortfall is the conditional expectation under the condition that the random variable is larger than the $p$th quantile. The term "tail conditional value-at-risk" is sometimes used to denote the expected shortfall. In the general case, when the distribution of $X$ is not necessarily continuous, the expected shortfall for the right tail is defined as

$$ES_p = \frac{1}{1-p} \int_p^1 Q_u(X)\,du. \qquad (3.25)$$

The equality of (3.24) and (3.25) for the continuous distributions is proved in McNeil et al. (2005, lemma 2.16). In fact, denoting $U = F(X)$,

$$\frac{1}{1-p} \int_p^1 Q_u(X)\,du = E\left[ Q_U(X) \mid U \ge p \right] = E\left[ X \mid X \ge Q_p(X) \right],$$

where $Q_u(X) = F^{-1}(u)$ and we use the fact that $U = F(X)$ is uniformly distributed on $[0,1]$.4 Finally, note that $P(X \ge Q_p(X)) = 1 - p$ for continuous distributions.

The expected shortfall for the left tail is

$$ES_p^- = \frac{1}{p} \int_0^p Q_u(X)\,du.$$

When $X$ has a continuous distribution function, then the expected shortfall for the left tail is

$$ES_p^- = E\left[ X \mid X \le Q_p(X) \right]. \qquad (3.26)$$

This expression shows that in the case of a continuous distribution function, $ES_p^-$ is equal to the expectation that is taken only over the left tail, when the left tail is defined as the region that is on the left side of the $p$th quantile of the distribution. Note that the expected shortfall for the left tail is related to the lower conditional moment of order $k = 1$ and target rate $c = Q_p(X)$:

$$ES_p^- = Q_p(X) - E\left[ Q_p(X) - X \mid X \le Q_p(X) \right],$$

where the lower conditional moment is defined in (3.19).5

The expected shortfall for the right tail, as defined in (3.24), can be estimated from the data $X_1, \ldots, X_n$ by

$$\widehat{ES}_p = \frac{1}{\#\{i : X_i \ge \hat{Q}_p\}} \sum_{i=1}^{n} X_i I_{[\hat{Q}_p, \infty)}(X_i), \qquad (3.27)$$

where $\hat{Q}_p$ is the empirical quantile in (3.23) and $p$ is chosen close to one. When the expected shortfall is for the left tail, as defined by (3.26), then we define the estimator as

$$\widehat{ES}_p^- = \frac{1}{\#\{i : X_i \le \hat{Q}_p\}} \sum_{i=1}^{n} X_i I_{(-\infty, \hat{Q}_p]}(X_i), \qquad (3.28)$$

where $p$ is chosen close to zero.
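The right-tail estimator above amounts to averaging the observations at or above the empirical quantile; a minimal sketch:

```python
import math

def expected_shortfall_right(xs, p):
    """Estimate of the right-tail expected shortfall: the average of
    the observations that are at least the p-th empirical quantile."""
    s = sorted(xs)
    q = s[math.ceil(len(s) * p) - 1]   # empirical quantile (3.23)
    tail = [x for x in xs if x >= q]   # observations in the right tail
    return sum(tail) / len(tail)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
es = expected_shortfall_right(xs, 0.8)
```

The left-tail estimator is analogous, averaging the observations at or below the quantile for a small $p$.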

3.2 Univariate Graphical Tools

We consider a sequence $X_1, \ldots, X_n$ of real numbers, and assume that the sequence is a sample from a probability distribution. We want to visualize the sequence in order to discover properties of the underlying distribution. We divide the graphical tools into those that are based on the empirical distribution function and the empirical quantiles, and those that are based on the estimation of the underlying density function. The distribution function and quantile based tools give more insight into the tails of the distribution, and the density based tools give more information about the center of the distribution.

Two-variate data can be visualized using a scatter plot. For univariate data there is no such obvious method available. Thus, visualizing two-variate data may seem easier than visualizing univariate data. However, we can consider many of the tools for visualizing univariate data to be scatter plots of the points

$$(X_i, h(X_i)), \qquad i = 1, \ldots, n, \qquad (3.29)$$

where $h$ is a mapping that attaches a real value to each data point $X_i$. Thus, in a sense we visualize univariate data by transforming it into two-dimensional data.

3.2.1 Empirical Distribution Function Based Tools

The distribution function of the distribution of random variable $X$ is

$$F(x) = P(X \le x), \qquad x \in \mathbf{R}.$$

The empirical distribution function can be considered as a starting point for several visualizations: tail plots, regression plots of tails, and empirical quantile functions. We often use tail plots. Regression plots of tails have two types: (1) plots that look linear for an exponential tail and (2) plots that look linear for a Pareto tail.

3.2.1.1 The Empirical Distribution Function

The empirical distribution function $\hat{F}$, based on data $X_1, \ldots, X_n$, is defined as

$$\hat{F}(x) = \frac{\#\{i : X_i \le x\}}{n}, \qquad (3.30)$$

where $x \in \mathbf{R}$, and $\#A$ means the cardinality of set $A$. Note that the empirical distribution function is defined in (8.20) using the indicator function. An empirical distribution function is a piecewise constant function. Plotting a graph of an empirical distribution function is for large samples practically the same as plotting the points

$$\left( X_{(i)}, \frac{i}{n} \right), \qquad i = 1, \ldots, n, \qquad (3.31)$$

where $X_{(1)} \le \cdots \le X_{(n)}$ are the ordered observations. Thus, the empirical distribution function fits the scheme of transforming univariate data to two-dimensional data as in (3.29).
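The empirical distribution function in (3.30) can be sketched directly as a counting function:

```python
def empirical_distribution_function(xs, x):
    """Empirical distribution function (3.30): the fraction of
    observations less than or equal to x."""
    return sum(1 for xi in xs if xi <= x) / len(xs)

xs = [1.0, 2.0, 2.0, 3.0]
v = empirical_distribution_function(xs, 2.0)
```

Evaluating this function at the ordered observations yields exactly the plotted points (3.31).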

Figure 3.1 shows empirical distribution functions of S&P 500 net returns (red) and 10-year bond net returns (blue). The monthly data of S&P 500 and US Treasury 10-year bond returns is described in Section 2.4.3. Panel (a) plots the points (3.31) and panel (b) zooms in on the lower left corner, showing the empirical distribution function for the smallest observations. Neither of the estimated return distributions dominates the other: the S&P 500 distribution function is higher at the left tail but lower at the right tail. That is, S&P 500 is more risky than the 10-year bond. Note that Section 9.2.3 discusses stochastic dominance: a first return distribution dominates stochastically a second return distribution when the first distribution function takes smaller values everywhere than the second distribution function.

Graphical illustration of Empirical distribution functions.

Figure 3.1 Empirical distribution functions. (a) Empirical distribution functions of S&P 500 returns (red) and 10-year bond returns (blue); (b) zooming in on the lower left corner.

3.2.1.2 The Tail Plots

The left and right tail plots can be used to visualize the heaviness of the tails of the underlying distribution. A smooth tail plot can be used to visualize simultaneously a large number of samples. The tail plots are almost the same as the empirical distribution function, but there are a couple of differences:

  1. In tail plots we divide the data into the left tail and the right tail, and we visualize the two tails separately.
  2. In tail plots the $y$-axis shows the number of observations, and a logarithmic scale is used for the $y$-axis.

Tail plots have been applied in Mandelbrot (1963), Bouchaud and Potters (2003), and Sornette (2003).

The Left and the Right Tail Plots

The observations in the left tail are

$$\{X_i : X_i \le \hat{Q}_p\},$$

where $\hat{Q}_p$ is the $p$th empirical quantile for some $0 < p < 1$. For the left tail plot we choose the level

$$l_i = \#\{j : X_j \le X_i\}. \qquad (3.32)$$

Thus, the smallest observation has level one, the second smallest observation has level two, and so on. Note that $l_i$ is often called the rank of $X_i$. The left tail plot is the two-dimensional scatter plot of the points $(X_i, l_i)$, for $X_i \le \hat{Q}_p$, when the logarithmic scale is used for the $y$-axis.

The observations in the right tail are

$$\{X_i : X_i \ge \hat{Q}_p\},$$

where $\hat{Q}_p$ is the $p$th empirical quantile for some $0 < p < 1$. We choose the level of $X_i$ as the number of observations larger or equal to $X_i$:

$$l_i = \#\{j : X_j \ge X_i\}. \qquad (3.33)$$

Thus, the largest observation has level one, the second largest observation has level two, and so on. The right tail plot is the two-dimensional scatter plot of the points $(X_i, l_i)$, for $X_i \ge \hat{Q}_p$, when the logarithmic scale is used for the $y$-axis.
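The construction of the left tail plot points, with levels as in (3.32), can be sketched as follows (the function name is ours; ties are handled naively by position in the ordered sample):

```python
import math

def left_tail_plot_points(xs, p):
    """Points (X_i, l_i) of a left tail plot: the level l_i is the rank
    of X_i, i.e. the number of observations <= X_i. Only observations
    up to the p-th empirical quantile are kept; the plot itself would
    use a logarithmic scale for the level axis."""
    s = sorted(xs)
    threshold = s[math.ceil(len(s) * p) - 1]  # p-th empirical quantile
    return [(x, i + 1) for i, x in enumerate(s) if x <= threshold]

pts = left_tail_plot_points([-3.0, -1.0, 0.5, 1.0, 2.0, -2.0], 0.5)
```

The right tail plot is constructed symmetrically, ranking from the largest observation downward.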

The left tail plot can be considered as an estimator of the function

$$x \mapsto n F(x), \qquad (3.34)$$

where $F$ is the underlying distribution function and $x \le Q_p(X)$. Indeed, for the level in (3.32) we have that $l_i = n \hat{F}(X_i)$. The right tail plot can be considered as an estimator of the function

$$x \mapsto n (1 - F(x)), \qquad (3.35)$$

where $x \ge Q_p(X)$. For the level in (3.33) we have that $l_i \approx n (1 - \hat{F}(X_i))$.

Figure 3.2 shows the left and right tail plots for the daily S&P 500 data, described in Section 2.4.1. Panel (a) shows the left tail plot and panel (b) shows the right tail plot. The black circles show the data points. The $y$-axis is logarithmic. The colored curves show the population versions (3.34) and (3.35) for the Gaussian distribution (red) and for the Student distributions with degrees of freedom $\nu = 3, \ldots, 6$ (blue).6 We can see that different choices of the degrees of freedom give the best fit for the left tail and for the right tail.

Graphical illustration of Left and right tail plots.

Figure 3.2 Left and right tail plots. (a) The left tail plot for S&P 500 returns; (b) the right tail plot. The red curve shows the theoretical Gaussian curve and the blue curves show the Student curves for the degrees of freedom ν = 3–6.

A left tail plot and a right tail plot can be combined into one figure, at least when both the left and the right tails are defined by taking the threshold to be the sample median (see Figures 14.24(a) and 14.25(a)).

Smooth Tail Plots

Figure 3.3 shows smooth tail plots for the S&P 500 components data, described in Section 2.4.5. Panel (a) shows left tail plots and panel (b) shows right tail plots. The gray scale image visualizes with one picture all tail plots of the stocks in the S&P 500 components data. The red points show the tail plots of the S&P 500 index, which is also shown in Figure 3.2. Note that the $x$-axes are restricted, so that the extreme observations are not shown, and that the levels are shown on a logarithmic scale on the $y$-axis. We can see that the index has lighter tails than most of the individual stocks.

Graphical illustration of Smooth tail plots.

Figure 3.3 Smooth tail plots. The gray scale images show smooth tail plots of a collection of stocks in the S&P 500 index. The red points show the tail plots of the S&P 500 index. (a) A smooth left tail plot; (b) a smooth right tail plot.

In a smooth tail plot we make an image that simultaneously shows several tail plots. Let us have $N$ stocks and $n$ returns for each stock. We draw a separate left or right tail plot for each stock. Plotting these tail plots in the same figure would cause overlapping, and we would see only a black image. That is why we use smoothing. We divide the $x$-axis into 300 grid points, say. The $y$-axis has $n$ grid points. Thus, we have $300 \times n$ pixels. For each $y$-value we compute the value of a univariate kernel density estimator over the $x$-values. Each kernel estimator is constructed using the $N$ observations on that row. This is done for each of the $n$ rows, so that we evaluate $n$ estimates at 300 points. See Section 3.2.2 about kernel density estimation. We choose the smoothing parameter using the normal reference rule and use the standard Gaussian kernel. The values of the density estimate are raised to the power of 21 before applying the gray scale.

3.2.1.3 Regression Plots of Tails

Regression plots are related to the empirical distribution function, just like tail plots, but now the data is transformed so that it lies on the positive half-line, both in the case of the left tail and in the case of the right tail. We use the term "regression plot" because these plots suggest fitting linear regression curves to the data. We distinguish the plot for which exponential tails look linear and the plot for which Pareto tails look linear.

Plots which Look Linear for an Exponential Tail

Let the original observations be $X_1, \ldots, X_n$. Let $u$ be a threshold. We choose $u$ to be an empirical quantile: $u = \hat{Q}_p = X_{(\lceil np \rceil)}$ for some $0 < p < 1$, where $X_{(1)} \le \cdots \le X_{(n)}$ are the ordered observations. Let

$$\{u - X_i : X_i \le u\}$$

be the left tail and

$$\{X_i - u : X_i \ge u\}$$

be the right tail, transformed so that the observations lie on $[0, \infty)$. For the left tail $u - X_i \ge 0$ for $X_i \le u$, and for the right tail $X_i - u \ge 0$ for $X_i \ge u$. Let us denote by $Y_1, \ldots, Y_N$ either the left tail or the right tail, where $N$ is the number of observations in the tail. Let

$$\hat{G}(y) = \frac{\#\{i : Y_i \le y\}}{N + 1}$$

be the empirical distribution function, based on data $Y_1, \ldots, Y_N$. Note that in the usual definition of the empirical distribution function we divide by $N$, but now we divide by $N + 1$ because we need that $1 - \hat{G}(Y_i) > 0$, in order to take the logarithm of $1 - \hat{G}(Y_i)$. Assume that the data is ordered:

$$Y_{(1)} \le \cdots \le Y_{(N)}.$$

We have that

$$1 - \hat{G}(Y_{(i)}) = \frac{N + 1 - i}{N + 1}, \qquad i = 1, \ldots, N.$$

The regression plot that is linear for exponential tails is a scatter plot of the points7

$$\left( Y_{(i)}, \log\left( 1 - \hat{G}(Y_{(i)}) \right) \right), \qquad i = 1, \ldots, N. \qquad (3.36)$$

Figure 3.4 shows scatter plots of the points in (3.36). We use the S&P 500 daily data, described in Section 2.4.1. Panel (a) plots the data in the left tail and panel (b) plots the data in the right tail, for three choices of the threshold quantile level (black, red, and blue).

Graphical illustration of Regression plots which are linear for exponential tails for S&P 500 daily returns.

Figure 3.4 Regression plots which are linear for exponential tails: S&P 500 daily returns. (a) Left tail for three threshold levels (black, red, and blue); (b) right tail for three threshold levels (black, red, and blue).

The data looks linear for exponential tails and convex for Pareto tails. The exponential distribution function is $G(y) = 1 - e^{-\lambda y}$ for $y \ge 0$, where $\lambda > 0$. The exponential distribution function satisfies

$$\log(1 - G(y)) = -\lambda y.$$

Plotting the curve

$$y \mapsto -\lambda y \qquad (3.37)$$

for $y \ge 0$ and for various values of $\lambda$ shows how well the exponential distributions fit the tail. The Pareto distribution function for the support $[\beta, \infty)$ is $G(y) = 1 - (y/\beta)^{-\alpha}$ for $y \ge \beta$, where $\alpha, \beta > 0$; see (3.74). The Pareto distribution function satisfies

$$\log(1 - G(y)) = -\alpha \log(y/\beta).$$

Plotting the curve

$$y \mapsto -\alpha \log(y/\beta) \qquad (3.38)$$

for $y \ge \beta$ and for various values of $\alpha$ shows how well the Pareto distributions fit the tail.8

Figure 3.5 shows how parametric models are fitted to the left tail, defined by the $p$th empirical quantile for a small value of $p$. We use the S&P 500 daily data, as described in Section 2.4.1. Panel (a) shows fitting of exponential tails: we show functions (3.37) for three values of parameter $\lambda$. Panel (b) shows fitting of Pareto tails: we show functions (3.38) for three values of parameter $\alpha$. The middle values of the parameters are the maximum likelihood estimates, defined in Section 3.4.2.

Graphical illustration of Fitting of parametric families for data that is linear for exponential tails.

Figure 3.5 Fitting of parametric families for data that is linear for exponential tails. The data points are from the left tail of S&P 500 daily returns, defined by the $p$th empirical quantile for a small value of $p$. (a) Fitting of exponential distributions; (b) fitting of Pareto distributions.

Plots which Look Linear for a Pareto Tail

Let

$$Y_i = \frac{X_i}{u},$$

where $u$ is the threshold. For the right tail we assume that $u > 0$ and for the left tail we assume that $u < 0$, so that the transformed observations lie on $[1, \infty)$. Let us denote by $Y_1, \ldots, Y_N$ either the left tail or the right tail. Denote

$$\hat{G}(y) = \frac{\#\{i : Y_i \le y\}}{N + 1}.$$

Assume that the data is ordered: $Y_{(1)} \le \cdots \le Y_{(N)}$. The regression plot that is linear for Pareto tails is a scatter plot of the points

$$\left( \log Y_{(i)}, \log\left( 1 - \hat{G}(Y_{(i)}) \right) \right), \qquad i = 1, \ldots, N. \qquad (3.39)$$

Figure 3.6 shows scatter plots of the points in (3.39). We use the S&P 500 daily data, described in Section 2.4.1. Panel (a) plots the data in the left tail and panel (b) plots the data in the right tail, for three choices of the threshold quantile level (black, red, and blue).

Graphical illustration of Regression plots which are linear for Pareto tails for S&P 500 daily returns.

Figure 3.6 Regression plots which are linear for Pareto tails: S&P 500 daily returns. (a) Left tail for three threshold levels (black, red, and blue); (b) right tail for three threshold levels (black, red, and blue).

The data looks linear for Pareto tails and concave for exponential tails. The exponential distribution function for the support $[1, \infty)$ is $G(y) = 1 - e^{-\lambda(y-1)}$ for $y \ge 1$, where $\lambda > 0$. The exponential distribution function satisfies

$$\log(1 - G(y)) = -\lambda (y - 1).$$

Plotting the curve

$$\log y \mapsto -\lambda (y - 1)$$

for $y \ge 1$ and for various values of $\lambda$ shows how well the exponential distributions fit the tail. The Pareto distribution function for the support $[1, \infty)$ is $G(y) = 1 - y^{-\alpha}$ for $y \ge 1$, where $\alpha > 0$. The Pareto distribution function satisfies

$$\log(1 - G(y)) = -\alpha \log y.$$

Plotting the curve

$$\log y \mapsto -\alpha \log y$$

for $y \ge 1$ and for various values of $\alpha$ shows how well the Pareto distributions fit the tail.

Figure 3.7 shows how parametric models are fitted to the left tail, defined by the $p$th empirical quantile for a small value of $p$. We use the S&P 500 daily data, described in Section 2.4.1. Panel (a) shows fitting of exponential tails: we show the exponential curves for three values of parameter $\lambda$. Panel (b) shows fitting of Pareto tails: we show the Pareto curves for three values of parameter $\alpha$. The middle values of the parameters are the maximum likelihood estimates, defined in Section 3.4.2.

Graphical illustration of Fitting of parametric families for data that is linear for Pareto tails.

Figure 3.7 Fitting of parametric families for data that is linear for Pareto tails. The data points are from the left tail of S&P 500 daily returns, defined by the $p$th empirical quantile for a small value of $p$. (a) Fitting of exponential distributions; (b) fitting of Pareto distributions.

3.2.1.4 The Empirical Quantile Function

The $p$th quantile of the distribution of the random variable $X$ is defined in (3.21) as

$$Q_p(X) = \inf\{x \in \mathbf{R} : F(x) \ge p\},$$

where $0 < p < 1$ and $F$ is the distribution function of $X$. The empirical quantile can be defined as

$$\hat{Q}_p = \inf\{x \in \mathbf{R} : \hat{F}(x) \ge p\},$$

where $\hat{F}$ is the empirical distribution function, as defined in (3.30); see (8.21). Section 8.4.1 contains equivalent definitions of the empirical quantile.

The quantile function is

$$p \mapsto Q_p(X), \qquad 0 < p < 1.$$

For continuous distributions the quantile function is the same as the inverse of the distribution function. The empirical quantile function is

$$p \mapsto \hat{Q}_p, \qquad 0 < p < 1, \qquad (3.40)$$

where $\hat{Q}_p$ is the empirical quantile. A quantile function can be used to compare return distributions. A first return distribution dominates a second return distribution when the first quantile function takes higher values everywhere than the second quantile function. See Section 9.2.3 about stochastic dominance.

Plotting a graph of the empirical quantile function is close to plotting the points

$$\left( \frac{i}{n}, X_{(i)} \right), \qquad i = 1, \ldots, n, \qquad (3.41)$$

where $X_{(1)} \le \cdots \le X_{(n)}$ are the ordered observations.

Figure 3.8 shows empirical quantile functions of S&P 500 returns (red) and 10-year bond returns (blue). The monthly data of S&P 500 and US Treasury 10-year bond returns is described in Section 2.4.3. Panel (a) plots the points (3.41) and panel (b) zooms in on the lower left corner, showing the empirical quantile function for the smallest levels. Neither of the estimated return distributions dominates the other: the S&P 500 returns have a higher median and higher upper quantiles, but they have smaller lower quantiles. That is, S&P 500 is more risky than the 10-year bond.


Figure 3.8 Empirical quantile functions. (a) Empirical quantile functions of S&P 500 returns (red) and 10-year bond returns (blue); (b) zooming to the lower left corner.

3.2.2 Density Estimation Based Tools

We describe both histograms and kernel density estimators.

3.2.2.1 The Histogram

A histogram estimator of the density of c03-math-372, based on identically distributed observations c03-math-373, is defined as

3.42 equation

where c03-math-375 is a partition on c03-math-376 and

equation

is the number of observations in c03-math-377. The partition is a collection of sets c03-math-378 that are (almost surely) disjoint and they cover the space of the observed values c03-math-379.9

Figure 3.9(a) shows a histogram estimate using S&P 500 returns. We use the S&P 500 monthly data, described in Section 2.4.3. The histogram is constructed from the data c03-math-382, c03-math-383, where c03-math-384 are the monthly gross returns. Panel (b) shows a histogram constructed from the historically simulated pay-offs of the call option with the strike price 100. The histogram is constructed from the data c03-math-385, c03-math-386. Panel (a) includes a graph of a kernel density estimate, defined in (3.43). The histogram in panel (b) illustrates that a histogram is convenient for visualizing the density of data that is not from a continuous distribution; for these data the value 0 has probability of about 0.5.
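The histogram estimator (3.42) can be sketched as follows for the common case of an equally spaced interval partition covering the data range; `numpy` is assumed:

```python
import numpy as np

def histogram_density(x, bins=30):
    """Histogram density estimate for an interval partition, cf. (3.42):
    the estimate on bin A_j is #{i : X_i in A_j} / (n * |A_j|)."""
    x = np.asarray(x, dtype=float)
    counts, edges = np.histogram(x, bins=bins)
    heights = counts / (len(x) * np.diff(edges))
    return heights, edges
```

The bin heights integrate to one over the partition, as a density estimate should.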


Figure 3.9 Histogram estimates. (a) A histogram of historically simulated S&P 500 prices. A graph of kernel density estimate is included. (b) A histogram of historically simulated call option pay-offs.

3.2.2.2 The Kernel Density Estimator

The kernel density estimator $\hat f$ of the density function $f$ of a random vector $X \in \mathbf{R}^d$, based on identically distributed data $X_1, \ldots, X_n$, is defined by
$$ \hat f(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right), \qquad (3.43) $$
where $K : \mathbf{R}^d \to \mathbf{R}$ is the kernel function, $\int_{\mathbf{R}^d} K = 1$, and $h > 0$ is the smoothing parameter.10

We can also take the vector smoothing parameter c03-math-401 and c03-math-402. The smoothing parameter of the kernel density estimator can be chosen using the normal reference rule:

$$ h_i = \left(\frac{4}{d+2}\right)^{1/(d+4)} n^{-1/(d+4)}\, \hat\sigma_i, \qquad (3.44) $$

for c03-math-404, where c03-math-405 is the sample standard deviation for the c03-math-406th variable; see Silverman (1986, p. 45). Alternatively, the sample variances of the marginal distributions can be normalized to one, so that c03-math-407.
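A univariate sketch with the Gaussian kernel, using the normal reference rule as the default bandwidth (the constant $(4/3)^{1/5}$ is the $d = 1$ case of (3.44)):

```python
import numpy as np

def kde(x, grid, h=None):
    """Gaussian-kernel density estimate evaluated on a grid.
    Default bandwidth: the normal reference rule for d = 1,
    h = (4/3)^(1/5) * s * n^(-1/5)."""
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = len(x)
    if h is None:
        h = (4.0 / 3.0) ** 0.2 * x.std(ddof=1) * n ** (-0.2)
    u = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))
```

The estimate is itself a density: it is nonnegative and integrates to one.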

Figure 3.10(a) shows kernel estimates of the distribution of S&P 500 monthly net returns (blue) and of the distribution of US 10-year bond monthly net returns (red). The data set of monthly returns of S&P 500 and US 10-year bond is described in Section 2.4.3. Panel (b) shows kernel density estimates of S&P 500 net returns with periods of 1–5 trading days (colors black–green). We use S&P 500 daily data of Section 2.4.1 to construct returns for the different horizons.


Figure 3.10 Kernel density estimates of distributions of asset returns. (a) Estimates of the distribution of S&P 500 monthly returns (blue) and of US 10-year bond monthly returns (red); (b) estimates of S&P 500 net returns with periods of 1–5 trading days (colors black–green).

3.3 Univariate Parametric Models

We describe normal and log-normal distributions, Student distributions, infinitely divisible distributions, Pareto distributions, and models that interpolate between exponential and polynomial tails. We also consider the estimation of the parameters, in particular the estimation of the tail index.

3.3.1 The Normal and Log-normal Models

After defining the normal and log-normal distributions, we discuss how the central limit theorem can be used to justify the use of these distributions for modeling stock prices.

3.3.1.1 The Normal and Log-normal Distributions

A univariate normal distribution can be parameterized with the expectation c03-math-408 and the standard deviation c03-math-409. When c03-math-410 is a random variable with a normal distribution we write

equation

The density of the normal distribution $N(\mu, \sigma^2)$ is
$$ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), $$
where $x \in \mathbf{R}$. The parameters $\mu$ and $\sigma$ can be estimated by the sample mean and sample standard deviation.

When c03-math-415, then it is said that c03-math-416 has a log-normal distribution, and we write

equation

The density function of a log-normal distribution is
$$ f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\!\left(-\frac{(\log x - \mu)^2}{2\sigma^2}\right), \qquad (3.45) $$
where $x > 0$ (and $f(x) = 0$ for $x \le 0$). Thus, log-normally distributed random variables are positive (almost surely). The expectation of a log-normally distributed random variable $X \sim \mathrm{LogN}(\mu, \sigma^2)$ is
$$ EX = \exp\!\left(\mu + \frac{\sigma^2}{2}\right). $$

For c03-math-420, c03-math-421. Given observations c03-math-422 from a log-normal distribution, the parameters c03-math-423 and c03-math-424 can be estimated using the sample mean and sample standard deviation computed from the observations c03-math-425.
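A sketch of this estimation scheme (the function name is illustrative):

```python
import numpy as np

def fit_lognormal(x):
    """Estimate (mu, sigma) of LogN(mu, sigma^2) by the sample mean and
    sample standard deviation of the logarithms log(X_i)."""
    logs = np.log(np.asarray(x, dtype=float))
    return logs.mean(), logs.std(ddof=1)
```

The expectation can then be estimated by plugging the estimates into $\exp(\hat\mu + \hat\sigma^2/2)$.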

Note that a linear combination of log-normal variables is not log-normally distributed. However, a product of log-normally distributed random variables is log-normally distributed, because a sum of normal variables is normally distributed.

3.3.1.2 Modeling Stock Prices

We can justify heuristically the normal distribution for the differences of stock prices using the central limit theorem. The central limit theorem can also be used to justify the log-normal model for the gross returns (which amounts to a normal model for the logarithmic returns). Let us consider time interval c03-math-426 and let c03-math-427 for c03-math-428, so that c03-math-429 is an equally spaced sample of stock prices, where c03-math-430 and c03-math-431. The time interval between the sampled prices is c03-math-432.

  1. Normal model. We may write the price at time c03-math-433, c03-math-434, as
    equation
    If the price increments c03-math-436 are i.i.d. with expectation c03-math-437 and variance c03-math-438, then an application of the central limit theorem gives the approximation11
    equation
    where c03-math-444, and c03-math-445. Equation (3.47) defines the Gaussian model for the asset prices. Under the normal model we have
    3.48 equation
    where c03-math-447 is a random variable that has the standard normal distribution.
  2. Log-normal model. We may write the asset price at time c03-math-448, c03-math-449, as
    equation
    If c03-math-451 are i.i.d. with expectation c03-math-452 and variance c03-math-453, then an application of the central limit theorem gives the approximation12
    equation
    where
    equation
    This is equivalent to saying that c03-math-460 is log-normally distributed with parameters c03-math-461 and c03-math-462:
    equation
    Equation (3.50) defines the log-normal model for the asset prices. Under the log-normal model we have
    3.52 equation
    where c03-math-464 is a random variable that has the standard normal distribution.

Parameter c03-math-465 in (3.51) is called the annualized mean of the logarithmic returns and parameter c03-math-466 is called the annualized volatility. For the daily data c03-math-467 and for the monthly data c03-math-468, when we take c03-math-469.

Figure 3.11 shows estimates of the densities of stock price c03-math-470 using the data of S&P 500 daily prices, described in Section 2.4.1. In panel (a) c03-math-471, which equals 20 trading days, and in panel (b) c03-math-472 years. The normal density is shown with black and the log-normal density is shown with red. We take c03-math-473, and for the purpose of fitting a normal distribution for the price increments we change the price data to c03-math-474. For the normal model the estimate c03-math-475 is the sample mean and c03-math-476 is the sample standard deviation of the daily increments. Then we arrive at the distribution

equation

where c03-math-477. For the log-normal model the estimate c03-math-478 is the sample mean and c03-math-479 is the sample standard deviation of the logarithmic daily returns. Then we arrive at the distribution

equation

The log-normal density is skewed to the right: its right tail is heavier than its left tail. The normal density is symmetric with respect to the mean.


Figure 3.11 Normal and log-normal densities. Shown are a normal density (black) and a log-normal density (red) of the distribution of the stock price c03-math-480, when c03-math-481. In panel (a) c03-math-482, which equals 20 trading days, and in panel (b) c03-math-483 years.

Log-normally distributed random variables take only positive values, but normal random variables can take negative values. Note, however, that the tail of the normal distribution is so thin that the probability of negative values can be very small. Thus, the positivity of log-normal distributions is not a strong argument in favor of their use to model prices.

The Gaussian model for the increments of the stock prices was used by Bachelier (1900). The continuous time limit of the log-normal model is the Black–Scholes model, which is used in option pricing. The log-normal model is applied in (14.49) to derive a price for options. A log-normal distribution allows for greater upside price movements than downside price movements, which is why in the Black–Scholes model a 105 call has more value than a 95 put when the stock is at 100. See Figure 14.4 for an illustration of the asymmetry.

3.3.2 The Student Distributions

The density of the standard Student distribution with degrees of freedom c03-math-484 is given by

for c03-math-486, where the normalization constant is equal to

equation

and the gamma function is defined by c03-math-487 for c03-math-488. When c03-math-489 follows the Student distribution with degrees of freedom c03-math-490, then we write

equation

3.3.2.1 Properties of Student Distributions

Let c03-math-491. If c03-math-492 then c03-math-493 and c03-math-494. If c03-math-495, then

We have that c03-math-497 only when c03-math-498. In fact, a Student density has tails

as c03-math-500.13 Thus, Student densities have Pareto tails, as defined in Section 3.4.

We can consider three-parameter location-scale Student families. When c03-math-503, then c03-math-504 follows a location-scale Student distribution, and we write14

equation

Note that for c03-math-512, c03-math-513 but c03-math-514 is not the variance of c03-math-515. Instead,

equation

due to (3.54).15

When c03-math-522, then the Student density approaches the Gaussian density. Indeed, c03-math-523, as c03-math-524, since c03-math-525, when c03-math-526.

A Student distributed random variable c03-math-527 can be written as

equation

where c03-math-528, and c03-math-529 has c03-math-530-distribution with degrees of freedom c03-math-531. Thus, Student distributions belong to the family of normal variance mixture distributions (scale-mixtures of normal distribution), as defined in Section 4.3.3.

3.3.2.2 Estimation of the Parameters of a Student Distribution

Let us observe c03-math-532 from a Student distribution c03-math-533 with the density function c03-math-534. The maximum likelihood estimates are maximizers of the likelihood over c03-math-535, c03-math-536, and c03-math-537. Equivalently, we can minimize the negative log-likelihood. Assuming the independence of the observations, the negative log-likelihood is equal to

equation

We apply the restricted maximum likelihood estimator that minimizes

3.56 equation

over c03-math-539 and c03-math-540, where c03-math-541 is the sample mean.
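A sketch of the restricted maximum likelihood fit (3.56): $\mu$ is fixed at the sample mean, and $(\nu, s)$ are located by a simple grid search over the negative log-likelihood (a numerical optimizer would refine this; the grid bounds are assumptions of the sketch):

```python
import numpy as np
from math import lgamma, log, pi

def student_negloglik(x, mu, nu, s):
    """Negative log-likelihood of a location-scale Student sample."""
    z = (x - mu) / s
    c = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(nu * pi)
    return -(len(x) * (c - log(s))
             - (nu + 1) / 2 * np.sum(np.log1p(z ** 2 / nu)))

def fit_student(x):
    """Restricted ML: mu = sample mean; grid search over (nu, s)."""
    x = np.asarray(x, dtype=float)
    mu, sd = x.mean(), x.std(ddof=1)
    grid = [(nu, s) for nu in np.arange(2.1, 30.1, 0.25)
            for s in np.linspace(0.3 * sd, 1.5 * sd, 41)]
    nll, nu, s = min((student_negloglik(x, mu, nu, s), nu, s)
                     for nu, s in grid)
    return nu, s, mu
```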

Figure 3.12 studies how the return horizon affects the maximum likelihood estimates for the Student family. We consider the data of daily S&P 500 returns, described in Section 2.4.1. The data is used to consider return horizons up to 40 days. Panel (a) shows the estimates of parameter c03-math-542 as a function of return horizon in trading days. Panel (b) shows the estimates of c03-math-543 as a function of the return horizon. We see that the estimates are larger for the longer return horizons but there is fluctuation in the estimates.


Figure 3.12 Parameter estimates for various return horizons. The maximum likelihood estimates of (a) c03-math-544 and (b) c03-math-545 as a function of the return horizon in trading days.

Figure 3.13 shows the estimates of the degrees of freedom and the scale parameter for each series of daily returns in the S&P 500 components data, described in Section 2.4.5. We get an individual estimate of c03-math-546 and c03-math-547 for each stock. Panel (a) shows a kernel density estimate and a histogram estimate of the distribution of c03-math-548. Panel (b) shows the estimates of the distribution of c03-math-549.16 The maximizers of the kernel estimates (modes) are indicated by the blue lines. Most stocks have c03-math-551, but the estimates vary as c03-math-552.


Figure 3.13 Distribution of estimates c03-math-553 and c03-math-554. (a) A kernel density estimate and a histogram of the distribution of c03-math-555; (b) the estimates of the distribution of c03-math-556. The maximizers of the kernel estimates are indicated by the blue lines.

3.4 Tail Modeling

The normal, log-normal, and Student distributions provide models for the complete return distribution. These models assume that the return distribution is approximately symmetric. We consider an approach where the left tail, the right tail, and the central area are modeled and estimated separately. There are at least two advantages to this approach:

  1. We may better estimate distributions whose left tail is different from the right tail. For example, it is possible that the distribution of losses is different from the distribution of gains.
  2. We may apply different estimation methods for different parts of the distribution. For example, we may apply nonparametric methods for the estimation of the central part of the distribution and parametric methods for the estimation of the tails.

In risk management, we are mainly interested in the estimation of the left tail (the probability of losses). In portfolio selection, we might be interested in the complete distribution.

A semiparametric approach for the estimation of the complete return distribution estimates the left and the right tails of the distribution using a parametric model, while the central region of the distribution is estimated using a kernel estimator or some other nonparametric density estimator. It is a nontrivial problem to divide the support of the distribution well into the left tail area, the right tail area, and the central area.

3.4.1 Modeling and Estimating Excess Distributions

We model the left and the right tails of a return distribution parametrically. The estimation of the parameters can be done using maximum likelihood, or by a regression method, for example.

3.4.1.1 Modeling Excess Distributions

Let c03-math-557 be a parameterized family of density functions whose support is c03-math-558. This family will be used to model the tails of the density c03-math-559 of the returns.

To estimate the right tail, we assume that the density function c03-math-560 of the returns satisfies

for some c03-math-562, where c03-math-563 is the c03-math-564th quantile of the return density: c03-math-565, and the probability c03-math-566 satisfies c03-math-567.17 To estimate the left tail we assume that the density function c03-math-574 of the returns satisfies

for some c03-math-576, where c03-math-577 is the c03-math-578th quantile of the return density: c03-math-579, and c03-math-580.

The assumptions can be expressed using the concept of the excess distribution with threshold c03-math-581. Let c03-math-582 be the distribution function of the returns and let c03-math-583 be the density function of the returns. Let c03-math-584 be the return. Now c03-math-585. The distribution function of the excess distribution with threshold c03-math-586 is

3.59 equation

The density function of the excess distribution with threshold c03-math-588 is

3.60 equation

Thus, the assumption in (3.57) says that

equation

for some c03-math-590. Limit theorems for threshold exceedances are discussed in Section 3.5.2.

Figure 3.14 illustrates the definition of an excess distribution. Panel (a) shows the density function of c03-math-591-distribution with degrees of freedom five. The green, blue, and red vectors indicate the location of quantiles c03-math-592 for c03-math-593, c03-math-594, and c03-math-595. Panel (b) shows the right excess distributions for c03-math-596. The choice of the threshold c03-math-597 affects the goodness-of-fit, and this issue will be addressed in the following sections.


Figure 3.14 Excess distributions. (a) The density function of c03-math-598-distribution with degrees of freedom five. The green, blue, and red vectors indicate the location of quantiles c03-math-599 for c03-math-600, c03-math-601, and c03-math-602. (b) The right excess distributions for c03-math-603.

3.4.1.2 Estimation

Estimation is done by first identifying the data coming from the left tail, and the data coming from the right tail. Second, the data is transformed onto c03-math-604. Third, we can apply any method of fitting parametric models.

Identifying the Data in the Tails

We choose threshold c03-math-605 of the excess distribution to be an estimate of the c03-math-606th quantile. For the estimation of the left tail we need to estimate the c03-math-607th quantile for c03-math-608, and for the estimation of the right tail we need to estimate the c03-math-609th quantile for c03-math-610. The data in the left tail and the right tail are

where c03-math-612 are estimates of a lower and an upper quantile, respectively. We use the empirical quantile to estimate the population quantile. Let c03-math-613 be the sample from the distribution of the returns, and let c03-math-614 be the ordered sample. The empirical quantile is

equation

where c03-math-615 is the integer part of c03-math-616. See Section 3.1.3 and Chapter 8 for more information about quantile estimation. Now the data in the left tail and the right tail can be written as

The Basic Principle of Fitting Tail Models

Assume that we have an estimation procedure for the estimation of the parameter c03-math-618 of the family c03-math-619, c03-math-620. The family consists of densities whose support is c03-math-621, and it is used to model the left or the right part of the density, as written in assumptions (3.58) and (3.57). We need a procedure for the estimation of the parameter c03-math-622 in model (3.58), or the parameter c03-math-623 in model (3.57). We apply the estimation procedure for estimating c03-math-624 using data

equation
Maximum Likelihood in Tail Estimation

We use the method of maximum likelihood for the estimation of the tails under the assumptions (3.57) and (3.58). We write the likelihood function under the assumption of independent and identically distributed observations, but we apply the maximum likelihood estimator for time series data. Thus, the method may be called pseudo maximum likelihood. Time series properties will be taken into account in Chapter 8, where quantile estimation is studied using tail modeling. The likelihood is maximized separately using the data in the left tail and in the right tail.

The family c03-math-625, c03-math-626, models the excess distribution. The maximum likelihood estimator for the parameter of the left tail is

where c03-math-628 for c03-math-629 and c03-math-630 has support c03-math-631. The maximum likelihood estimator for the parameter of the right tail is

where c03-math-633 for c03-math-634.
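The estimation scheme above (identify the tail data, transform it onto $(0, \infty)$, then fit) can be sketched as follows; the order-statistic quantile convention of the text is assumed:

```python
import numpy as np

def tail_excess_data(x, p):
    """Split returns into left- and right-tail excess data on (0, inf).
    u_left and u_right are the empirical p- and (1-p)-quantiles; the
    left excesses are u_left - X_i for X_i < u_left, and the right
    excesses are X_i - u_right for X_i > u_right."""
    x = np.asarray(x, dtype=float)
    xs = np.sort(x)
    n = len(xs)
    u_left = xs[int(np.floor(n * p))]
    u_right = xs[int(np.floor(n * (1 - p)))]
    return u_left - x[x < u_left], x[x > u_right] - u_right
```

Any fitting procedure for densities supported on $(0, \infty)$ can then be applied to the two excess samples separately.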

3.4.2 Parametric Families for Excess Distributions

We describe the following one- and two-parameter families:

  1. One-parameter families: the exponential and Pareto distributions.
  2. Two-parameter families: the gamma, generalized Pareto, and Weibull distributions.

Furthermore, we describe a three-parameter family that contains many one- and two-parameter families as special cases.

The exponential distributions have a heavier tail than the normal distributions. The Pareto distributions have a heavier tail than the exponential distributions, and a tail as heavy as that of the Student distributions. The Pareto densities have polynomial tails, the exponential densities have exponential tails, and the gamma densities have tails whose heaviness lies between those of the Pareto and the exponential densities.

3.4.2.1 The Exponential Distributions

The exponential densities are defined as
$$ f(y) = \frac{1}{\beta}\, e^{-y/\beta}, \qquad y > 0, \qquad (3.65) $$
where $\beta > 0$ is the scale parameter. The parameter $1/\beta$ is called the rate parameter. The distribution function and the quantile function are
$$ F(y) = 1 - e^{-y/\beta}, \qquad Q(p) = -\beta \log(1 - p), \qquad 0 < p < 1. $$

The expectation and the variance are
$$ EY = \beta, \qquad \operatorname{Var}(Y) = \beta^2, \qquad (3.66) $$
where $Y$ is a random variable following the exponential distribution.

Maximum Likelihood Estimation: Exponential Distribution

When we observe c03-math-640, which are i.i.d. with exponential distribution, then the maximum likelihood estimator is18

Regression Method: Exponential Distribution

Regression plots were shown in Figures 3.4 and 3.5. We study further the regression method for fitting an exponential distribution.

For exponential distributions the logarithm of the survival function c03-math-642 is a linear function, which can be used to visualize data and to estimate the parameter of the exponential distribution (see Section 3.2.1). Let c03-math-643 be a sample from an exponential distribution and assume c03-math-644. Let c03-math-645 be the empirical distribution function, based on the observations c03-math-646, defined as c03-math-647. The empirical distribution function is defined in (3.30), but we modify the definition so that the divisor is c03-math-648 instead of c03-math-649. We use the facts that (for the ordered data)

equation

Thus,

equation

The least squares estimator of c03-math-650 is19

Now we can write

equation

where

Thus, more weight is given to the observations in the extreme tails.20
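Both estimators for exponential excess data can be sketched as follows: the ML estimate (3.67) is the sample mean, and the regression estimate fits a line to the points $(Y_{(i)}, -\log(1 - F_n(Y_{(i)})))$ with the divisor $n + 1$ in the empirical distribution function. Whether (3.68) uses exactly this through-the-origin least squares form is an assumption of the sketch:

```python
import numpy as np

def fit_exponential_ml(y):
    """ML estimate of the scale beta: the sample mean, cf. (3.67)."""
    return float(np.mean(y))

def fit_exponential_regression(y):
    """Regression estimate of beta: least squares through the origin of
    z_i = -log(1 - i/(n+1)) on the ordered data Y_(i); the slope
    estimates 1/beta."""
    ys = np.sort(np.asarray(y, dtype=float))
    n = len(ys)
    z = -np.log1p(-np.arange(1, n + 1) / (n + 1.0))
    slope = float(np.sum(ys * z) / np.sum(ys ** 2))
    return 1.0 / slope
```

The weights $z_i$ grow toward the upper end of the ordered sample, so the regression estimate emphasizes the extreme tail observations.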

Figure 3.15 shows the fitting of regression estimates for the S&P 500 daily returns, described in Section 2.4.1. Panel (a) considers the left tail and panel (b) the right tail. The tails are defined by the c03-math-663th empirical quantiles for c03-math-664/c03-math-665 (blue), c03-math-666/c03-math-667 (green), and c03-math-668/c03-math-669 (red). We also show the fitted linear regression lines.


Figure 3.15 Exponential model for S&P 500 daily returns: Regression fits. Panel (a) considers the left tail and panel (b) the right tail. We show the regression data and the fitted regression lines for c03-math-670/c03-math-671 (blue), c03-math-672/c03-math-673 (green), and c03-math-674/c03-math-675 (red).

3.4.2.2 The Pareto Distributions

We first define the class of Pareto distributions with the support c03-math-676, where c03-math-677. The class of Pareto distributions with support c03-math-678 is obtained by translation.

The Pareto distributions are parameterized by the tail index $\alpha$. Parameter $\beta$ is taken to be known, but in the practice of tail estimation $\beta$ is used to define the tail area and is chosen by a quantile estimator. The density function is
$$ f(y) = \frac{\alpha \beta^{\alpha}}{y^{\alpha + 1}}, \qquad y > \beta, \qquad (3.72) $$
where $\alpha > 0$ is the tail index. The distribution function and the quantile function are
$$ F(y) = 1 - \left(\frac{\beta}{y}\right)^{\alpha}, \qquad Q(p) = \beta\, (1 - p)^{-1/\alpha}. \qquad (3.73) $$

Pareto Distributions as Excess Distributions

Assumption (3.57) says that the excess distribution is modeled with a parametric distribution whose support is c03-math-686. The density function of a Pareto distribution can be moved by the translation c03-math-687 to have the support c03-math-688, which gives the density function21

Now we could consider c03-math-690 as the scaling parameter, which leads to the two-parameter Pareto distributions, called the generalized Pareto distributions and defined in (3.82) and (3.84).

Maximum Likelihood Estimation: Pareto Distribution

When c03-math-691 follows the Pareto distribution with parameters c03-math-692 and c03-math-693, then c03-math-694 follows the exponential distribution with scale parameter c03-math-695. Indeed, c03-math-696 and thus c03-math-697. We observed in (3.67) that scale parameter c03-math-698 of the exponential distribution can be estimated with c03-math-699. Thus, the maximum likelihood estimator of c03-math-700 is

equation

The maximum likelihood estimator of the shape parameter c03-math-701 of the Pareto distribution is22

We are more interested in estimating c03-math-705, since it appears in the quantile function.
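A sketch of the ML estimator via the exponential transformation: with $\log(Y_i/\beta)$ i.i.d. exponential with scale $1/\alpha$, the mean of the logs estimates $1/\alpha$ (this is what is often called the Hill estimator):

```python
import numpy as np

def pareto_tail_mle(y, beta):
    """ML estimates for Pareto data with known left endpoint beta:
    1/alpha is estimated by mean(log(Y_i / beta)) and alpha by its
    inverse."""
    inv_alpha = float(np.mean(np.log(np.asarray(y, dtype=float) / beta)))
    return 1.0 / inv_alpha, inv_alpha
```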

Regression Method: Pareto Distribution

Regression plots were shown in Figures 3.6 and 3.7. We study further the regression method for fitting a Pareto distribution.

Let us consider the estimation of the tail index c03-math-706 and the inverse c03-math-707. The basic idea is that the logarithm of the distribution function c03-math-708 or the logarithm of the survival function c03-math-709 are linear in c03-math-710: From (3.78) we get that c03-math-711, and from (3.79) we get that c03-math-712.

Let c03-math-713 be a sample from a Pareto distribution and assume

equation

Let c03-math-714 be the empirical distribution function, based on the observations c03-math-715, defined as c03-math-716. The empirical distribution function is defined in (3.30), but we modify the definition so that the divisor is c03-math-717 instead of c03-math-718. We use the facts that

equation

Thus,

equation

The least squares estimator of c03-math-719 is

see (3.68) for the least squares formula. The estimator of c03-math-721 can be written as

equation

where c03-math-722 is defined in (3.70). More weight is given to the observations in the extreme tails.

To estimate c03-math-723, instead of c03-math-724, we use

equation

The least squares estimator of c03-math-725 is

Figure 3.16 shows the fitting of regression estimates for the S&P 500 daily returns, described in Section 2.4.1. Panel (a) considers the left tail and panel (b) the right tail. The tails are defined by the c03-math-727th empirical quantiles for c03-math-728/c03-math-729 (blue), c03-math-730/c03-math-731 (green), and c03-math-732/c03-math-733 (red). We also show the fitted linear regression lines. If the tails are Pareto tails, then the points should be on a straight line whose slope is equal to c03-math-734. We can see that the slopes increase when we move to the more extreme parts of the tail (c03-math-735 decreases).


Figure 3.16 Pareto model for S&P 500 daily returns: Regression fits. Panel (a) considers the left tail and panel (b) the right tail. We show the regression data and the fitted regression lines for c03-math-736/c03-math-737 (blue), c03-math-738/c03-math-739 (green), and c03-math-740/c03-math-741 (red).
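A sketch of the regression estimator of $1/\alpha$ described above: since $\log(1 - F(y)) = -\alpha \log(y/\beta)$ for a Pareto distribution, we regress $w_i = \log(Y_{(i)}/\beta)$ on $z_i = -\log(1 - i/(n+1))$ through the origin (the through-the-origin form is an assumption of the sketch):

```python
import numpy as np

def pareto_regression_fit(y, beta):
    """Regression estimate of 1/alpha for Pareto data with left
    endpoint beta: least squares through the origin of
    w_i = log(Y_(i)/beta) on z_i = -log(1 - i/(n+1))."""
    ys = np.sort(np.asarray(y, dtype=float))
    n = len(ys)
    z = -np.log1p(-np.arange(1, n + 1) / (n + 1.0))
    w = np.log(ys / beta)
    inv_alpha = float(np.sum(w * z) / np.sum(z ** 2))
    return inv_alpha, 1.0 / inv_alpha
```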

Pareto Tails

The Student distributions have Pareto tails, as written in (3.55). The Lévy distributions with c03-math-742 have Pareto tails, as written in (3.94).

A distribution of random variable c03-math-743 with distribution function c03-math-744 is said to have a Pareto right tail when

for c03-math-746, for some c03-math-747, where c03-math-748 is a slowly varying function at c03-math-749:

equation

for all c03-math-750.23 A distribution is said to have a Pareto left tail when

for c03-math-755, for some c03-math-756, where c03-math-757 is a slowly varying function.

For example, if density function c03-math-758 satisfies

equation

for c03-math-759, where c03-math-760, c03-math-761, and c03-math-762, then the distribution has a Pareto right tail. If

equation

for c03-math-763, where c03-math-764, c03-math-765, and c03-math-766, then the distribution has a Pareto left tail.

3.4.2.3 The Gamma Distributions

For the gamma distributions the density functions have a closed form expression but the distribution functions and the maximum likelihood estimator cannot be written in a closed form.

The gamma densities are defined as
$$ f(y) = C\, y^{k-1} e^{-y/\beta}, \qquad y > 0, \qquad (3.80) $$
where $k > 0$, $\beta > 0$, and the normalization constant is
$$ C = \frac{1}{\Gamma(k)\, \beta^{k}}, $$
where $\Gamma$ is the gamma function. The distribution function is
$$ F(y) = \frac{\gamma(k, y/\beta)}{\Gamma(k)}, $$
where the lower incomplete gamma function is defined as
$$ \gamma(k, x) = \int_0^x t^{k-1} e^{-t}\, dt $$
for $k > 0$ and $x \ge 0$.

When c03-math-773, then we obtain the family of exponential distributions. When c03-math-774, then the gamma densities have a tail that is heavier than the exponential densities but lighter than the Pareto densities. When c03-math-775, then the gamma densities have a tail that is lighter than the exponential densities.

Assuming independent and identically distributed observations c03-math-776 the logarithmic likelihood is

3.81 equation

The maximum likelihood estimator of parameter c03-math-778, given c03-math-779, is

equation

The maximum likelihood estimator of c03-math-780 is the maximizer of c03-math-781 over c03-math-782. The maximum likelihood estimator of c03-math-783 is c03-math-784.
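A sketch of this profile maximization, assuming the shape-scale parameterization $f(y) = C\, y^{k-1} e^{-y/\beta}$, for which the ML scale given the shape $k$ is $\hat\beta(k) = \bar Y / k$:

```python
import numpy as np
from math import lgamma, log

def fit_gamma_profile(y, ks=None):
    """Profile ML for the gamma family: beta_hat(k) = mean(y)/k, and the
    shape k maximizes the profile log-likelihood (grid-search sketch)."""
    y = np.asarray(y, dtype=float)
    n, ybar, slog = len(y), y.mean(), float(np.sum(np.log(y)))
    ks = ks if ks is not None else np.arange(0.05, 10.0, 0.01)

    def profile(k):
        beta = ybar / k
        # log-likelihood with sum(y)/beta = n*k by the choice of beta
        return (k - 1) * slog - n * k - n * (lgamma(k) + k * log(beta))

    k_hat = float(max(ks, key=profile))
    return k_hat, ybar / k_hat
```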

3.4.2.4 The Generalized Pareto Distributions

The one-parameter Pareto distributions were defined in (3.73) and (3.72). We define the two-parameter generalized Pareto distributions, which contain the exponential distributions as a limiting case.

The density functions, distribution functions, and quantile functions have a closed form expression but the maximum likelihood estimator does not have a closed form expression.

The density functions of the generalized Pareto distributions are

where c03-math-786 and c03-math-787. The distribution functions are

The quantile functions are

equation

When c03-math-789, then the distributions are exponential distributions, defined in (3.65).

The generalized Pareto distribution can be defined for the cases c03-math-790. In this case the support is c03-math-791. See (3.101) for the distribution function and (8.65) for the density function. The generalized Pareto distributions are obtained as limit distributions for threshold exceedances (see Section 3.5.2).

For the calculation of the maximum likelihood estimation it is convenient to use the following parameterization. We define the class of generalized Pareto distributions using the tail index c03-math-792 (shape parameter) and the scaling parameter c03-math-793 by defining the density function as

The parameters of the generalized Pareto distribution (3.84) are related to the parameterization in (3.83) by c03-math-795 and c03-math-796. Note that the densities (3.84) can be obtained heuristically from a translation of the one-parameter Pareto distributions, as written in (3.74).

The maximum likelihood estimator cannot be expressed in a closed form but we can reduce the numerical maximization of the two-variate likelihood function to the numerical maximization of a univariate function. For the computation of the maximum likelihood estimator, we use the parameterization of the density as in (3.84).

The logarithmic likelihood function for i.i.d. observations c03-math-797 is

3.85 equation

Setting the partial derivative equal to zero and solving for c03-math-799 gives24

equation

The maximum likelihood estimator c03-math-801 for c03-math-802 is the maximizer of the univariate function c03-math-803 over c03-math-804. The maximum likelihood estimator for c03-math-805 is c03-math-806. The maximum likelihood estimators for c03-math-807 and c03-math-808 are

equation
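A sketch of this profile computation, assuming the translated-Pareto form of (3.84), $f(y) = (\alpha/\beta)(1 + y/\beta)^{-(\alpha+1)}$ for $y > 0$: for fixed $\beta$ the maximizing tail index is $\hat\alpha(\beta) = n / \sum_i \log(1 + Y_i/\beta)$, and $\beta$ is then found by a one-dimensional search:

```python
import numpy as np

def fit_gpd_profile(y, betas=None):
    """Profile ML for the generalized Pareto density
    f(y) = (alpha/beta) * (1 + y/beta)^(-(alpha+1)), y > 0:
    alpha_hat(beta) = n / sum(log(1 + y/beta)) for fixed beta, and beta
    is chosen by a grid search over the profile log-likelihood."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    betas = betas if betas is not None else np.geomspace(
        0.01 * y.mean(), 100.0 * y.mean(), 400)

    def profile(beta):
        s = float(np.sum(np.log1p(y / beta)))
        alpha = n / s
        return n * np.log(alpha / beta) - (alpha + 1) * s

    b_hat = float(max(betas, key=profile))
    a_hat = n / float(np.sum(np.log1p(y / b_hat)))
    return a_hat, b_hat
```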

3.4.2.5 The Weibull Distributions

For the Weibull distributions the density functions, distribution functions, and quantile functions have a closed form expression but the maximum likelihood estimator cannot be written in a closed form.

The Weibull densities are defined as

3.86 $f(x) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}, \qquad x > 0,$

where $k > 0$ is the shape parameter and $\lambda > 0$ is the scale parameter. The distribution function is

$F(x) = 1 - e^{-(x/\lambda)^{k}}, \qquad x > 0.$

The quantile function is

$F^{-1}(p) = \lambda \left(-\log(1-p)\right)^{1/k}, \qquad 0 < p < 1.$

For $k = 1$ we obtain the exponential distribution. The Weibull distributions are also called stretched exponential distributions because $x \mapsto e^{-(x/\lambda)^{k}}$ is a stretched exponential function.

The maximum likelihood estimator cannot be expressed in closed form, but we can reduce the numerical maximization of the bivariate likelihood function to the numerical maximization of a univariate function. The logarithmic likelihood function for i.i.d. observations c03-math-814 is

3.87 equation

Setting the partial derivative equal to zero and solving for c03-math-816 gives25

equation

The maximum likelihood estimator c03-math-818 for c03-math-819 is the maximizer of the univariate function c03-math-820 over c03-math-821. The maximum likelihood estimator for c03-math-822 is c03-math-823.
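The corresponding reduction for the Weibull model can be sketched as follows; this is a minimal illustration under an assumed shape-scale parameterization f(x) = (k/lam)*(x/lam)^(k-1)*exp(-(x/lam)^k), with hypothetical names `fit_weibull` and `k_grid`. For a fixed shape k the scale MLE is lam(k) = (mean(x^k))^(1/k), leaving a univariate search over k.

```python
import math
import random

def fit_weibull(x, k_grid=None):
    """Profile-likelihood fit of the Weibull density
    f(x) = (k/lam) * (x/lam)^(k-1) * exp(-(x/lam)^k), x > 0 (assumed form).
    For fixed shape k the scale MLE is lam(k) = (mean(x^k))^(1/k),
    so only a univariate search over k is needed."""
    n = len(x)
    sum_log = sum(math.log(v) for v in x)
    if k_grid is None:
        k_grid = [0.05 * j for j in range(1, 201)]  # 0.05 .. 10
    best = None
    for k in k_grid:
        lam = (sum(v ** k for v in x) / n) ** (1.0 / k)
        # At the profile optimum sum((x_i/lam)^k) = n, so the log-likelihood is:
        loglik = n * math.log(k) - n * k * math.log(lam) + (k - 1.0) * sum_log - n
        if best is None or loglik > best[0]:
            best = (loglik, k, lam)
    _, k_hat, lam_hat = best
    return k_hat, lam_hat

# Check on simulated data: the Weibull quantile function is
# lam * (-log(1 - p))^(1/k), so inversion of uniforms gives Weibull draws.
random.seed(2)
k0, lam0 = 0.7, 1.5
x = [lam0 * (-math.log(1.0 - random.random())) ** (1.0 / k0) for _ in range(4000)]
k_hat, lam_hat = fit_weibull(x)
```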

3.4.2.6 A Three Parameter Family

A flexible family for the modeling of the right tail is defined in Malevergne and Sornette (2005, p. 57) by density functions

3.88 equation

where c03-math-825 is the starting point of the distribution, c03-math-826, and c03-math-827. When c03-math-828, then c03-math-829. The normalization constant c03-math-830 has the expression

equation

where c03-math-831 is the nonnormalized incomplete Gamma function.

The family contains several sub-families:

  1. The exponential density is obtained when c03-math-832, c03-math-833, c03-math-834, and c03-math-835. The exponential densities are c03-math-836, where c03-math-837. We defined exponential densities in (3.65).
  2. The Pareto density is obtained when c03-math-838 and c03-math-839. The Pareto densities are c03-math-840, where c03-math-841 and c03-math-842. We defined Pareto densities in (3.72).
  3. The gamma density is obtained by choosing c03-math-843 and c03-math-844. The gamma densities are c03-math-845, where c03-math-846 and c03-math-847. The gamma densities were defined in (3.80).
  4. The Weibull density is obtained when c03-math-848, c03-math-849, and c03-math-850:
    equation
    where c03-math-851. The Weibull densities were defined in (3.86).
  5. The incomplete gamma density is obtained when c03-math-852 and c03-math-853:
    equation
    where c03-math-854.

The Pareto density and the stretched exponential density can be interpolated smoothly by the log-Weibull density

equation

where c03-math-855.

3.4.3 Fitting the Models to Return Data

We fit models first to S&P 500 returns, and then to a collection of individual stocks in the S&P 500. Fitting the distributions provides background for the quantile estimation of Chapter 8.

3.4.3.1 S&P 500 Daily Returns: Maximum Likelihood

We fit one-parameter models (exponential and Pareto) and two-parameter models (gamma, generalized Pareto, and Weibull) to the tails of S&P 500 daily returns. The S&P 500 daily data is described in Section 2.4.1.

We study maximum likelihood estimators (3.63) and (3.64). The estimates are constructed using data

equation

for the left and the right tails, respectively. Threshold c03-math-856 is the c03-math-857th empirical quantile, and c03-math-858 is the c03-math-859th empirical quantile, where c03-math-860. The estimators c03-math-861 and c03-math-862 depend on the parameter c03-math-863.

To show the sensitivity of the estimates with respect to the parameter c03-math-864, we plot the values of the estimates as a function of c03-math-865. These plots are related to the Hill plot; this name is used when the parameter c03-math-866 of the Pareto distribution is estimated.

To characterize the goodness of fit we show tail plots, as defined in Section 3.2.1. The tail plots include both the observations and the fitted curves, for several values of c03-math-867.

The one-parameter models indicate that the left tail is heavier than the right tail. However, the two-parameter families seem to give much better fits than the one-parameter families.

The Exponential Model

The maximum likelihood estimator of the parameter of the exponential distribution is given in (3.67). The estimators for the parameters of the left tail and the right tail are obtained from (3.63) and (3.64) as

equation

where c03-math-868 and c03-math-869 are defined in (3.61) and (3.62). The estimates c03-math-870 and c03-math-871 are related to the estimates of the expected shortfall in (3.28) and (3.27).
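The construction of the exceedances and the exponential MLE can be sketched as follows; a minimal illustration with hypothetical names, where the thresholds are empirical quantiles and the exponential MLE of the scale is the mean exceedance (rate = 1/mean).

```python
import random
import statistics

def tail_exceedances(returns, p0):
    """Exceedances over the empirical p0- and (1 - p0)-quantiles:
    u - r for returns r below the left threshold u, and r - v for
    returns r above the right threshold v."""
    r = sorted(returns)
    n = len(r)
    u = r[int(p0 * n)]          # left threshold (empirical p0-quantile)
    v = r[int((1.0 - p0) * n)]  # right threshold
    left = [u - x for x in r if x < u]
    right = [x - v for x in r if x > v]
    return left, right

def exp_mle(exceedances):
    """MLE of the exponential model: scale = mean exceedance, rate = 1/scale."""
    scale = statistics.mean(exceedances)
    return 1.0 / scale, scale

# Illustration on simulated 'returns' (Laplace-like, so by memorylessness
# both tails are exactly exponential with scale 0.01 beyond any threshold).
random.seed(3)
returns = [random.choice((-1, 1)) * random.expovariate(100.0) for _ in range(20000)]
left, right = tail_exceedances(returns, p0=0.05)
rate_left, scale_left = exp_mle(left)
rate_right, scale_right = exp_mle(right)
```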

Figure 3.17 shows estimates of the parameters c03-math-872 and c03-math-873 of the exponential distribution. Panel (a) shows estimates of c03-math-874 and panel (b) shows estimates of c03-math-875, as a function of c03-math-876. Parameter c03-math-877 occurs in the quantile function and is more important in quantile estimation, but for the convenience of the reader we also show the estimates of the rate parameter c03-math-878. The red curves show the maximum likelihood estimates for the left tail, and the blue curves show the maximum likelihood estimates for the right tail. In addition, we show the values of the regression estimates (3.69) and (3.71). The pink curves show the regression estimates for the left tail, and the green curves show the regression estimates for the right tail. We see that the estimates for c03-math-879 are larger for the left tail than for the right tail. This indicates that the left tail is heavier than the right tail. The estimates become smaller when c03-math-880 increases. The regression estimates are larger than the maximum likelihood estimates. For the estimates of c03-math-881 the behavior is the opposite.


Figure 3.17 Exponential model for S&P 500 daily returns: Parameter estimates. Panel (a) shows estimates of c03-math-882 and panel (b) shows estimates of c03-math-883, as a function of c03-math-884. Red and blue: the maximum likelihood estimates; pink and green: the regression estimates; red and pink: the left tail; blue and green: the right tail.

Figure 3.18 shows tail plots, defined in Section 3.2.1. Panel (a) shows the left tail plots and panel (b) shows the right tail plots. The red and green points show the observed data and the black lines show the exponential distribution functions when parameter c03-math-885 is estimated with maximum likelihood. The four black curves show the cases c03-math-886, c03-math-887, c03-math-888, and c03-math-889. The tails are fitted better with small values of c03-math-890.


Figure 3.18 Exponential model for S&P 500 daily returns: Tail plots with maximum likelihood. Panel (a) shows the left tail plots and panel (b) shows the right tail plots. The red and green points show the observed data and the black lines show the exponential fits with c03-math-891, c03-math-892, c03-math-893, and c03-math-894.

The Pareto Model

The maximum likelihood estimator of the parameter of the Pareto distribution is given in (3.75).26 The estimators for the parameters of the left and the right tails are obtained from (3.63) and (3.64) as

where c03-math-898 with c03-math-899 for the left tail, and c03-math-900 with c03-math-901 for the right tail. Now c03-math-902. The maximum likelihood estimators are called Hill's estimators.27
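Hill's estimator has a simple closed form: the reciprocal of the mean logarithmic exceedance over the threshold. A minimal sketch with hypothetical names, taking the threshold as an empirical quantile as in the text:

```python
import math
import random

def hill_estimator(x, p0):
    """Hill's estimator of the Pareto tail index for the right tail:
    alpha_hat = 1 / mean(log(x_i / u)) over the observations x_i that
    exceed the empirical (1 - p0)-quantile u."""
    s = sorted(x)
    n = len(s)
    u = s[int((1.0 - p0) * n)]
    exceed = [v for v in s if v > u]
    return 1.0 / (sum(math.log(v / u) for v in exceed) / len(exceed))

# Check on exact Pareto data: P(X > x) = (x/x0)^(-alpha), x >= x0,
# simulated by inversion: x = x0 * (1 - U)^(-1/alpha).
random.seed(4)
alpha0 = 3.0
x = [1.0 * (1.0 - random.random()) ** (-1.0 / alpha0) for _ in range(20000)]
alpha_hat = hill_estimator(x, p0=0.05)
```

For the left tail the same function can be applied after reflecting the data, as in the text.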

Figure 3.19 shows estimates of the parameters c03-math-908 and c03-math-909 of the Pareto distribution. Panel (a) shows estimates of c03-math-910 and panel (b) shows estimates of c03-math-911, as a function of c03-math-912. The plot in panel (b) is known as the Hill plot. Parameter c03-math-913 occurs in the quantile function and is more important in quantile estimation, but for the convenience of the reader we also show the estimates of parameter c03-math-914. The red curves show the maximum likelihood estimates for the left tail and the blue curves show the maximum likelihood estimates for the right tail. In addition, we show the values of regression estimates of c03-math-915, defined in (3.76), and the values of regression estimates of c03-math-916, defined in (3.77). The pink curves show the regression estimates for the left tail and the green curves show the regression estimates for the right tail. We see that the estimates of c03-math-917 are larger for the left tail than for the right tail, which means that the left tail is estimated to be heavier than the right tail. The estimates of c03-math-918 become larger when c03-math-919 increases. The regression estimates of c03-math-920 are smaller than the maximum likelihood estimates. For the estimates of c03-math-921 the behavior is the opposite.


Figure 3.19 Pareto model for S&P 500 daily returns: Parameter estimates. Panel (a) shows estimates of c03-math-922 and panel (b) shows estimates of c03-math-923 as a function of c03-math-924. Red and blue: the maximum likelihood estimates; pink and green: the regression estimates; red and pink: the left tail; blue and green: the right tail.

Figure 3.20 shows tail plots. Panel (a) shows the left tail plots and panel (b) shows the right tail plots. The red and green points show the observed data and the black curves show the Pareto distribution functions when parameter c03-math-925 is estimated with maximum likelihood. The four black curves show the cases c03-math-926, c03-math-927, c03-math-928, and c03-math-929.


Figure 3.20 Pareto model for S&P 500 daily returns: Tail plots with maximum likelihood. Panel (a) shows the left tail plots and panel (b) shows the right tail plots. The red and green points show the observed data and the black curves show the fits with c03-math-930, c03-math-931, c03-math-932, and c03-math-933.

The Gamma Model

The gamma densities are defined in (3.80). The maximum likelihood estimators for the scale parameter c03-math-934 and for the shape parameter c03-math-935 of a gamma distribution do not have a closed form expression, but the computation can be done by minimizing a univariate function. We get the maximum likelihood estimates for the parameters of the left tail and the right tail by applying the numerical procedure for the observations

for the left and the right tails, respectively, where c03-math-937.
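The reduction to a univariate search for the gamma model can be sketched as follows; a minimal illustration under the shape-scale parameterization f(x) = x^(k-1)*exp(-x/theta)/(Gamma(k)*theta^k), with hypothetical names. For a fixed shape k the scale MLE is theta(k) = mean(x)/k, leaving a univariate search over k.

```python
import math
import random

def fit_gamma(x, k_grid=None):
    """Profile-likelihood fit of the gamma density
    f(x) = x^(k-1) * exp(-x/theta) / (Gamma(k) * theta^k), x > 0 (assumed form).
    For fixed shape k the scale MLE is theta(k) = mean(x)/k, so only a
    univariate search over k is needed."""
    n = len(x)
    mean_x = sum(x) / n
    sum_log = sum(math.log(v) for v in x)
    if k_grid is None:
        k_grid = [0.02 * j for j in range(1, 501)]  # 0.02 .. 10
    best = None
    for k in k_grid:
        theta = mean_x / k
        # Log-likelihood with theta = mean(x)/k plugged in; note that
        # sum(x_i)/theta = n*k at the profile optimum.
        loglik = (-n * k * math.log(theta) - n * math.lgamma(k)
                  + (k - 1.0) * sum_log - n * k)
        if best is None or loglik > best[0]:
            best = (loglik, k, theta)
    _, k_hat, theta_hat = best
    return k_hat, theta_hat

# Check on simulated data: a Gamma(k=3, theta=0.5) variable is the sum of
# three independent exponential variables with scale 0.5 (rate 2).
random.seed(5)
x = [sum(random.expovariate(2.0) for _ in range(3)) for _ in range(4000)]
k_hat, theta_hat = fit_gamma(x)
```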

Figure 3.21(a) shows estimates of c03-math-938 and panel (b) shows estimates of c03-math-939. The red curves show the estimates for the left tail, and the blue curves show the estimates for the right tail. We see that the estimates for c03-math-940 are larger for the left tail than for the right tail. The estimates become smaller when c03-math-941 increases.


Figure 3.21 Gamma model for S&P 500 daily returns: Parameter estimates. Panel (a) shows estimates of c03-math-942 and panel (b) shows estimates of c03-math-943, as a function of c03-math-944. Red: the left tail; blue: the right tail.

Figure 3.22 shows tail plots. Panel (a) shows the left tail plots and panel (b) shows the right tail plots. The red and green points show the observed data and the black curves show the gamma distribution functions when parameters are estimated with maximum likelihood. The four black curves show the cases c03-math-945, c03-math-946, c03-math-947, and c03-math-948.


Figure 3.22 Gamma model for S&P 500 daily returns: Tail plots with maximum likelihood. Panel (a) shows the left tail plots and panel (b) shows the right tail plots. The red and green points show the observed data and the black lines show the fits with c03-math-949, c03-math-950, c03-math-951, and c03-math-952.

The Generalized Pareto Model

The density of a generalized Pareto distribution is given in (3.82). The maximum likelihood estimators for the scale parameter c03-math-953 and for the shape parameter c03-math-954 of a generalized Pareto distribution do not have a closed form expression, but the computation can be done by minimizing a univariate function. We get the maximum likelihood estimates for the parameters of the left tail and the right tail by applying the numerical procedure for the observations in (3.92).

Figure 3.23(a) shows estimates of c03-math-955, and panel (b) shows estimates of c03-math-956. The red curves show the estimates for the left tail, and the blue curves show the estimates for the right tail. The estimates of c03-math-957 become smaller when c03-math-958 increases.


Figure 3.23 Generalized Pareto model for S&P 500 daily returns: Parameter estimates. Panel (a) shows estimates of c03-math-959 and panel (b) shows estimates of c03-math-960, as a function of c03-math-961. Red shows the estimates for the left tail, and blue shows them for the right tail.

Figure 3.24 shows tail plots. Panel (a) shows the left tail plots and panel (b) shows the right tail plots. The red and green points show the observed data and the black curves show the distribution functions when parameters are estimated using maximum likelihood. The four black curves show the cases c03-math-962, c03-math-963, c03-math-964, and c03-math-965. The fitted curves do not change in a monotonic order when c03-math-966 is decreased.


Figure 3.24 Generalized Pareto model for S&P 500 daily returns: Tail plots with maximum likelihood. Panel (a) shows the left tail plots and panel (b) shows the right tail plots. The red and green points show the observed data and the black curves show the fits with c03-math-967, c03-math-968, c03-math-969, and c03-math-970.

The Weibull Model

The Weibull densities are given in (3.86). The maximum likelihood estimators for the scale parameter c03-math-971 and for the shape parameter c03-math-972 of a Weibull distribution do not have a closed form expression, but the computation can be done by minimizing a univariate function. We get the maximum likelihood estimates for the parameters of the left tail and the right tail by applying the numerical procedure for the observations in (3.92).

Figure 3.25(a) shows estimates of c03-math-973, and panel (b) shows estimates of c03-math-974. The red curves show the estimates for the left tail, and the blue curves show the estimates for the right tail. The estimates of c03-math-975 become smaller when c03-math-976 increases.


Figure 3.25 Weibull model for S&P 500 daily returns: Parameter estimates. Panel (a) shows estimates of c03-math-977 and panel (b) shows estimates of c03-math-978, as a function of c03-math-979. Red shows the estimates for the left tail, and blue shows them for the right tail.

Figure 3.26 shows tail plots. Panel (a) shows the left tail plots and panel (b) shows the right tail plots. The red and green points show the observed data and the black curves show the distribution functions when parameters are estimated using maximum likelihood. The four black curves show the cases c03-math-980, c03-math-981, c03-math-982, and c03-math-983.


Figure 3.26 Weibull model for S&P 500 daily returns: Tail plots with maximum likelihood. Panel (a) shows the left tail plots and panel (b) shows the right tail plots. The red and green points show the observed data and the black curves show the fits with c03-math-984, c03-math-985, c03-math-986, and c03-math-987.

3.4.3.2 Tail Index Estimation for S&P 500 Components

We fit the Pareto model to the daily returns of the stocks in the S&P 500 index. The S&P 500 components data is described in Section 2.4.5.

Figure 3.27 shows how c03-math-988 and c03-math-989 are distributed. The estimators are defined in (3.89); these are Hill's estimators for the left and right Pareto indexes. Panel (a) shows the distribution of the estimates of the left tail index and panel (b) shows the distribution of the estimates of the right tail index. We have computed the estimates for each of the 312 stocks in the S&P 500 components data set, and the kernel density estimator is applied to this data set of 312 observations. This is done for c03-math-990. The smoothing parameter is chosen by the normal reference rule, and the standard Gaussian kernel function is used. A smaller c03-math-991 gives a smaller estimate of c03-math-992.


Figure 3.27 Density estimates of the distribution of Hill's estimates. (a) Distribution of the left tail index; (b) distribution of the right tail index. Hill's estimates are calculated for the 312 stocks, and the kernel estimates are calculated from the 312 estimated values of c03-math-993. There is a kernel estimate for each c03-math-994.

Figure 3.28 shows a scatter plot of the points c03-math-995, when the estimates are computed for each stock in the S&P 500 components data. We have used c03-math-996. The number of stocks for which the left tail index is smaller than the right tail index is about the same as the number of stocks for which it is larger.


Figure 3.28 A scatter plot of estimates of c03-math-997. We show a scatter plot of points c03-math-998 for the stocks in the S&P 500 components data. The red line shows the points with c03-math-999.

3.5 Asymptotic Distributions

First we describe central limit theorems, and second we describe limit theorems for the excess distribution. The limit distributions of the central limit theorems can be used to model the complete return distribution of a financial asset, and the limit distributions for the excess distribution can be used to model the tail areas of the return distribution of a financial asset.

3.5.1 The Central Limit Theorems

We applied a central limit theorem for sums in (3.46) and (3.49) to justify the normal and the log-normal model for the stock prices. In a similar way we can apply the central limit theorems to justify alternative models for the stock prices. When the variance of the summands is finite the limit is a normal distribution, but if the variance is not finite, the limit distributions can have heavier tails than the normal distributions.

We first describe a central limit theorem for sums of independent but not necessarily identically distributed random variables. The limit distributions belong to the class of infinitely divisible distributions. Second, we describe central limit theorems for sums of independent and identically distributed random variables. Now the limit distributions belong to the class of stable distributions. The class of stable distributions is a subset of the class of infinitely divisible distributions. The stable distributions include the normal distributions, but they also include heavy-tailed distributions, which can be used to describe phenomena where both very large and very small values are observed, such as stock returns.

Third, we consider sums of dependent random variables. When the dependence is weak, convergence towards a normal distribution still occurs, but the asymptotic variance is affected by the dependence.

We do not apply stable distributions or infinitely divisible distributions to model return distributions, but it is useful to note that heavy-tailed distributions arise already from central limit theorems, and not only from limit distributions for the excess distribution.

3.5.1.1 Sums of Independent Random Variables

The Khintchine theorem states that for a distribution to be a limit distribution of a sum of independent (but not necessarily identically distributed) random variables it is necessary and sufficient that the distribution is infinitely divisible; see Billingsley (2005, pp. 373–374) and Breiman (1993, p. 191).

The infinitely divisible distributions are such that a random variable following an infinitely divisible distribution can be represented as a sum of c03-math-1000 i.i.d. random variables for each natural number c03-math-1001. In other words, a distribution function c03-math-1002 is infinitely divisible if for each c03-math-1003 there is a distribution function c03-math-1004 such that c03-math-1005 is the c03-math-1006-fold convolution c03-math-1007.28 For example, the normal, Poisson, and gamma distributions are infinitely divisible but the uniform distributions are not. See Billingsley (2005, Chapter 5) and Breiman (1993, Section 9.5) about infinitely divisible distributions.

Let c03-math-1017, c03-math-1018, be a triangular array of row-wise independent random variables which satisfy

equation

as c03-math-1019, for every c03-math-1020. Then c03-math-1021 can be normalized to converge to an infinitely divisible distribution.

3.5.1.2 Sums of Independent and Identically Distributed Random Variables

For a distribution to be a limit distribution of a sum of independent and identically distributed random variables it is necessary and sufficient that the distribution is stable.

Stable Distributions

A random variable is said to have a stable distribution, if for every natural number c03-math-1022 and for c03-math-1023 independent and with the same distribution as c03-math-1024, there are constants c03-math-1025 and c03-math-1026 such that

equation

holds in distribution; see Breiman (1993, p. 199). Stable distributions are infinitely divisible distributions, because the distribution function of c03-math-1027 is the c03-math-1028-fold convolution of c03-math-1029, where c03-math-1030 is the distribution function of c03-math-1031. In particular, the sum of two independent and identically distributed stable random variables also has a stable distribution.

Density functions of stable distributions cannot be written in a closed form in general. The characteristic function of a stable distribution is

equation

where

equation

Note that c03-math-1032 is the sign of c03-math-1033, and we can define c03-math-1034. Parameter c03-math-1035 is the exponent of the distribution, which is related to the heaviness of the tails, c03-math-1036 is the location term, c03-math-1037 is the scale factor, and c03-math-1038 is the asymmetry parameter (skewness parameter). When c03-math-1039, the distribution is symmetric; when c03-math-1040, the distribution is skewed to the right; and when c03-math-1041, the distribution is skewed to the left. See Breiman (1993, p. 204).

The analytical form of the density is known for $\alpha = 2$ (Gaussian), $\alpha = 1$, $\beta = 0$ (Cauchy), and $\alpha = 1/2$, $\beta = 1$ (Lévy–Smirnov or Lévy). The density of the Cauchy distribution is given by

$f(x) = \frac{1}{\pi (1 + x^{2})}, \qquad x \in \mathbb{R}.$

The Cauchy distribution is the Student distribution with one degree of freedom. The density of the Lévy–Smirnov distribution is given by

$f(x) = \frac{1}{\sqrt{2\pi}}\, x^{-3/2} e^{-1/(2x)}, \qquad x > 0.$
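The heavy tails of the Cauchy distribution are easy to see in simulation; a minimal sketch, assuming the standard Cauchy (location 0, scale 1), sampled by inversion of its quantile function tan(pi*(p - 1/2)):

```python
import math
import random
import statistics

# Sample the standard Cauchy by inversion: the quantile function is
# F^{-1}(p) = tan(pi * (p - 1/2)).
random.seed(6)
x = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(100000)]

# The median of |X| is tan(pi/4) = 1 for the standard Cauchy.
med_abs = statistics.median(abs(v) for v in x)

# Heavy tails: P(|X| > t) is approximately 2/(pi*t) for large t, so several
# hundred observations beyond 100 are expected in a sample of this size,
# while a Gaussian sample of any realistic size contains none.
n_large = sum(1 for v in x if abs(v) > 100.0)
```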

Symmetric stable distributions are stable distributions with location parameter c03-math-1048 and skewness parameter c03-math-1049. The characteristic function of a symmetric stable distribution is

equation

where c03-math-1050 and c03-math-1051. The density of a symmetric stable distribution can be written as a series expansion

where c03-math-1053 is defined through

equation

Symmetric stable distributions have the power-law behavior of the tails:

Equation (3.94) gives the leading asymptotic term in (3.93). For the distributions with Pareto tails the c03-math-1055th moment does not exist if c03-math-1056. This implies that the variance of a symmetric stable distribution is always infinite, and the mean is infinite when c03-math-1057. The mode is used as the location parameter of the symmetric stable distributions (symmetric stable distributions are unimodal).

Convergence to a Stable Distribution

The central limit theorems were presented in Gnedenko and Kolmogorov (1954), Feller (1957), and Feller (1966). We follow the exposition of Embrechts et al. (1997, Theorem 2.2.15). Assume that c03-math-1058 are independent and identically distributed with the same distribution as c03-math-1059.

  1. Assume that c03-math-1060. Then,
    equation
    where c03-math-1061 and c03-math-1062.
  2. Assume that
    equation
    is slowly varying.29 Let c03-math-1067 be the solution of equation (3.95), where
    equation
    Then,
    equation
    where c03-math-1069. It holds that c03-math-1070 for a slowly varying function c03-math-1071.
  3. Assume that the distribution function c03-math-1072 of c03-math-1073 satisfies
    equation
    as c03-math-1074, where c03-math-1075 is slowly varying, and c03-math-1076, c03-math-1077. Let c03-math-1078 be the solution of (3.95). Then,
    equation
    where c03-math-1079 and c03-math-1080 is a stable distribution with c03-math-1081.

3.5.1.3 Sums of Dependent Random Variables

We apply a limit theorem for dependent random variables in Sections 6.2.2 and 10.1.2.

Let c03-math-1082 be a strictly stationary time series. We define the weak dependence in terms of a condition on the c03-math-1083-mixing coefficients. Let c03-math-1084 denote the sigma algebra generated by random variables c03-math-1085. The c03-math-1086-mixing coefficient is defined as

equation

where c03-math-1087. Now we can state the central limit theorem. Let c03-math-1088 and c03-math-1089 for some constant c03-math-1090. Then,

where

equation

c03-math-1092, and we assume that c03-math-1093. Ibragimov and Linnik (1971, Theorem 18.4.1) gave necessary and sufficient conditions for a central limit theorem under c03-math-1094-mixing conditions. A proof for our statement of the central limit theorem in (3.96) can be found in Peligrad (1986); see also Fan and Yao (2005, Theorem 2.21) and Billingsley (2005, Theorem 27.4).

3.5.2 The Limit Theorems for Maxima

Since we have modeled the excess distribution parametrically, it is of special interest that the limit distribution of the excess distribution is a generalized Pareto distribution; this limit theorem is stated in (3.102). The weak convergence of maxima is related to the convergence of the excess distribution.

3.5.2.1 Weak Convergence of Maxima

Let the real-valued random variables c03-math-1095 be independent and identically distributed, and denote the maximum

equation

Sometimes convergence in distribution holds in the sense that there exist sequences c03-math-1096 and c03-math-1097, where c03-math-1098 and c03-math-1099, so that

for all c03-math-1101, as c03-math-1102, where c03-math-1103 is a distribution function, c03-math-1104, and c03-math-1105. The Fisher–Tippett–Gnedenko theorem states that if the convergence in (3.97) holds, then c03-math-1106 can only be a Fréchet, Weibull, or Gumbel distribution function. See Fisher and Tippett (1928), Gnedenko (1943), and Embrechts et al. (1997, p. 121).

To derive the result for the minimum we use the fact that for

equation

we have c03-math-1107. Let us denote

equation

so that c03-math-1108. Now,

3.5.2.2 Extreme Value Distributions

The Fréchet distribution functions are

3.99 $\Phi_{\alpha}(x) = \exp\left(-x^{-\alpha}\right), \qquad x > 0,$

with $\Phi_{\alpha}(x) = 0$ for $x \le 0$, where $\alpha > 0$. The Weibull distribution functions are

$\Psi_{\alpha}(x) = \exp\left(-(-x)^{\alpha}\right), \qquad x \le 0,$

with $\Psi_{\alpha}(x) = 1$ for $x > 0$, where $\alpha > 0$. The Gumbel distribution function is

$\Lambda(x) = \exp\left(-e^{-x}\right), \qquad x \in \mathbb{R}.$

These distributions are called the extreme value distributions.

Define

equation

Then,

where c03-math-1114 is defined on set c03-math-1115. This is known as the Jenkinson–von Mises representation of the extreme value distributions, or the generalized extreme value distribution; see Embrechts et al. (1997, p. 152). We obtain the parametric class of possible limit distributions

equation

where c03-math-1116 is the shape parameter, c03-math-1117, and c03-math-1118. The support of the distribution is c03-math-1119.

Using (3.98), we obtain the class of limit distribution functions for the minima. The limit distribution functions are

equation

where c03-math-1120, c03-math-1121, c03-math-1122, and c03-math-1123. Distribution function c03-math-1124 is defined on set c03-math-1125.

3.5.2.3 Convergence to an Extreme Value Distribution

If the distribution that generated the observations c03-math-1126 has polynomial tails, then (3.97) holds and the limit distribution of the maximum belongs to the Fréchet class. More precisely, if

equation

for some slowly varying function c03-math-1127, then a normalized maximum converges to a Fréchet distribution c03-math-1128; see Embrechts et al. (1997, p. 131).

Let c03-math-1129 be the endpoint of the distribution of c03-math-1130. If c03-math-1131 and

equation

for some slowly varying function c03-math-1132, then a normalized maximum converges to a Weibull distribution c03-math-1133; see Embrechts et al. (1997, p. 135). The equation

equation

explains the relation between the convergence to a Fréchet distribution and to a Weibull distribution.

If the distribution which generated the observations is exponential, normal, or log-normal, then (3.97) holds and the limit distribution of the maximum is the Gumbel distribution. See Embrechts et al. (1997, p. 145).
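The Gumbel limit can be checked in a small simulation; a minimal sketch, assuming standard exponential observations, for which the normalized maximum M_n - log(n) converges in distribution to the Gumbel law:

```python
import math
import random

# For i.i.d. standard exponential observations, the normalized maximum
# M_n - log(n) converges in distribution to the Gumbel distribution
# Lambda(x) = exp(-exp(-x)).
random.seed(7)
n, reps = 1000, 2000
log_n = math.log(n)
norm_max = [max(random.expovariate(1.0) for _ in range(n)) - log_n
            for _ in range(reps)]

def gumbel_cdf(x):
    return math.exp(-math.exp(-x))

# Compare the empirical distribution of the normalized maxima with the
# Gumbel distribution function at a few points.
diffs = [abs(sum(1 for m in norm_max if m <= x0) / reps - gumbel_cdf(x0))
         for x0 in (-1.0, 0.0, 1.0, 2.0)]
```

The discrepancies stay within Monte Carlo error already for moderate n, because for the exponential distribution the exact distribution of the normalized maximum is $(1 - e^{-x}/n)^n$.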

3.5.2.4 Generalized Pareto Distributions

The distribution function of the generalized Pareto distribution is

3.101 $G_{\xi,\beta}(y) = 1 - \left(1 + \frac{\xi y}{\beta}\right)^{-1/\xi},$

where $\beta > 0$. When $\xi \ge 0$, the support is $y \ge 0$. When $\xi < 0$, the support is $0 \le y \le -\beta/\xi$. When $\xi = 0$, the distributions are exponential distributions, with $G_{0,\beta}(y) = 1 - e^{-y/\beta}$ interpreted as the limit $\xi \to 0$. Note that

equation

where c03-math-1141 is the distribution function of a generalized extreme value distribution, as defined in (3.100). Parameter c03-math-1142 is a shape parameter and parameter c03-math-1143 is a scale parameter. The Pareto distributions were defined in (3.73) and (3.83).

3.5.2.5 Convergence to a Generalized Pareto Distribution

Let $X$ be a random variable and let $F$ be the distribution function of $X$. We define the excess distribution with threshold $u$ as the distribution with the distribution function

$F_u(y) = P(X - u \le y \mid X > u) = \frac{F(u+y) - F(u)}{1 - F(u)}, \qquad y \ge 0.$

We can typically approximate the distribution function c03-math-1148 with the distribution function of a generalized Pareto distribution. This follows from the Gnedenko–Pickands–Balkema–de Haan theorem; see Embrechts et al. (1997, p. 158). Let c03-math-1149. The Gnedenko–Pickands–Balkema–de Haan theorem states that

for some positive function c03-math-1151 if and only if c03-math-1152 belongs to the maximum domain of attraction of c03-math-1153, where c03-math-1154. To say that c03-math-1155 belongs to the maximum domain of attraction of c03-math-1156 means that (3.97) holds for some sequences c03-math-1157 and c03-math-1158.
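For an exact Pareto distribution the excess distribution is exactly a generalized Pareto distribution, which gives a degenerate but instructive check of the theorem; a minimal sketch, assuming the Pareto distribution function F(x) = 1 - (x/x0)^(-alpha) and the standard GPD parameterization G(y) = 1 - (1 + xi*y/beta)^(-1/xi):

```python
# For a Pareto distribution F(x) = 1 - (x/x0)^(-alpha), x >= x0, the excess
# distribution over a threshold u >= x0 is exactly generalized Pareto:
# F_u(y) = 1 - (1 + y/u)^(-alpha), i.e. G_{xi,beta} with xi = 1/alpha
# and beta = u/alpha.

def pareto_cdf(x, alpha, x0=1.0):
    return 1.0 - (x / x0) ** (-alpha)

def excess_cdf(y, u, alpha, x0=1.0):
    Fu = pareto_cdf(u, alpha, x0)
    return (pareto_cdf(u + y, alpha, x0) - Fu) / (1.0 - Fu)

def gpd_cdf(y, xi, beta):
    return 1.0 - (1.0 + xi * y / beta) ** (-1.0 / xi)

# Numerical check of the identity at a few points.
alpha, u = 3.0, 2.0
diffs = [abs(excess_cdf(y, u, alpha) - gpd_cdf(y, 1.0 / alpha, u / alpha))
         for y in (0.1, 0.5, 1.0, 5.0, 20.0)]
```

For other generating distributions the identity holds only in the limit of a high threshold, which is the content of the theorem above.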

The basic idea of deriving the limit distribution of the excess distribution from the limit distribution of the maximum comes from the Poisson approximation. The Poisson approximation states that

equation

and

equation

are equivalent, where c03-math-1159, c03-math-1160 is a sequence of real numbers, and c03-math-1161 is the maximum of i.i.d. random variables; see Embrechts et al. (1997, p. 116).30

When the distribution function of the maximum c03-math-1167 can be approximated by

equation

for some c03-math-1168 and c03-math-1169, then c03-math-1170 can be approximated by the distribution function

equation

defined on set c03-math-1171, where

equation

Figure 3.29 Simulated i.i.d. time series. We have simulated 10,000 observations. (a) Student's c03-math-1172-distribution with degrees of freedom c03-math-1173; (b) Student's c03-math-1174-distribution with degrees of freedom c03-math-1175; (c) Gaussian distribution. The mean of the observations is zero and the standard deviation is equal to the standard deviation of the S&P 500 returns.

3.6 Univariate Stylized Facts

The heaviness of the tails is one of the main univariate stylized facts. Several questions are related to the heaviness of the tails. We list the observations that can be made from the figures of this chapter and give some references to the literature.

  1. How heavy are the tails of S&P 500 returns?

    Figure 2.1(b) shows a time series of S&P 500 daily returns. To highlight the heaviness of the tails we can compare the real time series with the simulated time series in Figure 3.29. Panel (a) shows uncorrelated observations whose distribution is the $t$-distribution with three degrees of freedom, in panel (b) the $t$-distribution has six degrees of freedom, and in panel (c) the distribution of the observations is Gaussian.31

    Figure 3.2 shows tail plots of S&P 500 daily returns: the $t$-distribution with three and four degrees of freedom gives reasonable fits for both the left and the right tail.

    Figure 3.4 shows exponential regression plots of S&P 500 daily returns: The tails seem to be heavier than the exponential tails.

    Figure 3.5 shows exponential regression plots of S&P 500 daily returns, and fits both exponential and Pareto distributions: Pareto fits seem to be better.

    Figure 3.6 shows Pareto regression plots of S&P 500 daily returns: The tails seem to fit reasonably well for the Pareto model.

    Figure 3.7 shows Pareto regression plots of S&P 500 daily returns, and fits both exponential and Pareto distributions: Pareto fits seem to be better.

    Figure 3.13 shows how the estimates of the degrees of freedom $\nu$ and the scale parameter of the Student distribution are distributed over the S&P 500 components: The mode of the estimates of $\nu$ is about 3.5, and the range of values of the estimates is about c03-math-1188.

    Figure 3.27 shows kernel density estimates of the distribution of the estimates of the Pareto left tail index and the Pareto right tail index for the S&P 500 components: The choice of the threshold parameter has a significant influence on the value of the estimate, but the estimates stay in the range c03-math-1190.

  2. How does the heaviness of the tails vary across asset classes (stocks, bonds, indexes)?

    Figure 2.5(b) shows a time series of US 10-year bond monthly returns. The time series can be compared to the time series of S&P 500 daily returns in Figure 2.1(b), or to the simulated time series in Figure 3.29.

    Figure 3.1 shows empirical distribution functions of S&P 500 and US 10-year bond monthly returns: The S&P 500 seems to have heavier tails than the 10-year bond.

    Figure 3.3 shows smooth tail plots of the daily returns of the S&P 500 components and of the S&P 500 index: The individual components seem to have heavier tails than the index.

    Figure 3.8 shows empirical quantile functions of S&P 500 and US 10-year bond monthly returns: The S&P 500 seems to have heavier tails than the 10-year bond.

    Figure 3.10(a) shows kernel density estimates of S&P 500 and US 10-year bond monthly returns: These estimates do not reveal information about the tails, but in the central area the 10-year bond seems to be more concentrated around zero than the S&P 500. Cont (2001) reports that returns of US Treasury bonds are positively skewed, whereas the returns of stock indices are negatively skewed.

    Bouchaud (2002) reports that stock returns have Pareto (power-law) tails $P(X > x) \sim x^{-\alpha}$, where the tail index $\alpha$ is approximately 3, but emerging markets can have $\alpha$ smaller than 2. Cont (2001) notes that the tail index varies between 2 and 5, which excludes the Gaussian law and the stable laws with infinite variance. The standard deviation of daily returns is 3% for stocks, 1% for stock indices, and 0.03% for short-term interest rates; see Bouchaud (2002).

  3. What is the best model for the tails?

    Figures 3.17–3.26 study the fitting of parametric models to the tails of S&P 500 returns. In particular, tail plots are shown for the exponential distribution in Figure 3.18, for the Pareto distribution in Figure 3.20, for the gamma distribution in Figure 3.22, for the generalized Pareto distribution in Figure 3.24, and for the Weibull distribution in Figure 3.26. The two-parameter families give reasonable fits; in particular, the generalized Pareto distribution gives a good fit.

    Malevergne and Sornette (2005) give a review of fitting Pareto distributions, stretched exponentials and log-Weibull distributions.

  4. Are the left tail and the right tail equally heavy?

    The parameter estimates for fitting models to the daily returns of S&P 500 indicate that the left tail is heavier than the right tail (see Figures 3.17, 3.19, 3.21, 3.23, and 3.25).

    Figure 3.28 shows values of estimates of the Pareto tail index for the S&P 500 components, both for the left and the right tail: There seem to be as many stocks with a larger left tail index as there are stocks with a larger right tail index.

    Cont (2001) reports that gains and losses are asymmetric: large drawdowns are observed, but not equally large upward movements.

  5. How is the heaviness of the tails affected by the return horizon?

    Figure 3.12 shows values of the estimates of the parameters of the $t$-distribution (the degrees of freedom $\nu$ and the scale parameter) for various return horizons of S&P 500 returns: the estimates of $\nu$ increase from c03-math-1198 for daily returns to c03-math-1199 for 2-month returns. Also the scale parameter increases with the return horizon.

    Figure 3.10(b) shows kernel density estimates of the S&P 500 return distribution when the return horizon varies between one and five days.

    Cont (2001) observes that the distribution of returns looks more and more like a Gaussian distribution when the time scale is increased.
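The last point, aggregational Gaussianity, can be illustrated with a small simulation (the $t$-distribution with 10 degrees of freedom is our illustrative choice, not a model used in the text): for i.i.d. returns the excess kurtosis shrinks roughly by the aggregation factor, moving the aggregated distribution toward the Gaussian.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(2)
daily = rng.standard_t(df=10, size=200_000)   # heavy-tailed "daily returns"

weekly = daily.reshape(-1, 5).sum(axis=1)     # aggregate to a 5-day horizon

# The excess kurtosis of t(10) is 6 / (10 - 4) = 1; for sums of 5 i.i.d.
# terms it drops to about 1/5, so the aggregated series looks more Gaussian.
print(kurtosis(daily), kurtosis(weekly))
```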

