Chapter 11
On the Estimation of the Distribution of Aggregated Heavy-Tailed Risks: Application to Risk Measures

Marie Kratz

ESSEC Business School, CREAR, Paris, France

AMS 2000 subject classification. 60F05; 62G32; 62G30; 62P05; 62G20; 91B30; 91G70.

11.1 Introduction

A universally accepted lesson of the last financial crisis has been the urgent need to improve risk analysis within financial institutions. Taking extreme risks into account is nowadays recognized as a necessary condition for good risk management in any financial institution and is no longer restricted to reinsurance companies. Minimizing the impact of extreme risks, or even ignoring them because of a small probability of occurrence, has been considered by many professionals and supervisory authorities as an aggravating factor of the last financial crisis. The American Senate and the Basel Committee on Banking Supervision confirm this statement in their report. It has therefore become crucial to include extreme risks and to evaluate them correctly. This is our goal here, when considering a portfolio of heavy-tailed risks, notably when the tail index is larger than 2, that is, when the variance is finite. This is the case when studying not only financial assets but also insurance liabilities. It concerns life insurance as well, because of investment risks and interest rates; omitting them was at the origin of the bankruptcy of several life insurance companies, for instance, Executive Life in the United States, Mannheimer in Germany, or Scottish Widows in the United Kingdom.

11.1.1 Motivation and Objective

When considering financial assets, because of a finite variance, a normal approximation is often chosen in practice for the unknown distribution of the yearly log returns, justified by the use of the central limit theorem (CLT), when assuming independent and identically distributed (i.i.d.) observations. Such a choice of modeling, in particular using light-tailed distributions, has shown itself grossly inadequate during the last financial crisis when dealing with risk measures because it leads to underestimating the risk.

Recently, a study was done by Furrer (2012) on simulated i.i.d. Pareto random variables (r.v.'s) to measure the impact of the choice and the use of the limiting distribution of aggregated risks, in particular for the computation of standard risk measures (value-at-risk or expected shortfall). In this study, the standard general central limit theorem (GCLT) (see, e.g., (Samorodnitsky and Taqqu, 1994)) is recalled, providing a limiting stable distribution or a normal one, depending on the value of the shape parameter of the Pareto r.v.'s. Then, considering Pareto samples of various sizes and for different values of the shape parameter, Furrer compared the distance between the empirical distribution and the theoretical limiting distribution; then computed the empirical value-at-risk (denoted VaR) and expected shortfall, called also tail value-at-risk (denoted ES or TVaR); and compared them with the ones computed from the limiting distribution. It appeared clearly that not only the choice of the limiting distribution but also the rate of convergence matters, hence the way of aggregating the variables. From this study, we also notice that the normal approximation appears really inadequate when considering aggregated risks coming from a moderately heavy-tailed distribution, that is, a Pareto with a shape parameter or tail index larger than 2, but below 4.
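The kind of comparison described above can be illustrated with a short simulation. The following Python sketch is not Furrer's code; it assumes Pareto type I variables with unit scale sampled by inversion, and the shape parameter, number of summands, confidence level, and number of replications are hypothetical values chosen only for illustration.

```python
import random
from statistics import NormalDist

def pareto_sum(n, alpha, rng):
    """Sum of n i.i.d. unit-scale Pareto(alpha) draws, sampled by inversion."""
    return sum((1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n))

def empirical_var(samples, level):
    """Empirical quantile (value-at-risk) of the samples at the given level."""
    ordered = sorted(samples)
    index = min(int(level * len(ordered)), len(ordered) - 1)
    return ordered[index]

def normal_var(n, alpha, level):
    """Quantile of the CLT normal approximation of the sum (needs alpha > 2)."""
    mean = alpha / (alpha - 1.0)
    variance = alpha / ((alpha - 1.0) ** 2 * (alpha - 2.0))
    return NormalDist(n * mean, (n * variance) ** 0.5).inv_cdf(level)

rng = random.Random(0)
alpha, n, level = 2.5, 52, 0.995  # illustrative values only
sums = [pareto_sum(n, alpha, rng) for _ in range(20000)]
var_emp = empirical_var(sums, level)
var_clt = normal_var(n, alpha, level)
print(f"empirical VaR: {var_emp:.1f}, normal-approximation VaR: {var_clt:.1f}")
```

Repeating the run for several values of `alpha` and `n` gives a rough picture of how the gap between the two quantiles depends on the heaviness of the tail and on the level of aggregation.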

A few comments can be added to this study. First, the numerical results obtained in Furrer (2012) confirm what is already known in the literature. In particular, there are two main drawbacks when using the CLT for moderately heavy-tailed distributions (e.g., Pareto with a shape parameter larger than 2). On the one hand, although the CLT may apply to the sample mean because of a finite variance, we also know that it provides a normal approximation with a very slow rate of convergence, which may be improved when removing extremes from the sample (see, e.g., (Hall, 1984)). Hence, even if we are interested only in the sample mean, samples of small or moderate sizes will lead to a bad approximation. To improve the rate of convergence, existence of moments of order larger than 2 is necessary (see, e.g., Section 3.2 in Embrechts et al. (1997) or, for more details, Petrov (1995)). On the other hand, it has also been proved theoretically (see, e.g., (Pictet et al., 1998)) as well as empirically (see, e.g., (Dacorogna et al., 2001), Section 5.4.3) that the CLT approach applied to a heavy-tailed distributed sample does not bring any information on the tail and therefore should not be used to evaluate risk measures. Indeed, a heavy tail may appear clearly on high-frequency data (e.g., daily ones) but no longer be visible when aggregating them into, for example, yearly data (i.e., short samples), although it is known, by Fisher theorem, that the tail index of the underlying distribution remains constant under aggregation. It is a phenomenon on which many authors have insisted, as, for example, in Dacorogna et al. (2001). Figure 11.1 on the S&P 500 returns illustrates this last issue very clearly.


Figure 11.1 (a) c11-math-0001 plot of the S&P 500 daily returns from 1987 to 2007, plotted against the Gaussian one (same scaling) that appears as a straight line. (b) c11-math-0002 plot of the S&P 500 monthly returns from 1987 to 2007, plotted against the Gaussian one (same scaling) that appears as a straight line.

Based on the figures above, the c11-math-0003 plot of the S&P 500 daily returns from 1987 to 2007 helps to detect a heavy tail. When the daily returns are aggregated into monthly returns, the c11-math-0004 plot looks more like a normal one, and the very few observations appearing above the threshold of c11-math-0005, such as the financial crises of 1998 and 1987, could almost be considered as outliers, as it is well known that financial returns are symmetrically distributed.

Now, look at Figure 11.2. When adding data from 2008 to 2013, the c11-math-0006 plot looks much the same, that is, normal, except that another “outlier” appears ... with the date of October 2008! Instead of looking again at daily data for the same years, let us consider a larger sample of monthly data from 1791 to 2013. With a larger sample size, the heavy tail becomes visible again. And now we see that the financial crisis of 2008 does belong to the heavy tail of the distribution and can no longer be considered as an outlier. So we clearly see the importance of the sample size when dealing with moderately heavy tails to estimate the risk. Thus we need a method that does not depend on the sample size, but looks at the shape of the tail.


Figure 11.2 (a) c11-math-0007 plot of the S&P 500 monthly returns from 1987 to 2013, plotted against the Gaussian one (same scaling) that appears as a straight line. (b) c11-math-0008 plot of the S&P 500 monthly returns from 1791 to 2013, plotted against the Gaussian one (same scaling) that appears as a straight line.

The main objective is to obtain the most accurate evaluation of the distribution of aggregated risks and of risk measures when working on financial data in the presence of fat tails. We explore various approaches to handle this problem, theoretically, empirically, and numerically. The application to log returns, which motivated the construction of this method, illustrates the case of time aggregation, but the method is general and concerns any type of aggregation, for example, of assets.

After reviewing briefly the existing methods, from the GCLT to extreme value theory (EVT), we will propose and develop two new methods, both inspired by the work of Zaliapin et al. (2005) in which the sum of c11-math-0009 i.i.d. r.v.'s is rewritten as the sum of the associated order statistics.

The first method, named Normex, answers the question of how many largest order statistics would explain the divergence between the underlying moderately heavy-tailed distribution and the normal approximation, whenever the CLT applies, and combines a normal approximation with the exact distribution of this number (independent of the size of the sample) of largest order statistics. It provides in general the sharpest results among the different methods, whatever the sample size is and for any heaviness of the tail.

The second method is empirical and consists of a weighted normal approximation. Of course, we cannot expect a result as sharp as the one obtained with Normex. However, it provides a simple tool that allows one to remain in the Gaussian realm. We introduce a shift in the mean and a weight in the variance as correcting terms for the Gaussian parameters.

Then we will proceed to an analytical comparison between the exact distribution of the Pareto sum and its approximation given by Normex before turning to the application of evaluating risk measures.

Finally a numerical study will follow, applying the various methods on simulated samples to compare the accuracy of the estimation of extreme quantiles, used as risk measures in solvency calculation.

In the rest of the chapter, with financial/actuarial applications in mind, and without loss of generality, we will use power law models for the marginal distributions of the risks such as the Pareto distribution.

11.1.2 Preliminaries

11.1.2.1 Main notation

c11-math-0010 will denote the integer part of any nonnegative real c11-math-0011 such that c11-math-0012.

Let c11-math-0013 be the probability space on which we will be working.

Let c11-math-0014 and c11-math-0015 denote, respectively, the cumulative distribution function (cdf) and the probability density function (pdf) of the standard normal distribution c11-math-0016 and c11-math-0017 and c11-math-0018 the cdf and pdf of the normal distribution c11-math-0019 with mean c11-math-0020 and variance c11-math-0021.

Let c11-math-0022 be a random variable (r.v.) Pareto (type I) distributed with shape parameter c11-math-0023, pdf denoted by c11-math-0024 and cdf c11-math-0025 defined by

and probability density function (pdf) denoted by c11-math-0027.

Note that the inverse function c11-math-0028 of c11-math-0029 is given by

Recall that for c11-math-0031, c11-math-0032 and for c11-math-0033, c11-math-0034.
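As a minimal sketch of these preliminaries, the snippet below assumes the unit-scale parametrization of the Pareto type I distribution, with survival function x^(-alpha) for x >= 1 (an assumption, since other scale conventions exist), and uses the standard closed forms of its mean and variance. The function names are illustrative.

```python
def pareto_cdf(x, alpha):
    """Cdf of a unit-scale Pareto type I variable (assumed parametrization)."""
    return 1.0 - x ** (-alpha) if x >= 1.0 else 0.0

def pareto_quantile(u, alpha):
    """Inverse of the cdf above, for u in [0, 1)."""
    return (1.0 - u) ** (-1.0 / alpha)

def pareto_mean(alpha):
    """Mean, finite only when alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("the mean is infinite for alpha <= 1")
    return alpha / (alpha - 1.0)

def pareto_variance(alpha):
    """Variance, finite only when alpha > 2."""
    if alpha <= 2.0:
        raise ValueError("the variance is infinite for alpha <= 2")
    return alpha / ((alpha - 1.0) ** 2 * (alpha - 2.0))
```

Feeding uniform random numbers to `pareto_quantile` gives a simple way to simulate Pareto samples by inversion.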

We will consider i.i.d. Pareto r.v.'s in this study and denote by c11-math-0035 the Pareto sum c11-math-0036 c11-math-0037 being an c11-math-0038-sample with parent r.v. c11-math-0039 and associated order statistics c11-math-0040.

When dealing with financial assets (market risk data), we define the returns as

equation

c11-math-0042 being the daily price and c11-math-0043 representing the aggregation factor.

Note that we can also write

equation

In what follows, we will denote c11-math-0045 by c11-math-0046.
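As an illustration of this aggregation, the sketch below assumes that returns are log returns, ln(P_i / P_(i-n)), computed on non-overlapping windows, so that an n-day return equals the sum of the n daily log returns it spans. The price series is made up and the function name is hypothetical.

```python
import math

def log_returns(prices, n=1):
    """n-period log returns ln(P_i / P_(i-n)) on non-overlapping windows."""
    return [math.log(prices[i] / prices[i - n])
            for i in range(n, len(prices), n)]

prices = [100.0, 101.0, 99.5, 102.0, 103.5, 101.0, 104.0]  # made-up prices
daily = log_returns(prices, 1)
three_day = log_returns(prices, 3)

# Each 3-day log return is the sum of the three daily log returns it covers.
rebuilt = [sum(daily[j:j + 3]) for j in range(0, len(daily), 3)]
```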

Further comments or questions

  • Is it still worth considering i.i.d. r.v.'s, when most recent research focuses on dependent ones?

    Concerning the i.i.d. condition, note that this study fills a gap in the literature on the sum of i.i.d. moderately heavy-tailed r.v.'s (see, e.g., (Feller, 1966); (Hahn et al., 1991), and (Petrov, 1995)). Moreover, in our practical example of log returns (the motivation of this work), the independence condition is satisfied (see, e.g., (Taylor, 1986); (Dacorogna et al., 2001)) and hence is not a restriction in this case of time aggregation.

    Another theoretical reason comes from the EVT; indeed we know that the tail index of the aggregated distribution corresponds to the one of the marginal with the heaviest tail and hence does not depend on considering the issue of dependence.

    Finally, a mathematical “brick” was still missing in the study of the behavior of the sum of i.i.d. r.v.'s with a moderately heavy tail, for which the CLT applies (for the center of the distribution!) but converges slowly for the mean behavior and certainly does not provide a satisfactory approximation for the tail. With this work, we aim to fill this gap by looking at an appropriate limit distribution.

  • Why consider the Pareto distribution?

    It is justified by the EVT (see, e.g., (Leadbetter et al., 1983); (Resnick, 1987), and (Embrechts et al., 1997)). Indeed recall the Pickands theorem (see (Pickands, 1975) for the seminal work) proving that for sufficiently high threshold c11-math-0047, the generalized Pareto distribution (GPD) c11-math-0048 (with tail index c11-math-0049 and scale parameter c11-math-0050) is a very good approximation to the excess cdf of a r.v. c11-math-0051 defined by c11-math-0052:

    equation

    if and only if the distribution of c11-math-0054 is in the domain of attraction of one of the three limit laws. When considering risks in the presence of a heavy tail, it implies that the extreme risks follow a GPD with a positive tail index (also called extreme value index) c11-math-0055, which amounts to saying that the risks belong to the Fréchet maximum domain of attraction (see, e.g., (Galambos, 1978); (Leadbetter et al., 1983); (Resnick, 1987), or (Embrechts et al., 1997)). In particular, for c11-math-0056,

    for some constant c11-math-0058. It is then natural and quite general to consider a Pareto distribution (with shape parameter c11-math-0059) for heavy-tailed risks.

    A natural extension would then be considering r.v.'s with other distributions belonging to the Fréchet maximum domain of attraction.

  • A last remark concerns the parameter c11-math-0060 that we consider as given in our study. A prerequisite, when working on real data, would be to estimate c11-math-0061. Recall that there are various ways to test for the presence of a heavy tail and to estimate the tail index, for example, the Hill estimator (see (Hill, 1975)) or the c11-math-0062-estimator (see (Kratz and Resnick, 1996)) (see also (Huston McCulloch, 1997); (Beirlant et al., 2004); (Resnick, 2006), and references therein). We will not provide an inventory of the methods, except for a brief reminder in the next section of an important empirical EVT method used for estimating the heaviness of the tail. Let us also mention a test, easy to use in practice, for the existence of fat tails, namely, the scaling law (see (Dacorogna et al., 2001), Section 5.5). It consists of comparing the two plots, for c11-math-0063 and 2, respectively, of c11-math-0064, c11-math-0065; if the scaling exponent for c11-math-0066 is larger than for c11-math-0067, then it is a sign of the existence of a fat tail. For financial data, there are numerous empirical studies that show the existence of fat tails and that the shape parameter is between 2 and 4 for developed markets (see, e.g., (Jansen and De Vries, 1991); (Longin, 1996); (Müller et al., 1998); (Dacorogna et al., 2001), and references therein).
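To make the estimation step concrete, here is a minimal sketch of the Hill estimator in its usual form: the average of the log excesses of the k largest observations over the (k+1)-th largest one estimates the inverse of the shape parameter. The choice of k is left to the user, and the simulated unit-scale Pareto sample, the seed, and the function name are assumptions made for illustration.

```python
import math
import random

def hill_alpha(sample, k):
    """Hill estimate of the shape parameter from the k largest observations."""
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample)
    threshold = ordered[-(k + 1)]  # the (k+1)-th largest observation
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[-k:]) / k
    return 1.0 / mean_log_excess

rng = random.Random(1)
true_alpha = 3.0  # illustrative value
sample = [(1.0 - rng.random()) ** (-1.0 / true_alpha) for _ in range(5000)]
estimate = hill_alpha(sample, 500)
```

On real data the estimate is usually plotted against k, since it can be sensitive to the number of upper order statistics retained.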

11.2 A Brief Review of Existing Methods

Limit theorems for the sum of i.i.d. r.v.'s are well known. Nevertheless, they can be misused in practice for various reasons such as a too small sample size, as we have seen. As a consequence, it leads to wrong estimations of the risk measures for aggregated data. To help practitioners to be sensitive to this issue, we consider the simple example of aggregated heavy-tailed risks, where the risks are represented by i.i.d. Pareto r.v.'s. We start by reviewing the existing methods, from the GCLT to EVT, before applying them on simulated Pareto samples to show the pros and cons of those methods.

11.2.1 A GCLT Approach

  • For the sake of completeness, let us recall the GCLT (see, e.g., (Samorodnitsky and Taqqu, 1994); (Nolan, 2012)), which states that the properly normalized sum of a large number of i.i.d. r.v.'s belonging to the domain of attraction of an c11-math-0068-stable law may be approximated by a stable distribution with index c11-math-0069 (c11-math-0070).
  • It applies in our specific case where we consider i.i.d. Pareto r.v.'s with shape parameter c11-math-0097 and we can identify the normalizing constants. We have
    11.4 equation

    with

    equation

    Note that the tail distribution of c11-math-0101 satisfies (see (Samorodnitsky and Taqqu, 1994)):

    11.6 equation

11.2.2 An EVT Approach

When focusing on the tail of the distribution, in particular for the estimation of the risk measures, the information on the entire distribution is not necessary, hence the alternative of the EVT approach.

Recall the Fisher–Tippett theorem (see (Fisher and Tippett, 1928)) which states that the limiting distribution for the rescaled sample maximum can only be of three types: Fréchet, Weibull, and Gumbel. The three types of extreme value distribution have been combined into a single three-parameter family ((Jenkinson, 1955); (von Mises, 1936); 1985) known as the generalized extreme value (GEV) distribution given by

equation

with c11-math-0104 (scale parameter), c11-math-0105 (location parameter), and c11-math-0106 (tail index or extreme value index). The tail index c11-math-0107 determines the nature of the tail distribution:

c11-math-0108: Fréchet, c11-math-0109: Gumbel, c11-math-0110: Weibull.

Under the assumption of regular variation of the tail distribution, the tail of the cdf of the sum of i.i.d. r.v.'s is mainly determined by the tail of the cdf of the maximum of these r.v.'s. Indeed, we have the following lemma.

It applies of course to Pareto r.v.'s.
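The statement that the tail of the sum is driven by the tail of the maximum can be checked informally by simulation. The sketch below counts how often the sum and the maximum of n unit-scale Pareto draws exceed a high level x; the shape parameter, n, x, and the number of replications are arbitrary illustrative choices, and the ratio of the two counts is only a rough Monte Carlo indication, not a proof of the asymptotic equivalence.

```python
import random

def tail_counts(n, alpha, x, reps, rng):
    """Count exceedances of level x by the sum and by the maximum of n draws."""
    sum_hits = max_hits = 0
    for _ in range(reps):
        draws = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]
        sum_hits += sum(draws) > x
        max_hits += max(draws) > x
    return sum_hits, max_hits

rng = random.Random(2)
sum_hits, max_hits = tail_counts(n=5, alpha=1.5, x=200.0, reps=100000, rng=rng)
ratio = sum_hits / max_hits if max_hits else float("nan")
print(f"P(sum > x) / P(max > x) is roughly {ratio:.2f}")
```

Since all summands are positive, the sum always dominates the maximum, so the ratio is at least 1; it is expected to approach 1 as x grows.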

Combining (11.8) with the GEV limiting distribution in the case of c11-math-0117-Pareto r.v.'s provides that the tail distribution of the rescaled sum c11-math-0118 of Pareto r.v.'s is asymptotically Fréchet:

where c11-math-0120 is defined as in (2.1).

11.3 New Approaches: Mixed Limit Theorems

An alternative approach to the GCLT one has been proposed by Zaliapin et al. (see (Zaliapin et al., 2005)) when the Pareto shape parameter satisfies c11-math-0121, a case where the variance of the Pareto r.v.'s c11-math-0122 does not exist. The neat idea of the method is to rewrite the sum of the c11-math-0123's as the sum of the order statistics c11-math-0124 and to separate it into two terms, one with the first c11-math-0125 order statistics having finite variance and the other as the complement

equation

They can then treat these two subsums separately. Although their paper is not always rigorous, or is at least quite approximate, as we will see later, their method provides better numerical results than the GCLT does for any number of summands and any quantile. Nevertheless, there are some mathematical issues in this paper. One of them is that the authors consider these two subsums as independent. Another one is that they approximate the quantile of the total (Pareto) sum with the direct summation of the quantiles of each subsum, although quantiles are not additive. For the case c11-math-0127, they reduce the behavior of the sum arbitrarily to the last two upper order statistics.

Another drawback of this method would be, when considering the case c11-math-0128, to remain with one sum of all terms with a finite variance, hence in general with a poor or slow normal approximation.

We are mainly interested in the case of a shape parameter larger than 2, since it is the missing part in the literature and of practical relevance when studying market risk data, for instance. For such a case, the CLT applies because of the finiteness of the second moment, but using it to obtain information on anything other than the average is simply wrong in the presence of fat tails, even if in some situations (e.g., when working on aggregated data or on short samples), the plot of the empirical distribution fits a normal one. The CLT only concentrates on the mean behavior; it is equivalent to the CLT on the trimmed sum (i.e., c11-math-0129 minus a given number of the largest order statistics (or tail)) (see (Mori, 1984)), for which the rate of convergence improves (see, e.g., (Hahn et al., 1991); (Hall, 1984)).

Inspired by Zaliapin et al.'s paper, we go further in the direction of separating mean and extreme behaviors in order to improve approximations, for any c11-math-0130, and we build two alternative methods, called Normex and the weighted normal limit, respectively. It means to answer rigorously the question of how many largest order statistics c11-math-0131, c11-math-0132 would explain the divergence between the underlying distribution and the normal approximation when considering a Pareto sum with c11-math-0133 or the stable approximation when considering a Pareto sum with c11-math-0134.

Both methods rely initially on Zaliapin et al.'s approach of splitting the Pareto sum into a trimmed sum to which the CLT applies and another sum with the remaining largest order statistics. The main idea of the two methods is to determine in an “optimal way” (that is, so as to improve the approximation of the distribution as much as possible), which we are going to explain, the number c11-math-0135 that corresponds to a threshold when splitting the sum of order statistics into two subsums, with the second one constituted by the c11-math-0136 largest order statistics. We will develop these methods under realistic assumptions, dropping in particular Zaliapin et al.'s assumption of independence between the two subsums. Our two methods differ from each other in two points:

  • The way of selecting this number c11-math-0137.
  • The way of evaluating the sum determined by the c11-math-0138 largest order statistics, which is of course related to the choice of c11-math-0139.

Our study is developed on the Pareto example, but its goal is to propose a method that may be applied to any heavy-tailed distribution (with positive tail index) and to real data, hence this choice of looking for limit theorems in order to approximate the true (and most of the time unknown) distribution.

11.3.1 A Common First Step

11.3.1.1 How to fit for the best mean behavior of aggregated heavy-tailed distributed risks?

Let us start by studying the behavior of the trimmed sum c11-math-0140 when writing down the sum c11-math-0141 of the i.i.d. c11-math-0142-Pareto r.v.'s (with c11-math-0143), c11-math-0144, as

Much literature, since the 1980s, has been concerned with the behavior of trimmed sums by removing extremes from the sample; see, for example, Hall (1984), Mori (1984), and Hahn et al. (1991).
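The splitting of a sum into a trimmed part and the part made of the largest order statistics can be sketched in a few lines. The function name and the toy sample are illustrative; the split simply sorts the sample and separates the k largest values from the rest.

```python
def split_sum(sample, k):
    """Return (trimmed sum, sum of the k largest order statistics)."""
    if not 0 <= k <= len(sample):
        raise ValueError("k must be between 0 and the sample size")
    ordered = sorted(sample)
    cut = len(ordered) - k
    return sum(ordered[:cut]), sum(ordered[cut:])

trimmed, extremes = split_sum([3.0, 1.0, 10.0, 2.0, 50.0], k=2)
# The two parts always add up to the full sum of the sample.
total = trimmed + extremes
```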

The main issue is the choice of the threshold c11-math-0146, in order to use the CLT but also to improve its fit since we want to approximate the behavior of c11-math-0147 by a normal one.

We know that a necessary and sufficient condition for the CLT to apply on c11-math-0148 is to require the summands c11-math-0149, c11-math-0150, to be c11-math-0151-r.v.'s. But we also know that requiring only the finiteness of the second moment may lead to a poor normal approximation, if higher moments do not exist, as occurs, for instance, with financial market data. In particular, adding the finiteness of the third moment provides a better rate of convergence to the normal distribution in the CLT (Berry–Esséen inequality). Another piece of information that might be quite useful to improve the approximation of the distribution of c11-math-0152 with its limit distribution is the Fisher index, defined by the ratio c11-math-0153, which is a kurtosis index. The skewness c11-math-0154 of c11-math-0155 and c11-math-0156 measure the closeness of the cdf c11-math-0157 to c11-math-0158. Hence we will choose c11-math-0159 based on the condition of existence of the fourth moment of the summands of c11-math-0160 (i.e., the first c11-math-0161 order statistics).

The following Edgeworth expansion involving the Hermite polynomials c11-math-0162 points out that requiring the finiteness of the fourth moments appears as what we call the “optimal” solution (of course, the more higher-order moments exist, the finer the normal approximation becomes, but this would impose conditions that are too strong and difficult to handle). If c11-math-0163 denotes the cdf of the standardized c11-math-0164 defined by c11-math-0165, then

uniformly in c11-math-0167, with

equation

The rate of convergence appears clearly as c11-math-0169 whenever c11-math-0170, c11-math-0171.

Note that in our Pareto case, the skewness and the excess kurtosis are, respectively,

equation
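For reference, the skewness and excess kurtosis of a unit-scale Pareto variable can be computed from its raw moments, alpha / (alpha - j) for the moment of order j, which exist only for alpha > j. The sketch below does so and compares the result with the usual closed forms, which require alpha > 3 and alpha > 4, respectively. The unit-scale parametrization and the function names are assumptions of this illustration.

```python
import math

def raw_moment(alpha, j):
    """Raw moment of order j of a unit-scale Pareto(alpha), for alpha > j."""
    if alpha <= j:
        raise ValueError("the moment of order j exists only for alpha > j")
    return alpha / (alpha - j)

def skewness_and_excess_kurtosis(alpha):
    """Skewness and excess kurtosis from central moments (needs alpha > 4)."""
    m1, m2, m3, m4 = (raw_moment(alpha, j) for j in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2 - 3.0

def closed_forms(alpha):
    """Standard closed-form skewness and excess kurtosis (needs alpha > 4)."""
    skew = 2.0 * (1.0 + alpha) / (alpha - 3.0) * math.sqrt((alpha - 2.0) / alpha)
    kurt = (6.0 * (alpha ** 3 + alpha ** 2 - 6.0 * alpha - 2.0)
            / (alpha * (alpha - 3.0) * (alpha - 4.0)))
    return skew, kurt

from_moments = skewness_and_excess_kurtosis(5.0)
from_formulas = closed_forms(5.0)
```

Both quantities blow up as alpha decreases toward 3 and 4, respectively, which is one way to see why the normal approximation degrades for moderately heavy tails.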

Therefore we set c11-math-0173 (but prefer to keep the notation c11-math-0174 so that it remains general) to obtain what we call an “optimal” approximation. Then we select the threshold c11-math-0175 such that

equation

which when applied to our case of c11-math-0177-Pareto i.i.d. r.v.'s (using (11.17)) gives

This condition then allows us to determine a fixed number c11-math-0179 as a function of the shape parameter c11-math-0180 of the underlying heavy-tailed distribution of the c11-math-0181's but not of the size c11-math-0182 of the sample. We can take it as small as possible in order to fit as well as possible both the mean and tail behaviors of c11-math-0183. Note that we look for the smallest possible c11-math-0184 to be able to compute explicitly the distribution of the last upper order statistics appearing as the summands of the second sum c11-math-0185. For this reason, based on condition (11.12), we choose

Let us summarize in Table 11.1 the necessary and sufficient condition on c11-math-0187 (for c11-math-0188) to have the existence of the c11-math-0189th moments for the upper order statistics for c11-math-0190, respectively, using (11.12) written as c11-math-0191.

Table 11.1 Necessary and sufficient condition on c11-math-0192 for having c11-math-0193, c11-math-0194

c11-math-0195 0 1 2 3 4 5 6 7
c11-math-0196 c11-math-0197 c11-math-0198 c11-math-0199 c11-math-0200 c11-math-0201 c11-math-0202 c11-math-0203 c11-math-0204
c11-math-0205 c11-math-0206 c11-math-0207 c11-math-0208 c11-math-0209 c11-math-0210 c11-math-0211 c11-math-0212 c11-math-0213
c11-math-0214 c11-math-0215 c11-math-0216 c11-math-0217 c11-math-0218 c11-math-0219 c11-math-0220 c11-math-0221 c11-math-0222

We deduce the value of the threshold c11-math-0223 satisfying (11.13), for which the fourth moment is finite, according to the range of values of c11-math-0224 (see Table 11.2). We notice from Table 11.2 that we would use Zaliapin et al.'s decomposition only when c11-math-0225. When considering, as they do, c11-math-0226, we would rather introduce the decomposition c11-math-0227, with c11-math-0228 varying from 2 to 5 depending on the value of c11-math-0229, to improve the approximation of the distribution of c11-math-0230, if we omit the discussion on their conditions.

Table 11.2 Value of c11-math-0231 for having up to c11-math-0232

c11-math-0233 with c11-math-0234 c11-math-0235 c11-math-0236 c11-math-0237 c11-math-0238 c11-math-0239 c11-math-0240 ]2,4]
c11-math-0241 = 7 6 5 4 3 2 1
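The selection rule behind Table 11.2 can be sketched as follows. The code assumes that the moment condition reads p < alpha (k + 1), that is, that the (k+1)-th largest order statistic of a Pareto sample has a finite p-th moment exactly in that case (a standard property of Pareto order statistics, used here as an assumption about the condition intended in the text). It then returns the smallest admissible k for p = 4. The function name is hypothetical.

```python
import math

def smallest_k(alpha, p=4):
    """Smallest integer k >= 0 with p < alpha * (k + 1).

    With this k, the order statistics kept in the trimmed sum have a
    finite p-th moment under the assumed moment condition.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return max(0, math.floor(p / alpha - 1.0) + 1)

# A few illustrative shape parameters and the resulting number of
# largest order statistics to set aside.
examples = {alpha: smallest_k(alpha) for alpha in (5.0, 4.0, 3.0, 2.5, 2.0, 1.0)}
```

Under this reading, any shape parameter in ]2,4] gives k = 1, in agreement with the last column of Table 11.2, and no trimming is needed when alpha exceeds 4.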

11.3.1.2 Some properties of order statistics for Pareto random variables

First we apply known results on the distribution of order statistics (see, e.g., (David and Nagaraja, 2003)) when considering Pareto distributions. Next we compute conditional distributions of order statistics, as well as conditional moments, to apply them to the Pareto case.

  • Distribution of Pareto order statistics

    For c11-math-0242-Pareto r.v.'s, the pdf c11-math-0243 of c11-math-0244 (c11-math-0245) and the pdf c11-math-0246 of the order statistics c11-math-0247, c11-math-0248 (c11-math-0249), with c11-math-0250, are expressed, respectively, as

    and, for c11-math-0252,

    When considering successive order statistics, for c11-math-0254, for c11-math-0255, c11-math-0256, with c11-math-0257, we obtain

    Moments of c11-math-0259-Pareto order statistics satisfy (see also, e.g., (Zaliapin et al., 2005); Theorem 1)

    and, for c11-math-0261,

    equation
  • Conditional distribution of order statistics. Application to Pareto r.v.'s

    Now straightforward computations lead to new properties that will be needed to build Normex. We express them in the general case (and labeled), with the notation c11-math-0263 and c11-math-0264, and then for Pareto r.v.'s.

    We deduce from (11.14) and (11.15) that the pdf of c11-math-0265 given c11-math-0266, for c11-math-0267, is, for c11-math-0268,

    and that the joint pdf of c11-math-0270 given c11-math-0271, for c11-math-0272, is, for c11-math-0273,

    Using (11.14) and (11.16) provides, for c11-math-0275,

    Then we can compute the first conditional moments. We obtain, using (11.18) and the change of variables c11-math-0277,

    11.22 equation

    For c11-math-0280, via (11.19) and the change of variables c11-math-0281 and c11-math-0282, we obtain

    Moreover, the joint conditional distribution of c11-math-0284 given c11-math-0285, for c11-math-0286, denoted by c11-math-0287, or c11-math-0288 when no ambiguity exists, is, for c11-math-0289,

    11.24 equation

    from which we get back the well-known result that c11-math-0291 are independent of c11-math-0292 and c11-math-0293 when c11-math-0294 and c11-math-0295 are given and that the order statistics form a Markov chain.

11.3.2 Method 1: Normex—A Mixed Normal Extreme Limit

(see (Kratz, 2014))

11.3.2.1 A conditional decomposition

Whatever the size of the sample, because of the small magnitude of c11-math-0296, we are able to compute explicitly the distribution of the last upper order statistics appearing as the summands of the second sum c11-math-0297 defined in (11.10). The choice of c11-math-0298 also allows us to obtain a good normal approximation for the distribution of the trimmed sum c11-math-0299. Nevertheless, since c11-math-0300 and c11-math-0301 are not independent, we decompose the Pareto sum c11-math-0302 in a slightly different way than in (11.10) (but keeping the same notation), namely,

and use the property of conditional independence (recalled in Section 11.1.2) between the two subsums c11-math-0304 and c11-math-0305 conditional on c11-math-0306 (for c11-math-0307).
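The idea of this conditional decomposition can be illustrated by a Monte Carlo sketch. This is a simulation stand-in for intuition only, not the analytical approximation developed in this section, and it rests on several assumptions: unit-scale Pareto variables; the threshold order statistic drawn through the Beta distribution of the corresponding uniform order statistic; the smaller values treated, given the threshold y, as i.i.d. Pareto variables truncated at y whose sum is replaced by a normal variable with matching mean and variance; and the k largest values drawn exactly as Pareto variables conditioned to exceed y. All names and parameter values are hypothetical.

```python
import math
import random

def truncated_pareto_moments(y, alpha):
    """Mean and variance of a unit-scale Pareto(alpha) conditioned on X <= y.

    Valid for y > 1 and alpha different from 1 and 2.
    """
    norm = 1.0 - y ** (-alpha)
    m1 = alpha / (alpha - 1.0) * (1.0 - y ** (1.0 - alpha)) / norm
    m2 = alpha / (alpha - 2.0) * (1.0 - y ** (2.0 - alpha)) / norm
    return m1, max(m2 - m1 * m1, 0.0)

def hybrid_sum_draw(n, alpha, k, rng):
    """One draw of: normal body + threshold order statistic + exact top-k."""
    # The (n-k)-th uniform order statistic out of n is Beta(n-k, k+1).
    u = rng.betavariate(n - k, k + 1)
    y = (1.0 - u) ** (-1.0 / alpha)
    # Normal stand-in for the sum of the n-k-1 values below y.
    m1, var = truncated_pareto_moments(y, alpha)
    body = rng.gauss((n - k - 1) * m1, math.sqrt((n - k - 1) * var))
    # Exact draws for the k values above y: y times a unit-scale Pareto.
    top = sum(y * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(k))
    return body + y + top

rng = random.Random(3)
n, alpha, k = 52, 3.0, 1  # illustrative values only
draws = [hybrid_sum_draw(n, alpha, k, rng) for _ in range(20000)]
mean_draws = sum(draws) / len(draws)
```

Because the conditional mean of each part is matched exactly, the average of the draws should be close to n alpha / (alpha - 1); the interest of the construction lies in the upper quantiles, where the exact treatment of the largest values matters.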

Then we obtain the following approximation of the distribution of c11-math-0308, for c11-math-0309 (i.e., when the c11-math-0310th moment of the c11-math-0311 largest order statistics does not exist).

Comments

  1. The distribution c11-math-0326 can also be expressed as

    equation

    and, for c11-math-0328

    equation

    where the convolution product c11-math-0330 can be numerically evaluated using either the recursive convolution equation c11-math-0331, for c11-math-0332, (it will be fast, c11-math-0333 being small) and c11-math-0334 or, if c11-math-0335, the explicit expression (12) (replacing c11-math-0336 by c11-math-0337) given in Ramsay (2006).

  2. Note that we considered i.i.d. Pareto r.v.'s only as an example to illustrate our method, which is intended to be extended to unknown distributions using the CLT for the mean behavior and heavy-tailed distributions of the Pareto type for the tail. Since the exact distribution of the Pareto sum c11-math-0338 of i.i.d. Pareto r.v.'s is known, we will be able to judge the quality of the approximation proposed in Theorem 11.2 when comparing c11-math-0339 with the exact distribution of c11-math-0340. We will then compare the respective associated risk measures.
  3. Finally recall the following result by Feller (1966) on the convolution closure of distributions with regularly varying tails, which applies in our Pareto example but may also be useful when extending the method to distributions belonging to the Fréchet maximum domain of attraction.

Note that this lemma implies the result given in (11.7), and as a consequence in the Pareto case, we have

equation

11.3.2.2 On the quality of the approximation of the distribution of the Pareto sum c11-math-0416

To estimate the quality of the approximation of the distribution of the Pareto sum c11-math-0417, we compare analytically the exact distribution of c11-math-0418 with the distribution c11-math-0419 defined in Theorem 11.2. It could also be done numerically, as, for instance, in Furrer (2012) with the distance between two distributions c11-math-0420 and c11-math-0421 defined by c11-math-0422, with c11-math-0423. We will proceed numerically only when considering the tail of the distributions and estimating the distance in the tails through the VaR measure (see Section 11.4.3). When looking at the entire distributions, we will focus on the analytical comparison mainly for the case c11-math-0424 (with some hints for the case c11-math-0425). Note that it is not possible to compare directly the expressions of the VaR corresponding to, respectively, the exact and approximate distributions, since they can only be expressed as the inverse function of a cdf. Nevertheless, we can compare the tails of these two distributions to calibrate the accuracy of the approximate VaR since

equation

Moreover, we will compare analytically our result with a normal approximation made on the entire sum (and not the trimmed one) since, for c11-math-0427, the CLT applies and, as already noticed, is often used in practice.

Since Normex uses the exact distribution of the last upper order statistics, comparing the true distribution of c11-math-0428 with its approximation c11-math-0429 simply amounts to comparing the true distribution of c11-math-0430 i.i.d. r.v.'s with the normal distribution (when applying the CLT). Note that, when extending Normex to any distribution, an error term should be added to this latter evaluation; it comes from the approximation of the extreme distribution by a Pareto one.

Suppose c11-math-0431. Applying the CLT gives the normal approximation c11-math-0432, with c11-math-0433 and c11-math-0434, where in the case of a Pareto sum, c11-math-0435, and c11-math-0436. We know that applying the CLT directly to c11-math-0437 leads to unsatisfactory results even for the mean behavior, since, for any c11-math-0438, the quantity c11-math-0439, involving the third moment of c11-math-0440 and appearing in the error (11.11) made when approximating the exact distribution of c11-math-0441 by a normal one, is infinite for any c11-math-0442. The rate of convergence in c11-math-0443 is reduced to c11-math-0444. When c11-math-0445, even if the rate of convergence improves because c11-math-0446, we still have c11-math-0447 (because the fourth moment of c11-math-0448 does not exist), which means that we cannot get a rate of order c11-math-0449.

Now let us look at the rate of convergence when approximating c11-math-0450 with c11-math-0451.

Considering the exact distribution of the Pareto sum c11-math-0452 means taking, at given c11-math-0453 and for any c11-math-0454, c11-math-0455 with c11-math-0456 i.i.d. r.v.'s with parent r.v. c11-math-0457 with finite c11-math-0458th moment and pdf c11-math-0459 defined, for c11-math-0460, by

equation

Let us look at the first three moments of c11-math-0462. The direct dependence is on c11-math-0463 (and c11-math-0464) and indirectly on c11-math-0465 since c11-math-0466. We have

(note that c11-math-0468, for any c11-math-0469 that we consider, and any c11-math-0470) and

using the expressions of c11-math-0472 and c11-math-0473 given in Theorem 11.2. A straightforward computation of the third centered moment of c11-math-0474 provides

11.33 equation

where c11-math-0476 denotes the antiderivative of the function c11-math-0477, that is, if c11-math-0478,

equation

whereas, if c11-math-0480,

equation

and, if c11-math-0482,

equation

For simplicity, let us look at the case c11-math-0484 and consider the Berry–Esséen inequality. For c11-math-0485, we would use the Edgeworth expansion, with arguments similar to those developed later. Various authors have worked on this type of Berry–Esséen inequality, in particular to sharpen the accuracy of the constant appearing in it. In the case of Berry–Esséen bounds, the value of the constant factor c11-math-0486 has decreased from 7.59 by Esséen (1942) to 0.4785 by Tyurin (2010), to 0.4690 by Shevtsova (2013) in the i.i.d. case, and to 0.5600 in the general case. Note also that, in recent decades, much literature (Stein, 1972, 1986; Chen and Shao, 2004; Cai, 2012; Pinelis, 2013; etc.) has been dedicated to the generalization of this type of inequality, such as the remarkable contribution by Stein.
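To make the role of the constant concrete, here is a minimal sketch of the classical Berry–Esséen bound for the standardized sum of n i.i.d. r.v.'s with finite third absolute centered moment, namely the constant times that moment divided by the cube of the standard deviation and by the square root of n. The function name and the argument names `rho3` and `sigma` are introduced here for illustration only; the default constant is the i.i.d. value 0.4690 quoted above. This is the textbook bound, not the refined bound developed for Normex in this section.

```python
import math


def berry_esseen_bound(rho3, sigma, n, constant=0.4690):
    """Classical uniform bound on |F_n(x) - Phi(x)| for a standardized i.i.d. sum.

    rho3  : third absolute centered moment E|X - mu|**3 (must be finite)
    sigma : standard deviation of X
    n     : number of summands
    """
    if sigma <= 0 or n < 1:
        raise ValueError("sigma must be positive and n at least 1")
    return constant * rho3 / (sigma ** 3 * math.sqrt(n))


# The bound decays like n**(-1/2): quadrupling n halves it.
for n in (52, 250, 1000):
    print(n, berry_esseen_bound(rho3=1.0, sigma=1.0, n=n))
```

The sketch only applies when the third moment exists, which is precisely what trimming the largest order statistics is meant to secure.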

We can propose the following bound.

Note that c11-math-0514 depends on the two parameters c11-math-0515 and c11-math-0516. We represent this function on the same plot for a given value of c11-math-0517 but for various values of c11-math-0518, namely, c11-math-0519, and 1000, respectively, to compare its behavior according to the parameter c11-math-0520. Then we repeat the operation for different c11-math-0521, namely, for c11-math-0522, respectively (Figure 11.3).

Figure 11.3 Plots of the function c11-math-0538 defined in (11.35), for various c11-math-0539, at given c11-math-0540. (a) Case c11-math-0541. (b) Case c11-math-0542. (c) Case c11-math-0543.

We observe that the bound c11-math-0523 is an increasing then decreasing function of c11-math-0524, with a maximum less than c11-math-0525, which is decreasing with c11-math-0526 and c11-math-0527. The c11-math-0528-coordinate of the maximum is proportional to c11-math-0529, with the proportion decreasing with c11-math-0530. The interval on the c11-math-0531-axis for which the error is larger than c11-math-0532 has a small amplitude, which is decreasing with c11-math-0533.

We show in Table 11.3 the values of the coordinates c11-math-0534 of the maximum of c11-math-0535 computed on R for c11-math-0536 and c11-math-0537 (corresponding to aggregating weekly returns to obtain yearly returns), 100, 250 (corresponding to aggregating daily returns to obtain yearly returns), 500, 1000, respectively.

Table 11.3 Coordinates c11-math-0544 of the maximum of c11-math-0545 (defined in (11.35)), as a function of c11-math-0546 and c11-math-0547

c11-math-0548 c11-math-0549 c11-math-0550
c11-math-0551 c11-math-0552 c11-math-0553 c11-math-0554 c11-math-0555 c11-math-0556 c11-math-0557
52 101 4.9 86 4.9 78 4.9
100 196 4.6 166 4.6 150 4.6
250 494 4.2 417 4.1 376 4.0
500 990 3.9 834 3.7 751 3.5
1000 1984 3.6 1667 3.3 1501 3.0

Hence the result of Proposition 11.2.

Indeed, we have

equation

Note that the Berry–Esséen inequality has been proved by Petrov to hold also for probability density functions (see (Petrov, 1956) or (Petrov, 1995)). It has been refined by Shevtsova (2007), and we will use her result to evaluate c11-math-0566. We need to go back to the pdf of the standardized sum c11-math-0567 of i.i.d. r.v.'s with pdf c11-math-0568, which can be expressed as

equation

It is straightforward to show by induction that

equation

Then, since c11-math-0571, we can write

Since we consider a sum of c11-math-0573 i.i.d. r.v.'s c11-math-0574(c11-math-0575) with parent r.v. c11-math-0576 having a finite c11-math-0577th moment, we obtain via (Petrov, 1956) and (Shevtsova, 2007) that there exists a constant c11-math-0578 such that

where c11-math-0580 is defined in (11.34).

Hence, combining (11.36) and (11.37) gives

equation

from which we deduce that

equation

As in the case c11-math-0583 (Proposition 11.2), this bound could be computed numerically.

11.3.3 Method 2: A Weighted Normal Limit

In this method, we go back to the first decomposition (11.10) of c11-math-0584 and use limit theorems for both terms c11-math-0585 and c11-math-0586 instead of proceeding via conditional independence and considering a small given c11-math-0587. It means that we need to choose c11-math-0588 as a function of c11-math-0589 such that c11-math-0590 as c11-math-0591 for the approximation of the distribution of c11-math-0592 via its limit to be relevant.

First we consider a normal approximation for the trimmed sum c11-math-0593, which implies some conditions on the threshold c11-math-0594 (see (Csörgö et al., 1986)). We need to select a threshold c11-math-0595 such that

Note that the condition (11.12) will be implied by the condition c11-math-0597. Hence, for this method, c11-math-0598 does not depend directly on the value of c11-math-0599.

We can then enunciate the following.

Note that c11-math-0614 is chosen in such a way that c11-math-0615 is finite. The case c11-math-0616 corresponds to the one developed in Zaliapin et al. (but with a different set of definition for c11-math-0617).

Let us turn now to the limit behavior of the partial sum c11-math-0639. The main idea of this method relies on using an estimation (involving the last order statistics) of the expected shortfall c11-math-0640 of c11-math-0641 defined for an c11-math-0642-Pareto r.v. by c11-math-0643, c11-math-0644 being the confidence level (see Section 11.4.1), in order to propose an approximation for the second term c11-math-0645. This requires assuming c11-math-0646, that is, c11-math-0647.

Let us recall the following result (see (Acerbi and Tasche, 2002) for the proof or (Embrechts et al., 1997)) that we are going to use.

In other words, expected shortfall at confidence level c11-math-0657 can be thought of as the limiting average of the c11-math-0658 upper order statistics from a sample of size c11-math-0659 from the loss distribution.
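The following sketch illustrates this limiting-average interpretation on simulated data. It assumes an α-Pareto loss with survival function x^(-α) for x ≥ 1, for which the expected shortfall at level q has the closed form α/(α-1) · (1-q)^(-1/α) when α > 1; the function names, the seed, the sample size, and the values α = 3 and q = 0.95 are illustrative choices, not taken from the text.

```python
import math
import random


def pareto_sample(size, alpha, rng):
    """Draw `size` i.i.d. Pareto(alpha) values on [1, inf) by inverse transform."""
    # 1 - rng.random() lies in (0, 1], so the negative power is always defined.
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(size)]


def es_from_upper_order_statistics(sample, q):
    """Average of the floor(n * (1 - q)) largest observations."""
    n = len(sample)
    k = int(math.floor(n * (1.0 - q)))
    if k < 1:
        raise ValueError("sample too small for this confidence level")
    upper = sorted(sample)[n - k:]
    return sum(upper) / k


def pareto_es(alpha, q):
    """Closed-form ES of the Pareto(alpha) loss, valid for alpha > 1."""
    return alpha / (alpha - 1.0) * (1.0 - q) ** (-1.0 / alpha)


rng = random.Random(12345)
alpha, q = 3.0, 0.95
sample = pareto_sample(200_000, alpha, rng)
es_hat = es_from_upper_order_statistics(sample, q)
es_exact = pareto_es(alpha, q)
print(f"order-statistics estimate: {es_hat:.3f}, closed form: {es_exact:.3f}")
```

For a large sample the two numbers should be close; the agreement deteriorates as α approaches 1, where the average of the extremes converges very slowly.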

Now we can enunciate the main empirical result.

Comments

  1. This result is interesting since it shows that, even if we want to consider a normal approximation, there must be a correction based on c11-math-0709 and the number of extremes that we considered, such that both the mean and the variance become larger than the ones of c11-math-0710.
  2. With the final approximation being normal, its tail distribution is light; hence we do not expect a priori an evaluation of the VaR as accurate as the one provided by Normex, but better than the normal one applied directly on c11-math-0711. The light tail should still lead to an underestimation of the VaR, but certainly not as gross as the one when applying directly the CLT, because of the correcting term expressed in terms of the ES.
  3. We will compare numerically not only the tail approximation with the exact one but also the modified normal approximation with the normal one made directly on c11-math-0712.
  4. To obtain a good fit requires a calibration of c11-math-0713. Numerically, it appeared that the value c11-math-0714 provides a reasonable fit, for any c11-math-0715 and any c11-math-0716. It is an advantage that c11-math-0717 does not have to be chosen differently, depending on these parameters c11-math-0718 and c11-math-0719, in order to keep the generality of the method. The next research step will consist in evaluating this method analytically and generalizing it, if possible, to any c11-math-0720.

11.4 Application to Risk Measures and Comparison

11.4.1 Standard Risk Measures Based on Loss Distribution

Variance and standard deviation were historically the dominating risk measures in finance. However, they require the underlying distribution to have a finite second moment and are appropriate only for symmetric distributions. Because of this restricted frame, they have often been replaced in practical applications by VaR, which was, until recently, the most popular downside risk measure in finance. VaR then started to be criticized for a number of reasons, the most important being its lack of the subadditivity property and the fact that it completely ignores the severity of losses in the far tail of the loss distribution. The coherent risk measure expected shortfall (ES) was introduced to solve these issues. ES was subsequently shown not to be elicitable (Gneiting, 2012); hence the search, in the meantime, for coherent and elicitable alternatives such as expectiles (Bellini et al., 2013; Ziegel, 2014). Properties of these popular risk measures, like coherence, comonotonic additivity, robustness, and elicitability, as well as their impact on important issues in risk management like diversification benefit and capital allocation, have been discussed in a recent paper (Emmer et al., 2015).

Here we are going to consider only the risk measures used in solvency calculations (the other risk measures would be treated in the same way), namely, the value-at-risk, denoted VaR, and the expected shortfall (also named tail value-at-risk), denoted ES (or TVaR), of an r.v. c11-math-0721 with continuous cdf c11-math-0722 (and inverse function denoted by c11-math-0723):

  • The value-at-risk of order c11-math-0724 of c11-math-0725 is simply the quantile of c11-math-0726 of order c11-math-0727, c11-math-0728: equation
  • If c11-math-0730, the expected shortfall (ES) at confidence level c11-math-0731 is defined as equation

We will simplify the notation of those risk measures writing c11-math-0733 or c11-math-0734 when no confusion is possible.

Note that, in the case of an c11-math-0735-Pareto distribution, analytical expressions of those two risk measures can be deduced from (11.2), namely,

equation

Recall also that the shape parameter c11-math-0737 totally determines the ratio c11-math-0738 when we go far enough out into the tail:

equation

Note that this result holds also for the GPD with shape parameter c11-math-0740.
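As a worked illustration of these closed forms, the sketch below assumes the α-Pareto survival function x^(-α) for x ≥ 1, so that the quantile of order q is (1-q)^(-1/α) and, for α > 1, the expected shortfall is α/(α-1) times that quantile. The function names and parameter values are illustrative. The last function checks the closed form by averaging the quantile function over (q, 1) with a simple midpoint rule, assuming the usual definition of ES as an average of VaR beyond the level q.

```python
def pareto_var(alpha, q):
    """VaR_q of a Pareto(alpha) loss with survival function x**(-alpha), x >= 1."""
    return (1.0 - q) ** (-1.0 / alpha)


def pareto_es(alpha, q):
    """ES_q of the same loss; finite only for alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("expected shortfall is infinite for alpha <= 1")
    return alpha / (alpha - 1.0) * pareto_var(alpha, q)


def es_by_averaging_var(alpha, q, steps=100_000):
    """Midpoint-rule value of (1 / (1 - q)) * integral of VaR_u over u in (q, 1)."""
    h = (1.0 - q) / steps
    total = sum(pareto_var(alpha, q + (i + 0.5) * h) for i in range(steps))
    return total * h / (1.0 - q)


alpha, q = 2.5, 0.99
var_q = pareto_var(alpha, q)
es_q = pareto_es(alpha, q)
print(f"VaR = {var_q:.4f}, ES = {es_q:.4f}, ratio = {es_q / var_q:.4f}")
print(f"numerical check of ES: {es_by_averaging_var(alpha, q):.4f}")
```

For an exact Pareto loss the ratio ES/VaR equals α/(α-1) at every level q, not only in the limit; the limit statement matters for distributions that are only Pareto-like in the tail.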

When looking at aggregated risks c11-math-0741, it is well known that the risk measure ES is coherent (see (Artzner et al., 1999)). In particular it is subadditive, that is,

equation

whereas VaR is not a coherent measure, because it is not subadditive. Indeed many examples can be given where VaR is superadditive, that is,

equation

see, e.g., Embrechts et al. (2009) and Daníelsson et al. (2005).

In the case of c11-math-0754-Pareto i.i.d. r.v.'s, the risk measure VaR is asymptotically superadditive (subadditive, respectively) if c11-math-0755 (c11-math-0756, respectively).
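A first-order heuristic shows where this dichotomy comes from. Assuming the tail equivalence P(S_n > x) ~ n x^(-α) that follows from the convolution closure of regularly varying tails, and the Pareto quantile (1-q)^(-1/α), the quantile of the sum behaves like n^(1/α) times the quantile of a single risk as q tends to 1. Comparing with n times the individual quantile gives the ratio n^(1/α - 1), which exceeds 1 exactly when α < 1. The function name below is illustrative, and the sketch says nothing about finite levels q.

```python
def asymptotic_var_ratio(n, alpha):
    """First-order value of VaR_q(S_n) / (n * VaR_q(X)) as q -> 1."""
    return n ** (1.0 / alpha - 1.0)


for alpha in (0.5, 1.5, 2.5):
    ratio = asymptotic_var_ratio(10, alpha)
    regime = "superadditive" if ratio > 1.0 else "subadditive"
    print(f"alpha = {alpha}: ratio = {ratio:.3f} ({regime})")
```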

Recently, numerical and analytical techniques have been developed in order to evaluate the risk measures VaR and ES under different dependence assumptions regarding the loss r.v.'s. This certainly contributes to a better understanding of the aggregation and diversification properties of risk measures, in particular of noncoherent ones such as VaR. We will not review these techniques and results in this report, but refer to Embrechts et al. (2013) for an overview and references therein. Let us add to those references some recent work by Mikosch and Wintenberger (2013) on large deviations under dependence, which allows an evaluation of VaR. Nevertheless, it is worth mentioning a new numerical algorithm introduced by Embrechts et al. (2013), which allows for the computation of reliable lower and upper bounds for the VaR of high-dimensional (inhomogeneous) portfolios, whatever the dependence structure is.

11.4.2 Possible Approximations of VaR

As an example, we treat the case of one of the two main risk measures and choose the VaR, since it is the main one used for solvency requirement. We would proceed in the same way for the expected shortfall.

It is straightforward to deduce, from the various limit theorems, the approximations c11-math-0757 of the VaR of order c11-math-0758 of the aggregated risks, c11-math-0759, that is, the quantile of order c11-math-0760 of the sum c11-math-0761 defined by c11-math-0762. The index c11-math-0763 indicates the chosen method, namely, (i) for the GCLT approach, (ii) for the CLT one, (iii) for the max one, (iv) for the Zaliapin et al.'s method, (v) for Normex, and (vi) for the weighted normal limit. We obtain the following:

  1. c11-math-0764 Via the GCLT, for c11-math-0765:
    equation
  2. c11-math-0767 Via the CLT, for c11-math-0768 (see (11.5)):
    equation
  3. c11-math-0770 Via the Max (EVT) approach, using (11.9), for high-order c11-math-0771, for any positive c11-math-0772,
    equation
  4. c11-math-0774 Via the Zaliapin et al.'s method (Zaliapin et al., 2005), for c11-math-0775:
    equation
  5. c11-math-0777 Via Normex, for any positive c11-math-0778, and c11-math-0779 satisfying (11.13):
    equation
  6. c11-math-0781 Via the weighted normal limit, for c11-math-0782 (see (11.47)):
    equation
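Two of these approximations are simple enough to sketch in a few lines. The code below assumes an α-Pareto parent with survival function x^(-α) for x ≥ 1, so that the mean is α/(α-1) and the variance is α/((α-1)²(α-2)). The CLT quantile is the usual normal one. For the Max approach, the sketch assumes the Fréchet limit for the maximum with normalization n^(1/α) and a centering equal to n times the mean, restricted here to α > 1; this reading of the method is an assumption of the sketch, and all function names are illustrative.

```python
import math
from statistics import NormalDist


def pareto_mean(alpha):
    """Mean of the Pareto(alpha) parent, finite for alpha > 1."""
    return alpha / (alpha - 1.0)


def pareto_variance(alpha):
    """Variance of the Pareto(alpha) parent, finite for alpha > 2."""
    return alpha / ((alpha - 1.0) ** 2 * (alpha - 2.0))


def var_clt(n, alpha, q):
    """Normal (CLT) approximation of the q-quantile of S_n; needs alpha > 2."""
    if alpha <= 2.0:
        raise ValueError("the CLT quantile needs a finite variance (alpha > 2)")
    z = NormalDist().inv_cdf(q)
    return n * pareto_mean(alpha) + math.sqrt(n * pareto_variance(alpha)) * z


def var_max(n, alpha, q):
    """Quantile from the Frechet limit of the maximum, centered by n * E[X].

    Assumed form for alpha > 1: n**(1/alpha) * (-log q)**(-1/alpha) + n * E[X].
    """
    if alpha <= 1.0:
        raise ValueError("this sketch covers alpha > 1 only")
    scale = n ** (1.0 / alpha)
    return scale * (-math.log(q)) ** (-1.0 / alpha) + n * pareto_mean(alpha)


n, alpha = 52, 2.5   # illustrative values
for q in (0.95, 0.99, 0.995):
    print(f"q = {q}: CLT = {var_clt(n, alpha, q):.2f}, Max = {var_max(n, alpha, q):.2f}")
```

With these illustrative values, the 95% quantiles come out at roughly 104.35 (CLT) and 102.60 (Max). The other methods involve integrals or stable quantiles and are not covered by this sketch.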

11.4.3 Numerical Study: Comparison of the Methods

Since there is no explicit analytical formula for the true quantiles of c11-math-0784, we will complete the analytical comparison of the distributions of c11-math-0785 and c11-math-0786 given in Section 11.3.2.2, providing here a numerical comparison between the quantile of c11-math-0787 and the quantiles obtained by the various methods seen so far.

Nevertheless, in the case c11-math-0788, we can compare analytically the VaR obtained when doing a rough normal approximation directly on c11-math-0789, namely, c11-math-0790, with the one obtained via the shifted normal method, namely, c11-math-0791. So, we obtain the correcting term to the CLT as

equation

11.4.3.1 Presentation of the study

We simulate c11-math-0793 with parent r.v. c11-math-0794 c11-math-0795-Pareto distributed, with different sample sizes, varying from c11-math-0796 (corresponding to aggregating weekly returns to obtain yearly returns) through c11-math-0797 (corresponding to aggregating daily returns to obtain yearly returns) to c11-math-0798 representing a large size portfolio.

We consider different shape parameters, namely, c11-math-0799, respectively. Recall that simulated Pareto r.v.'s c11-math-0800's (c11-math-0801) can be obtained by simulating a uniform r.v. c11-math-0802 on c11-math-0803 and then applying the transformation c11-math-0804.

For each c11-math-0805 and each c11-math-0806, we aggregate the realizations c11-math-0807's (c11-math-0808). We repeat the operation c11-math-0809 times, thus obtaining c11-math-0810 realizations of the Pareto sum c11-math-0811, from which we can estimate its quantiles.
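The simulation scheme just described can be sketched as follows. The sketch assumes the inverse-transform map U ↦ U^(-1/α), which yields an α-Pareto r.v. with survival function x^(-α) on [1, ∞); the function name, the seed, and the number of repetitions are illustrative and much smaller than what a serious quantile study would use.

```python
import random


def simulate_pareto_sums(n, alpha, n_rep, seed=0):
    """Return n_rep independent realizations of the sum of n i.i.d. Pareto(alpha) r.v.'s."""
    rng = random.Random(seed)
    sums = []
    for _ in range(n_rep):
        total = 0.0
        for _ in range(n):
            u = 1.0 - rng.random()          # uniform on (0, 1]
            total += u ** (-1.0 / alpha)    # inverse-transform Pareto draw
        sums.append(total)
    return sums


sums = simulate_pareto_sums(n=52, alpha=2.5, n_rep=20_000, seed=1)
sample_mean = sum(sums) / len(sums)
print(f"smallest sum: {min(sums):.2f}, average sum: {sample_mean:.2f}")
```

Since every summand is at least 1, each simulated sum is at least n, and for α > 1 the average should be close to n·α/(α-1).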

Let c11-math-0812 denote the empirical quantile of order c11-math-0813 of the Pareto sum c11-math-0814 (associated with the empirical cdf c11-math-0815 and pdf c11-math-0816) defined by

equation

Recall, for completeness, that the empirical quantile of c11-math-0818 converges to the true quantile as c11-math-0819 and has an asymptotic normal behavior, from which we deduce the following confidence interval at probability a for the true quantile: c11-math-0820, where c11-math-0821 can be empirically estimated for such a large c11-math-0822. We do not compute them numerically: c11-math-0823 being very large, bounds are close.
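A minimal sketch of the empirical quantile follows, taken here as the generalized inverse of the empirical cdf, that is, the order statistic of rank ⌈Nq⌉. For the confidence interval, the sketch deliberately swaps in a rank-based, distribution-free interval built from the normal approximation to the binomial count of observations below the quantile; it is asymptotically equivalent to the density-based interval mentioned above but avoids estimating the density. Function names and the toy data are illustrative.

```python
import math
from statistics import NormalDist


def empirical_quantile(sample, q):
    """Smallest observation t such that the empirical cdf satisfies F_N(t) >= q."""
    xs = sorted(sample)
    rank = max(1, math.ceil(len(xs) * q))   # 1-based rank of the order statistic
    return xs[rank - 1]


def rank_confidence_interval(sample, q, level=0.95):
    """Distribution-free interval for the q-quantile from two order statistics."""
    xs = sorted(sample)
    n = len(xs)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(n * q * (1.0 - q))   # in units of ranks
    lower_rank = max(1, math.floor(n * q - half_width))
    upper_rank = min(n, math.ceil(n * q + half_width))
    return xs[lower_rank - 1], xs[upper_rank - 1]


data = [float(i) for i in range(1, 1001)]   # toy data: 1, 2, ..., 1000
q_hat = empirical_quantile(data, 0.5)
low, high = rank_confidence_interval(data, 0.5, level=0.95)
print(q_hat, low, high)
```

The width of the interval in ranks grows like the square root of N, so in relative terms the bounds tighten as the number of simulated sums increases, which is why they are not reported for very large N.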

We compute the values of the quantiles of order c11-math-0824, c11-math-0825 (c11-math-0826 indicating the chosen method), obtained by the main methods, the GCLT method, the Max one, Normex, and the weighted normal method, respectively. We do it for various values of c11-math-0827 and c11-math-0828. We compare them with the (empirical) quantile c11-math-0829 obtained via Pareto simulations (estimating the true quantile). For that, we introduce the approximative relative error:

equation
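The sign convention of this relative error can be made explicit with a short sketch: the error is taken as the approximated quantile minus the empirical one, divided by the empirical one, in percent, so that a negative value signals an underestimation. The function name is illustrative; the numbers are the 95% entries of the first block of Table 11.4, with the method labels following the order given in the table caption.

```python
def relative_error_pct(approx, empirical):
    """Relative error of an approximated quantile, in percent of the empirical quantile."""
    return 100.0 * (approx - empirical) / empirical


empirical_q = 103.23
approximations = {"CLT": 104.35, "Max": 102.60, "Normex": 103.17, "weighted normal": 109.25}
for method, value in approximations.items():
    print(f"{method}: {relative_error_pct(value, empirical_q):+.2f}%")
```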

We consider three possible orders c11-math-0831: c11-math-0832, c11-math-0833 (threshold for Basel II) and c11-math-0834 (threshold for Solvency 2).

We use the software R to perform this numerical study, with different available packages. Let us particularly mention the use of the procedure Vegas in the package R2Cuba for the computation of the double integrals. This procedure turns out not to be always very stable for the most extreme quantiles, mainly for low values of c11-math-0835. In practice, for the computation of integrals, we would advise testing various procedures in R2Cuba (Suave, Divonne, and Cuhre, besides Vegas) or looking for other packages. Another possibility would be to implement the algorithm in a different software altogether, for example, Python.

11.4.3.2 Estimation of the VaR with the various methods

All codes and results obtained for various c11-math-0836 and c11-math-0837 are given in Kratz (2013) (available upon request); we will draw conclusions based on all the results.

We start with a first example when c11-math-0838 to illustrate our main focus, when looking at data in the presence of a moderately heavy tail. We present here the case c11-math-0839 in Table 11.4.

Table 11.4 Approximations of extreme quantiles (95%; 99%; 99.5%) by various methods (CLT, Max, Normex, weighted normal) and associated approximative relative error to the empirical quantile c11-math-0840, for n = 52, 100, 250, 500 respectively, and c11-math-0841

c11-math-0842 c11-math-0843 c11-math-0844 c11-math-0845 c11-math-0846 c11-math-0847
c11-math-0848 c11-math-0849 c11-math-0850 c11-math-0851 c11-math-0852 c11-math-0853
c11-math-0854(%) c11-math-0855 c11-math-0856 c11-math-0857
95% 103.23 104.35 102.60 103.17 109.25
1.08 −0.61 −0.06 5.83
99% 119.08 111.67 117.25 119.11 118.57
−6.22 −1.54 0.03 −0.43
99.5% 128.66 114.35 127.07 131.5 121.98
−11.12 −1.24 2.21 −5.19
c11-math-0866 c11-math-0867 c11-math-0868 c11-math-0869 c11-math-0870 c11-math-0871
c11-math-0872 c11-math-0873 c11-math-0874 c11-math-0875 c11-math-0876 c11-math-0877
c11-math-0878(%) c11-math-0879 c11-math-0880 c11-math-0881
95% 189.98 191.19 187.37 189.84 197.25
0.63 −1.38 −0.07 3.83
99% 210.54 201.35 206.40 209.98 209.74
−4.36 −1.96 −0.27 −0.38
99.5% 222.73 205.06 219.14 223.77 214.31
−7.93 −1.61 0.47 −3.78
c11-math-0891 c11-math-0892 c11-math-0893 c11-math-0894 c11-math-0895 c11-math-0896
c11-math-0897 c11-math-0898 c11-math-0899 c11-math-0900 c11-math-0901 c11-math-0902
c11-math-0903(%) c11-math-0904(%) c11-math-0905(%) c11-math-0906
95% 454.76 455.44 446.53 453.92 464.28
0.17 −1.81 −0.18 2.09
99% 484.48 471.5 473.99 483.27 483.83
−2.68 −2.17 −0.25 −0.13
99.5% 501.02 477.38 492.38 501.31 490.98
−4.72 −1.73 0.06 −2.00
c11-math-0916 c11-math-0917 c11-math-0918 c11-math-0919 c11-math-0920 c11-math-0921
c11-math-0922 c11-math-0923 c11-math-0924 c11-math-0925 c11-math-0926 c11-math-0927
c11-math-0928(%) c11-math-0929(%) c11-math-0930(%) c11-math-0931
95% 888.00 888.16 872.74 886.07 900.26
0.02 −1.72 −0.22 1.38
99% 928.80 910.88 908.97 925.19 927.80
−1.93 −2.14 −0.39 −0.11
99.5% 950.90 919.19 933.23 948.31 937.89
−3.33 −1.86 −0.27 −1.37

Let us also illustrate in Table 11.5 the heavy tail case, choosing, for instance, c11-math-0942, which means that c11-math-0943. Take, for example, c11-math-0944, respectively, to illustrate the fit of Normex even for small samples. Note that the weighted normal does not apply here since c11-math-0945.

Table 11.5 Approximations of extreme quantiles (95%; 99%; 99.5%) by various methods (GCLT, Max, Normex) and associated approximative relative error to the empirical quantile c11-math-0946, for c11-math-0947, respectively, and for c11-math-0948

c11-math-0949 c11-math-0950 c11-math-0951 c11-math-0952 c11-math-0953
c11-math-0954 c11-math-0955 c11-math-0956 c11-math-0957 c11-math-0958
c11-math-0959(%) c11-math-0960 c11-math-0961
95% 246.21 280.02 256.92 245.86
13.73 4.35 −0.14
99% 450.74 481.30 455.15 453.92
6.78 0.97 0.71
99.5% 629.67 657.91 631.66 645.60
4.48 0.31 2.53
c11-math-0963 c11-math-0964 c11-math-0965 c11-math-0966 c11-math-0967
c11-math-0968 c11-math-0969 c11-math-0970 c11-math-0971 c11-math-0972
c11-math-0973(%) c11-math-0974 c11-math-0975
95% 442.41 491.79 456.06 443.08
11.16 3.09 0.15
99% 757.82 803.05 762.61 761.66
5.97 0.63 0.51
99.5% 1031.56 1076.18 1035.58 1032.15
4.33 0.39 0.06

11.4.3.3 Discussion of the results

  • Those numerical results are subject to numerical errors due to the finite sample of simulation of the theoretical value, as well as the choice of random generators, but the most important reason for numerical error of our methods resides in the convergence of the integration methods. Thus, one should read the results, even if reported with many significant digits, to a confidence we estimate to be around 0.1% (also for the empirical quantiles).
  • The Max method overestimates for c11-math-0976 and underestimates for c11-math-0977; it improves a bit for higher quantiles and c11-math-0978. It is a method that practitioners should think about when wanting to have a first idea on the range of the VaR, because it is very simple to use and costless in terms of computations. Then they should turn to Normex for an accurate estimation.
  • The GCLT method (c11-math-0979) overestimates the quantiles but improves with higher quantiles and when c11-math-0980 increases.
  • Concerning the CLT method (c11-math-0981), we find out that:
    • The higher the quantile, the higher the underestimation; it improves slightly when c11-math-0982 increases, as expected.
    • The smaller c11-math-0983, the larger the underestimation.
    • The estimation improves for smaller-order c11-math-0984 of the quantile and large c11-math-0985, as expected, since, with smaller order, we are less in the upper tail.
    • The VaR estimated with the normal approximation is almost always lower than the VaR estimated via Normex or the weighted normal method. The lower c11-math-0986 and c11-math-0987, the higher the difference with Normex.
    • The difference between the VaR estimated by the CLT and the one estimated with Normex appears large for relatively small c11-math-0988, with a relative error going up to 13%, and decreases when c11-math-0989 becomes larger.
  • With the Weighted Normal method (c11-math-0990), it appears that:
    • The method overestimates the 95% quantile but is quite good for the 99% one. In general, the estimation of the two upper quantiles improves considerably when compared with the ones obtained via a straightforward application of the CLT.
    • The estimation of the quantiles improves with increasing n and increasing c11-math-0991.
    • The results are generally not as sharp as the ones obtained via Normex, but better than the ones obtained with the max method, whenever c11-math-0992.
  • Concerning Normex, we find out that:
    • The accuracy of the results appears more or less independent of the sample size c11-math-0993, which is the major advantage of our method when dealing with the issue of aggregation.
    • For c11-math-0994, it always gives sharp results (error less than 0.5% and often extremely close); for most of them, the estimation is indiscernible from the empirical quantile, obviously better than the ones obtained with the other methods.
    • For c11-math-0995, the results for the most extreme quantile are slightly less satisfactory than expected. We attribute this to a numerical instability in the integration procedure used in R. Indeed, for very large quantiles (c11-math-0996%), the convergence of the integral seems a bit more unstable (due to the use of the Vegas procedure of the R2Cuba package), which may explain why the accuracy decreases a bit and may sometimes be lower than with the Max method. This issue should be settled by using another R package or other software.
  • We have concentrated our study on the VaR risk measure because it is the one used in solvency regulations for both banks and insurance companies. However, the expected shortfall, which is the only coherent measure in the presence of fat tails, would be more appropriate for measuring the risk of the companies. The difference between the risk measure estimated by the CLT and the one estimated with Normex would certainly be much larger than what we obtain with the VaR when the risk is measured with the expected shortfall, pleading for using this measure in the presence of fat tails.

11.4.3.4 Normex in practice

From the results we obtained, Normex appears as the best method among the ones we studied, applicable for any c11-math-0997 and c11-math-0998. This comparison was done on simulated data. A next step will be to apply it on real data.

Let us sketch a step-by-step procedure on how Normex might be used and interpreted in practice on real data when considering aggregated heavy-tailed risks.

We have at our disposal a sample c11-math-0999, with unknown heavy-tailed cdf having positive tail index c11-math-1000. We order the sample as c11-math-1001 and consider the aggregated risks c11-math-1002 that can be rewritten as c11-math-1003.

  1. Preliminary step: Estimation of c11-math-1004 with standard EVT methods (e.g., the Hill estimator (Hill, 1975), the c11-math-1005-estimator (Kratz and Resnick, 1996; Beirlant et al., 1996), etc.); let c11-math-1006 denote an estimate of c11-math-1007.
  2. Define c11-math-1008 with c11-math-1009 (see (11.13)); the c11-math-1010 largest order statistics share the property of having an infinite c11-math-1011th moment, contrary to the c11-math-1012 first ones. Note that c11-math-1013 is independent of the aggregation size.
  3. The c11-math-1014 first order statistics and the c11-math-1015 last ones being, conditionally on c11-math-1016, independent, we apply the CLT to the sum of the c11-math-1017 first order statistics conditionally on c11-math-1018 and compute the distribution of the sum of the last c11-math-1019 ones conditionally on c11-math-1020 assuming a Pareto distribution for the r.v.'s because of (11.3).
  4. We can then approximate the cdf of c11-math-1021 by c11-math-1022 defined in Theorem 11.2, which provides a sharp approximation, easily computable whatever the size of the sample is.
  5. We deduce any quantile c11-math-1023 of order c11-math-1024 of c11-math-1025 as c11-math-1026, which allows an accurate evaluation of risk measures of aggregated heavy-tailed risks.
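The preliminary step of this procedure can be sketched with the Hill estimator, which estimates the tail index as the reciprocal of the average log-excess of the largest observations over a high order statistic. In the sketch below, `m` is the number of upper order statistics used by the Hill estimator; it is a tuning parameter of this sketch and is not the number of extremes removed in step 2. The data are simulated from an exact α-Pareto distribution with α = 2.5 purely for illustration; on real data the choice of `m` is delicate and usually guided by diagnostic plots.

```python
import math
import random


def hill_tail_index(sample, m):
    """Hill estimate of the tail index from the m largest observations."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= m < n:
        raise ValueError("m must satisfy 1 <= m < len(sample)")
    threshold = xs[n - m - 1]               # the (m + 1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold must be positive")
    mean_log_excess = sum(math.log(x / threshold) for x in xs[n - m:]) / m
    return 1.0 / mean_log_excess


rng = random.Random(2024)
true_alpha = 2.5
sample = [(1.0 - rng.random()) ** (-1.0 / true_alpha) for _ in range(20_000)]
alpha_hat = hill_tail_index(sample, m=2_000)
print(f"Hill estimate of the tail index: {alpha_hat:.3f} (simulated with alpha = {true_alpha})")
```

The estimate would then feed the choice of the number of extremes in step 2 and the Pareto fit of the largest order statistics in step 3.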

11.5 Conclusion

The main motivation of this study was to propose a sharp approximation of the entire distribution of aggregate risks when working on financial or insurance data under the presence of fat tails. It corresponds to one of the daily duties of actuaries when modeling investment or insurance portfolios. In particular the aim is to obtain the most accurate evaluations of risk measures. After reviewing the existing methods, we built two new methods, Normex and the weighted normal method. Normex is a method mixing a CLT and the exact distribution for a small number (defined according to the range of c11-math-1027 and the choice of the number of existing moments of order c11-math-1028) of the largest order statistics. The second approach is based on a weighted normal limit, with a shifted mean and a weighted variance, both expressed in terms of the tail distribution.

In this study, Normex has been proved, theoretically as well as numerically, to deliver a sharp approximation of the true distribution, for any sample size c11-math-1029 and for any positive tail index c11-math-1030, and is generally better than existing methods. The weighted normal method consists of trimming the total sum by taking away a large number of extremes and approximating the trimmed sum with a normal distribution and then shifting it by the (almost sure) limit of the average of the extremes and correcting the variance with a weight depending on the shape of the tail. It is a simple and reasonable tool, which allows one to express explicitly the tail contribution to be added to the VaR when applying the CLT to the entire sample. It has been developed empirically in this work and still requires further analytical study. It constitutes a simple and exploratory tool to remedy the underestimation of extreme quantiles over 99%.

An advantage of both methods, Normex and the weighted normal, is their generality. Indeed, trimming the total sum by taking away extremes having infinite moments (of order c11-math-1031) is always possible and allows one to better approximate the distribution of the trimmed sum with a normal one (via the CLT). Moreover, fitting a normal distribution for the mean behavior can apply, not only for the Pareto distribution but for any underlying distribution, without having to know about it, whereas for the extreme behavior, we pointed out that a Pareto type is standard in this context.

Normex could also be used from another point of view. We could apply it for a type of inverse problem to find out a range for the tail index c11-math-1032 when fitting this explicit mixed distribution to the empirical one. Note that this topic of tail index estimation has been studied extensively in the literature on the statistics of extremes (see, e.g., (Beirlant et al., 2004); (Reiss and Thomas, 2007), and references therein). Approaches to this estimation may be classified into two classes, supervised procedures in which the threshold to estimate the tail is chosen according to the problem (as, e.g., for seminal references, the weighted moments (Hosking and Wallis, 1987)), the MEP (Davison and Smith, 1990); (Hill, 1975), c11-math-1033 (Kratz and Resnick, 1996); (Beirlant et al., 1996)) methods, and unsupervised ones, where the threshold is algorithmically determined (as, e.g., in (Bengio and Carreau, 2009) and references therein, (Debbabi and Kratz, 2014)). Normex would then be classified as a new unsupervised approach, since the c11-math-1034 is chosen algorithmically for a range of c11-math-1035.

Other perspectives concern the application of this study to real data and its extension to the dependent case, using the CLT under weak dependence together with recent results on stable limits for sums of dependent infinite variance random variables (Bartkiewicz et al., 2012) and on large deviation principles (Mikosch and Wintenberger, 2013).

Finally, this study may constitute a first step toward understanding the behavior of VaR under aggregation and may help in analyzing its scaling behavior, the next important problem that we want to tackle.

References

  1. Acerbi, C., Tasche, D. On the coherence of expected shortfall. Journal of Banking & Finance 2002;26:1487–1503.
  2. Artzner, P., Delbaen, F., Eber, J.-M., Heath, D. Coherent measures of risk. Mathematical Finance 1999;9:203–228.
  3. Bartkiewicz, K., Jakubowski, A., Mikosch, T., Wintenberger, O. Stable limits for sums of dependent infinite variance random variables. Probab Theory Relat Fields 2012;150:337–372.
  4. Basel Committee on Banking Supervision. Developments in modelling risk aggregation. Basel: Bank for International Settlements; 2010.
  5. Beirlant, J., Goegebeur, Y., Segers, J., Teugels, J. Statistics of Extremes: Theory and Applications. New York: John Wiley & Sons, Inc.; 2004.
  6. Beirlant, J., Vynckier, P., Teugels, J. Tail index estimation, Pareto quantile plots, and regression diagnostics. J Am Stat Assoc 1996;91:1659–1667.
  7. Bellini, F., Klar, B., Müller, A., Rosazza Gianin, E. Generalized quantiles as risk measures; 2013. Preprint http://ssrn.com/abstract=2225751. Accessed 2016 May 2.
  8. Bengio, Y., Carreau, J. A hybrid Pareto model for asymmetric fat-tailed data: the univariate case. Extremes 2009;12(1):53–76.
  9. Cai, G.-H. The Berry-Esséen bound for identically distributed random variables by Stein method. Appl Math J Chin Univ 2012;27:455–461.
  10. Chen, L.H., Shao, Q.-M. Normal approximation under local dependence. Ann Probab 2004;32(3):1985–2028.
  11. Csörgö, S., Horváth, L., Mason, D. What portion of the sample makes a partial sum asymptotically stable or normal? Probab Theory Relat Fields 1986;72:1–16.
  12. Dacorogna, M.M., Gençay, R., Müller, U., Olsen, R., Pictet, O. An Introduction to High-Frequency Finance. New York: Academic Press; 2001.
  13. Dacorogna, M.M., Müller, U.A., Pictet, O.V., de Vries, C.G. The distribution of extremal foreign exchange rate returns in extremely large data sets. Extremes 2001;4(2):105–127.
  14. Daníelsson, J., Jorgensen, B., Samorodnitsky, G., Sarma, M., de Vries, C. Subadditivity re-examined: the case for Value-at-Risk. FMG Discussion Papers, London School of Economics; 2005.
  15. David, H.A., Nagaraja, H.N. Order Statistics. 3rd ed. New York: John Wiley & Sons, Inc.; 2003.
  16. Davison, A., Smith, R. Models for exceedances over high thresholds. J R Stat Soc Ser B 1990;52(3):393–442.
  17. Debbabi, N., Kratz, M. A new unsupervised threshold determination for hybrid models. IEEE-ICASSP; 2014.
  18. Embrechts, P., Klüppelberg, C., Mikosch, T. Modelling Extremal Events for Insurance and Finance. Berlin: Springer-Verlag; 1997.
  19. Embrechts, P., Lambrigger, D., Wüthrich, M. Multivariate extremes and the aggregation of dependent risks: examples and counter-examples. Extremes 2009;12:107–127.
  20. Embrechts, P., Puccetti, G., Rüschendorf, L. Model uncertainty and VaR aggregation. Journal of Banking & Finance 2013;37(8):2750–2764.
  21. Emmer, S., Kratz, M., Tasche, D. What is the best risk measure in practice? A comparison of standard risk measures. J Risk 2015;18:31–60.
  22. Esseen, C.-G. On the Liapunoff limit of error in the theory of probability. Ark Mat Astron Fysik 1942;A28:1–19. ISSN: 0365-4133.
  23. Feller, W. An Introduction to Probability Theory and its Applications. Volume II. New York: Wiley; 1966.
  24. Fisher, R.A., Tippett, L.H.C. Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proc Camb Philos Soc 1928;24:180–190.
  25. Furrer, H. Über die Konvergenz zentrierter und normierter Summen von Zufallsvariablen und ihre Auswirkungen auf die Risikomessung; 2012. ETH preprint.
  26. Galambos, J. The Asymptotic Theory of Extreme Order Statistics. New York: John Wiley & Sons, Inc.; 1978.
  27. Gneiting, T. Making and evaluating point forecasts. J Am Stat Assoc 2011;106(494):746–762.
  28. Hahn, M.G., Mason, D.M., Weiner, D.C. Sums, Trimmed Sums and Extremes. Progress in Probability. Volume 23. Cambridge (MA): Birkhäuser Boston; 1991.
  29. Hall, P. On the influence of extremes on the rate of convergence in the central limit theorem. Ann Probab 1984;12:154–172.
  30. Hill, B.M. A simple approach to inference about the tail of a distribution. Ann Stat 1975;3:1163–1174.
  31. Hosking, J., Wallis, J. Parameter and quantile estimation for the Generalized Pareto distribution. Technometrics 1987;29(3):339–349.
  32. Hosking, J.R.M., Wallis, J.R., Wood, E.F. Estimation of the generalized extreme-value distribution by the method of probability-weighted moments. Technometrics 1985;27:251–261.
  33. McCulloch, J.H. Measuring tail thickness to estimate the stable index α: a critique. J. of Business & Economic Statistics 1997;15:74–81.
  34. Jansen, D.W., De Vries, C.G. On the frequency of large stock returns: putting booms and busts into perspectives. Review of Economics and Statistics 1991;73:18–24.
  35. Jenkinson, A.F. The frequency distribution of the annual maximum (or minimum) values of meteorological elements. Q J R Meteorol Soc 1955;81:158–171.
  36. Korolev, V.Y., Shevtsova, I.G. On the upper bound for the absolute constant in the Berry-Esseen inequality. J Theory Probab Appl 2010;54(4):638–658.
  37. Kratz, M. There is a VaR beyond usual approximations. Towards a toolkit to compute risk measures of aggregated heavy tailed risks. FINMA report; 2013.
  38. Kratz, M. Normex, a new method for evaluating the distribution of aggregated heavy tailed risks. Application to risk measures. Extremes 2014;17(4) (Special issue: Extremes and Finance. Guest Ed. P. Embrechts). 661–691.
  39. Kratz, M., Resnick, S. The QQ-estimator and heavy tails. Stoch Models 1996;12:699–724.
  40. Leadbetter, R., Lindgren, G., Rootzén, H. Extremes and Related Properties of Random Sequences and Processes. New York: Springer-Verlag; 1983.
  41. Longin, F. The asymptotic distribution of extreme stock market returns. J Bus 1996;69:383–408.
  42. Mikosch, T., Wintenberger, O. Precise large deviations for dependent regularly varying sequences. Probab Theory Relat Fields 2013;156:851–887.
  43. Mori, T. On the limit distribution of lightly trimmed sums. Math Proc Camb Philos Soc 1984;96:507–516.
  44. Müller, U.A., Dacorogna, M.M., Pictet, O.V. Heavy tails in high-frequency financial data. In: Taqqu, M., editor. A Practical Guide to Heavy Tails: Statistical Techniques for Analysing Heavy Tailed Distributions. Boston (MA): Birkhäuser; 1998.
  45. Nolan, J. 2012. Available at http://academic2.american.edu/jpnolan/stable/stable.html. Accessed 2016 May 2.
  46. Petrov, V.V. A local theorem for densities of sums of independent random variables. J Theory Probab Appl 1956;1(3):316–322.
  47. Petrov, V.V. Limit Theorems of Probability Theory: Sequences of Independent Random Variables. Oxford: Oxford Sciences Publications; 1995.
  48. Pickands, J. Statistical inference using extreme order statistics. Ann Stat 1975;3:119–131.
  49. Pictet, O., Dacorogna, M., Müller, U.A. Hill, bootstrap and jackknife estimators for heavy tails. In: Taqqu, M., editor. A Practical Guide to Heavy Tails: Statistical Techniques for Analysing Heavy Tailed Distributions. Boston (MA): Birkhäuser; 1998.
  50. Pinelis, I. On the nonuniform Berry-Esséen bound; 2013. arxiv.org/pdf/1301.2828.
  51. Ramsay, C.M. The distribution of sums of certain i.i.d. Pareto variates. Commun Stat Theory Methods 2006;35(3):395–405.
  52. Reiss, R.-D., Thomas, M. Statistical Analysis of Extreme Values: With Applications to Insurance, Finance, Hydrology and Other Fields. 3rd ed. Birkhäuser Basel; 2007.
  53. Resnick, S. Extreme Values, Regular Variation, and Point Processes. 1st ed. New York: Springer-Verlag; 1987, 2008.
  54. Resnick, S. Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. New York: Springer-Verlag; 2006.
  55. Samorodnitsky, G., Taqqu, M.S. Stable non-Gaussian Random Processes: Stochastic Models with Infinite Variance. New York: Chapman & Hall; 1994.
  56. Shevtsova, I.G. About the rate of convergence in the local limit theorem for densities under various moment conditions. Statistical Methods of Estimation and Hypotheses Testing Volume 20, Perm, Russia; 2007. p 1–26 (in Russian).
  57. Shevtsova, I.G. On the absolute constants in the Berry-Esséen inequality and its structural and nonuniform improvements. Informatics and its Applications 2013;7(1):124–125 (in Russian).
  58. Stein, C. A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. Proceedings of the 6th Berkeley Symposium on Mathematical Statistics and Probability, Volume 2; 1972. p 583–602.
  59. Stein, C. Approximate Computations of Expectations. Volume 7, Lecture Notes—Monograph Series. Hayward (CA): IMS; 1986.
  60. Taylor, S.J. Modelling Financial Time Series. Chichester: John Wiley & Sons; 1986.
  61. Tyurin, I.S. An improvement of upper estimates of the constants in the Lyapunov theorem. Russian Mathematical Surveys 2010;65(3):201–202.
  62. von Mises, R. La distribution de la plus grande de n valeurs. Revue Mathématique de l'Union Interbalkanique 1936;1:141–160.
  63. Zaliapin, I.V., Kagan, Y.Y., Schoenberg, F.P. Approximating the distribution of Pareto sums. Pure Appl Geophys 2005;162:1187–1228.
  64. Ziegel, J.F. Coherence and elicitability. Mathematical Finance 2014 (online 2014).