M. de Carvalho*
Faculty of Mathematics, Pontificia Universidad Católica de Chile, Santiago, Chile
My experience discussing concepts of risk and statistics of extremes with practitioners started in 2009, while I was a visiting researcher at the Portuguese Central Bank (Banco de Portugal). At the beginning, practitioner colleagues were intrigued by the methods I was applying, and the questions were recurrent: “What is the difference between statistics of extremes and survival analysis (or duration analysis)?1 And why don't you apply empirical estimators?” The short answer is that when modeling rare catastrophic events, we need to extrapolate beyond observed data—into the tails of a distribution—and standard inference methods often fail to deal with this properly. To see this, suppose that we observe a random sample of losses $Y_1, \dots, Y_n$ and that we estimate the survivor function $S(y) = P(Y > y)$, using the empirical survivor function, $\widehat{S}_n(y) = n^{-1} \sum_{i=1}^n I(Y_i > y)$, for $y \geq 0$. Now, suppose that we want to assess the probability of observing a loss just larger than the maximum observed loss, $Y_{n,n} = \max\{Y_1, \dots, Y_n\}$. Obviously, the probability of that event turns out to be zero [$\widehat{S}_n(y) = 0$, for all $y \geq Y_{n,n}$], thus illustrating that the empirical survivor function is unable to extrapolate into the right tail of the loss distribution. As put simply by Taleb (2012, p. 46), “the fool believes that the tallest mountain in the world will be equal to the tallest one he has observed.”
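The point is easy to make concrete with a small simulation sketch (the generalized Pareto loss distribution, sample size, and threshold below are arbitrary choices of mine): the empirical survivor function assigns probability exactly zero to any loss beyond the sample maximum, whereas a standard peaks-over-threshold fit extrapolates a strictly positive tail probability.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(1)
losses = genpareto.rvs(c=0.3, size=5000, random_state=rng)  # toy heavy-tailed losses

y = 1.01 * losses.max()          # a loss just beyond the observed maximum

# Empirical survivor function: proportion of losses exceeding y
S_emp = np.mean(losses > y)      # exactly 0 for any y >= max(losses)

# Peaks-over-threshold: fit a GPD to exceedances over a high threshold u
u = np.quantile(losses, 0.9)
exc = losses[losses > u] - u
xi_hat, _, sigma_hat = genpareto.fit(exc, floc=0)

# Extrapolated tail probability: P(Y > y) = P(Y > u) P(Y - u > y - u | Y > u)
S_pot = np.mean(losses > u) * genpareto.sf(y - u, xi_hat, scale=sigma_hat)

print(S_emp)   # 0.0
print(S_pot)   # small, but strictly positive
```

The particular numbers are immaterial; what matters is that the empirical estimate vanishes beyond the largest observed loss by construction, while the fitted tail does not.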
In this chapter, I revisit some viewpoints that I shared with the discussion group “Future of Statistics of Extremes” at the ESSEC Conference on Extreme Events in Finance, which took place at Royaumont Abbey, France, on 15–17 December 2014
extreme-events-in-finance.essec.edu
and which led to the editor's invitation to write this chapter. My goal is to provide a personal view on some recent concepts and methods of statistics of extremes and to discuss challenges and opportunities that could lead to future developments. The scope is far from encyclopedic, and many other interesting perspectives can be found throughout this monograph.
In Section 9.2, I note that a bivariate extreme value distribution is an example of what I call here a measure-dependent measure, I briefly review kernel density estimators for the spectral density, and I discuss families of spectral measures. In Section 9.3, I argue that the spectral density ratio model (de Carvalho and Davison, 2014), the proportional tails model (Einmahl et al., 2015), and the exponential families for heavy-tailed data (Fithian and Wager, 2015) share similar construction principles; in addition, I discuss en passant a new nonparametric estimator for the so-called scedasis function, which is one of the main estimation targets in the proportional tails model. Comments on potential future developments are scattered across the chapter, and a miscellany of topics is included in Section 9.4.
Throughout this chapter I use the acronym EVD to denote extreme value distribution.
Let $P_\theta$ be a probability measure on a measurable space, and let $\Theta$ be a parameter space. The family $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ is a statistical model. Obviously, not every statistical model is appropriate for modeling risk. As mentioned in Section 9.1, candidate statistical models should possess the ability to extrapolate into the tails of a distribution, beyond existing data.
See Coles (2001, Theorem 3.1.1). Here, $\mu \in \mathbb{R}$ and $\sigma > 0$ are location and scale parameters, while $\xi \in \mathbb{R}$ is a shape parameter that determines the rate of decay of the tail: $\xi = 0$, light tail (Gumbel); $\xi > 0$, heavy tail (Fréchet); $\xi < 0$, short tail (Weibull). The generalized EVD in (9.1) is a three-parameter family that plays an important role in statistics of univariate extremes.
In some cases we want to assess the risk of observing simultaneously large values of two random variables (say, two simultaneous large losses in a portfolio), and the mathematical basis for such modeling is that of statistics of bivariate extremes. In this context, “extremal dependence” is often interpreted as a synonym of risk. Moving from one dimension to two sharply increases the complexity of models for extremes. The first challenge one faces when modeling bivariate extremes is that the estimation object of interest is infinite dimensional, whereas in the univariate case only three parameters are needed. The intuition is the following: when modeling bivariate extremes, apart from the marginal distributions, we are also interested in the extremal dependence structure of the data, and—as we shall see in Theorem 9.4—only an infinite-dimensional object is flexible enough to capture the “spectrum” of all possible types of dependence.
Let $(X, Y)$ be a random vector, where I assume that $X$ and $Y$ are unit Fréchet marginally distributed, that is, $P(X \leq z) = P(Y \leq z) = \exp(-1/z)$, for $z > 0$. Similarly to the univariate case, the classical theory for characterizing the extremal behavior of bivariate extremes is based on block maxima, here given by the componentwise maxima $(M_{n,1}, M_{n,2}) = (\max_i X_i, \max_i Y_i)$; note that the componentwise maxima need not be a sample point. Similarly to the univariate case, we focus on the standardized maxima, which for unit Fréchet marginals is given by the standardized componentwise maxima $(M_{n,1}/n, M_{n,2}/n)$. Next, I define a special type of statistical model that plays a key role in bivariate extreme value modeling.
What are the relevant statistical models for statistics of bivariate extremes? Is there an extension of the generalized EVD for the bivariate setting? The following is a bivariate analogue to Theorem 9.1.
See Coles (2001, Theorem 8.1). Throughout I refer to the distribution $G$ in (9.3) as a bivariate EVD. Note the similarities between (9.1) and (9.3): both start with an “exp,” but the exponent of the bivariate EVD is governed by the spectral measure $H$, whereas that of the univariate EVD is governed by the three parameters $(\mu, \sigma, \xi)$. To understand why $H$ needs to obey the moment constraint (9.2), let $x \to \infty$ or $y \to \infty$ in (9.3), so as to recover the unit Fréchet marginals. Some further comments are in order. First, since (9.2) is the only constraint on $H$, neither $H$ nor the bivariate EVD can have a finite parameterization. Second, a bivariate extreme value distribution is an example of a measure-dependent measure, as introduced in Definition 9.2.
A pseudo-polar transformation is useful for understanding the role of the so-called spectral measure $H$. Define $(R, W) = (X + Y, X/(X + Y))$, and refer to $R$ and $W$ as the pseudo-radius and pseudo-angle, respectively. If $X$ is relatively large, then $W \approx 1$; if $Y$ is relatively large, then $W \approx 0$. de Haan and Resnick (1977) have shown that $P(W \leq w \mid R > r) \to H(w)$, as $r \to \infty$. Thus, when the pseudo-radius is large, the pseudo-angles are approximately distributed according to $H$. Perfect (extremal) dependence corresponds to $H$ being degenerate at $1/2$, whereas independence corresponds to $H$ placing half of its mass at 0 and the other half at 1. The spectral probability measure determines the interactions between joint extremes and is thus an estimation target of interest; other functionals of the spectral measure are also often used, such as the spectral density $h = \mathrm{d}H/\mathrm{d}w$ or the Pickands (1981) dependence function $A(w)$, for $w \in [0, 1]$. The cases of extremal independence and perfect extremal dependence correspond, respectively, to the bivariate EVDs $G(x, y) = \exp\{-(1/x + 1/y)\}$ and $G(x, y) = \exp\{-\max(1/x, 1/y)\}$, for $x, y > 0$.
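The pseudo-polar transformation is easy to illustrate by simulation. In the sketch below I use independent unit Fréchet pairs (so that the limiting spectral measure places half of its mass at each endpoint); the sample size and radial threshold are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20000
# Independent unit Fréchet pairs: X = -1/log(U), with U ~ Uniform(0, 1)
x = -1.0 / np.log(rng.uniform(size=n))
y = -1.0 / np.log(rng.uniform(size=n))

# Pseudo-polar transformation
r = x + y          # pseudo-radius
w = x / r          # pseudo-angle in (0, 1)

# Retain pseudo-angles whose pseudo-radius exceeds a high threshold
w_ext = w[r > np.quantile(r, 0.95)]

# Under independence, H puts mass 1/2 at each endpoint, so the extreme
# pseudo-angles should pile up near 0 and 1, with mean close to 1/2
print(round(np.mean((w_ext < 0.1) | (w_ext > 0.9)), 2))
print(round(float(w_ext.mean()), 2))
```

With dependent pairs (e.g., from a logistic model), the retained pseudo-angles would instead concentrate toward the interior of the unit interval.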
In practice, we have to deal with a statistical problem—lack of knowledge on $H$—and an inference challenge—that is, obtaining estimates that obey the marginal moment constraint and define a density on the unit interval. Indeed, as posed by Coles (2001, p. 146), “it is not straightforward to constrain nonparametric estimators to satisfy functional constraints of the type” of Eq. (9.2). Inference should be conducted using pseudo-angles $w_1, \dots, w_k$, which are constructed from a sample of size $n$ by thresholding the pseudo-radius at a sufficiently high threshold. Kernel smoothing estimators for the spectral density have recently been proposed by de Carvalho et al. (2013) and are based on
Here $\beta(w; a, b)$ denotes the beta density with shape parameters $a, b > 0$, and $\nu > 0$ is a parameter responsible for the level of smoothing, which can be obtained through cross-validation. Each beta density is centered around a pseudo-angle in the sense that $\int_0^1 w \, \beta(w; \nu w_i, \nu(1 - w_i)) \, \mathrm{d}w = w_i$, for $i = 1, \dots, k$. And how can we obtain the probability masses, $p_1, \dots, p_k$? There are at least two options. A simple one is to consider Euclidean likelihood methods (Owen, 2001, pp. 63–66), in which case the vector of probability masses solves
By the method of Lagrange multipliers, we obtain $\hat{p}_i = k^{-1}\{1 - (\overline{w} - 1/2) S^{-2} (w_i - \overline{w})\}$, where $\overline{w} = k^{-1} \sum_{i=1}^k w_i$ and $S^2 = k^{-1} \sum_{i=1}^k (w_i - \overline{w})^2$. This yields the following estimator, known as the smooth Euclidean likelihood spectral density:
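A minimal numerical sketch of the Euclidean likelihood construction is given below; it assumes the closed-form masses $p_i = k^{-1}\{1 - (\overline{w} - 1/2) S^{-2} (w_i - \overline{w})\}$, with $\overline{w}$ and $S^2$ the sample mean and variance of the pseudo-angles, and the simulated “pseudo-angles” are a stand-in rather than thresholded data.

```python
import numpy as np
from scipy.stats import beta

def euclidean_weights(w):
    """Euclidean likelihood masses p_i = (1/k){1 - (wbar - 1/2) S^-2 (w_i - wbar)};
    they satisfy sum p_i = 1 and sum p_i w_i = 1/2 exactly, by construction."""
    k = len(w)
    wbar = w.mean()
    s2 = np.mean((w - wbar) ** 2)
    return (1.0 - (wbar - 0.5) * (w - wbar) / s2) / k

def smooth_euclidean_density(grid, w, nu=50.0):
    """Smooth Euclidean likelihood spectral density: a beta mixture with
    component means at the pseudo-angles and Euclidean likelihood masses."""
    p = euclidean_weights(w)
    dens = np.zeros_like(grid, dtype=float)
    for pi, wi in zip(p, w):
        dens += pi * beta.pdf(grid, nu * wi, nu * (1.0 - wi))
    return dens

rng = np.random.default_rng(0)
w = rng.beta(2.0, 2.0, size=200)        # stand-in pseudo-angles
p = euclidean_weights(w)
print(round(float(p.sum()), 8))         # 1.0: normalization constraint
print(round(float(np.sum(p * w)), 8))   # 0.5: moment constraint
```

Both printed constraints hold exactly by the algebra above, although individual masses can be negative—a known quirk of Euclidean likelihood.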
Another option proposed by de Carvalho et al. (2013) is to consider a similar approach to that of Einmahl and Segers (2009), in which case the vector of probability masses solves the following empirical likelihood (Owen, 2001) problem:
Again by the method of Lagrange multipliers, the solution is $\hat{p}_i = k^{-1} \{1 + \lambda(w_i - 1/2)\}^{-1}$, for $i = 1, \dots, k$, where $\lambda$ is the Lagrange multiplier associated with the second equality constraint in (9.7), defined implicitly as the solution to the equation
This yields the following estimator, known as the smooth empirical likelihood spectral density:
One can readily construct smooth estimators for the corresponding spectral measures; the smooth Euclidean spectral measure and smooth empirical likelihood spectral measure are, respectively, given by
where $I_w(a, b)$ denotes the regularized incomplete beta function, with $a, b > 0$. By construction, both estimators, (9.6) and (9.8), obey the moment constraint, so that, for example,
Put differently, realizations of these random probability measures are elements of the set of spectral measures obeying (9.2). Examples of applications of these estimators in finance can be found in Kiriliouk et al. (2015, Figure 4). At the moment, the large sample properties of these estimators remain unknown.
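As a small numerical check of the empirical likelihood construction, the sketch below assumes the masses $p_i = [k\{1 + \lambda(w_i - 1/2)\}]^{-1}$, with $\lambda$ found by a one-dimensional root search over the constraint equation; the “pseudo-angles” are again simulated stand-ins.

```python
import numpy as np
from scipy.optimize import brentq

def empirical_likelihood_weights(w):
    """Empirical likelihood masses p_i = 1 / [k {1 + lam (w_i - 1/2)}],
    with lam solving sum_i (w_i - 1/2) / {1 + lam (w_i - 1/2)} = 0."""
    k = len(w)
    a = w - 0.5
    # 1 + lam * a_i must remain positive for every pseudo-angle, which
    # brackets lam between -1/max(a) and -1/min(a)
    lo = -1.0 / a.max() + 1e-8
    hi = -1.0 / a.min() - 1e-8
    lam = brentq(lambda l: np.sum(a / (1.0 + l * a)), lo, hi)
    return 1.0 / (k * (1.0 + lam * a))

rng = np.random.default_rng(1)
w = rng.beta(3.0, 2.0, size=300)       # stand-in pseudo-angles, mean away from 1/2
p = empirical_likelihood_weights(w)
print(round(float(p.sum()), 6))        # 1.0: normalization constraint
print(round(float(np.sum(p * w)), 6))  # 0.5: moment constraint
```

Unlike the Euclidean likelihood masses, these weights are positive by construction, at the price of the root search for $\lambda$.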
Other estimators for the spectral measure (obeying (9.2)) can be found in Boldi and Davison (2007), Guillotte et al. (2011), and Sabourin and Naveau (2014).
Formally, $\{P_x : x \in \mathcal{X}\}$ is a set of predictor-dependent (henceforth pd) probability measures if the $P_x$ are probability measures on $([0, 1], \mathcal{B}_{[0,1]})$, indexed by a covariate $x \in \mathcal{X}$; here $\mathcal{B}_{[0,1]}$ is the Borel sigma-algebra on $[0, 1]$. Analogously, I define the following:
And why do we care about pd spectral measures? Pd spectral measures allow us to assess how extremal dependence evolves over a certain covariate $x$; that is, they allow us to model nonstationary extremal dependence structures. Indeed, in many settings of applied interest it seems natural to regard risk from a covariate-adjusted viewpoint, and this leads us to ideas of “conditional risk.” If we want to develop such ideas for bivariate extremes—that is, if we want to assess systematic variation of risk according to a covariate—we need to allow for nonstationary extremal dependence structures.
To describe how extremal dependence may change over a predictor, I now introduce the concept of spectral surface.
A simple spectral surface can be constructed from a parametric spectral density whose parameter is indexed by the predictor. In Figure 9.1, I represent a spectral surface based on such a model; larger values of the predictor lead to larger levels of extremal dependence. Other spectral surfaces can be readily constructed from parametric models for the spectral density; see, for instance, Coles (2001, Section 8.2.1).
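As a toy illustration of a spectral surface (the symmetric beta family and the log-link for its concentration parameter below are my own choices for illustration, not the model behind Figure 9.1): a pd spectral density $h_x(w) = \beta(w; \alpha(x), \alpha(x))$, with $\alpha(x) = \exp(\beta_0 + \beta_1 x)$, satisfies the moment constraint for every $x$ by symmetry, and larger $x$ concentrates mass at $w = 1/2$, that is, stronger extremal dependence.

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import beta

def pd_spectral_density(w, x, b0=0.0, b1=1.0):
    """Toy pd spectral density h_x(w) = beta(w; alpha(x), alpha(x)),
    with log-link alpha(x) = exp(b0 + b1 * x).  Symmetry about 1/2
    guarantees the moment constraint for every covariate value x."""
    a = np.exp(b0 + b1 * x)
    return beta.pdf(w, a, a)

grid = np.linspace(0.0, 1.0, 1001)
for x in (0.0, 1.0, 2.0):
    h = pd_spectral_density(grid, x)
    moment = trapezoid(grid * h, grid)   # numerically 1/2 for every x
    print(round(float(moment), 3))
```

Evaluating the surface over a grid of $(w, x)$ pairs gives exactly the kind of plot shown in Figure 9.1, with mass shifting toward $w = 1/2$ as the predictor grows.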
Let's now regard the subject of pd bivariate extremes from another viewpoint. Modeling nonstationarity in marginal distributions has been the focus of much recent literature in applied extreme value modeling; see for instance Coles (2001, Chapter 6). The simplest approach in this setting was popularized long ago by Davison and Smith (1990), and it is based on indexing the location and scale parameters of the generalized EVD by a predictor, say, by taking
And how should one model “nonstationary bivariate extremes,” if one must? Surprisingly, by comparison with the marginal case, approaches to modeling nonstationarity in the extremal dependence structure have received relatively little attention. Such approaches are important for assessing the dynamics governing the extremal dependence of variables of interest. For example, has the extremal dependence between returns of the CAC 40 and the DAX 30 been constant over time, or has it been changing over the years?
By using pd spectral measures, we are essentially indexing the parameter of the bivariate extreme value distribution (the spectral measure $H$) by a covariate, and thus the approach can be regarded as an analogue of the Davison–Smith paradigm in (9.10), but for the bivariate setting. In the same way that (9.10) is a covariate-adjusted version of the generalized EVD (9.1), the following concept can be regarded as a pd version of the bivariate EVD in (9.3).
Similarly to Section 9.2.2, in practice we need to obtain estimates that obey the marginal moment constraint and define a density on the unit interval, for all $x$. It is not straightforward to construct nonparametric estimators able to yield valid pd spectral measures. Indeed, any such estimator, $\widehat{H}_x$, needs to obey the moment constraint, that is, $\int_0^1 w \, \mathrm{d}\widehat{H}_x(w) = 1/2$, for all $x \in \mathcal{X}$. Castro and de Carvalho (2016) and Castro et al. (2015) are currently developing models for these contexts, but there are still plenty of opportunities here.3
Needless to say, other pd objects of interest can readily be constructed. For example, a pd version of the Pickands (1981) dependence function can be defined as $A_x(w)$, for $w \in [0, 1]$, and a pd coefficient of extremal dependence, $\chi_x$, can also be constructed. Using the fact that $\chi = 2\{1 - A(1/2)\}$ (de Carvalho and Ramos, 2012, p. 91), the pd $\chi_x$ can be defined as $\chi_x = 2\{1 - A_x(1/2)\}$, for $x \in \mathcal{X}$.
Beyond pd spectral measures, other families of spectral measures are of interest. In a recent paper, de Carvalho and Davison (2014) proposed a model for a family of spectral measures $\{H_1, \dots, H_K\}$. The applied motivation for the concept was to track the effect of explanatory variables on joint extremes. Put differently, their main concern was with the joint modeling of extremal events when data are gathered from several populations, to each of which corresponds a vector of covariates. Thus, conceptually, some of the ingredients of pd spectral measures and related modeling objectives are already present in de Carvalho and Davison (2014). Each element in the family should be regarded as a “distorted version” of a baseline spectral measure, in a sense that I will make precise in what follows. Formally, spectral density ratio families are defined as follows.
From (9.11), we can write all the normalization and moment constraints for this family as a function of the baseline spectral measure and the tilting parameters, that is,
Inference is based on the combined sample of pseudo-angles from the $K$ spectral distributions. Details on estimation and inference through empirical likelihood methods can be found in de Carvalho and Davison (2011, 2014). An extremely appealing feature of their model is that it allows for borrowing strength across samples, in the sense that the estimate of each spectral measure is based on the pooled pseudo-angles, instead of simply those from the corresponding sample. Although flexible, their approach requires a substantial computational investment; in particular, inference entails intensive constrained optimization problems—even for a moderate $K$—so that the estimates obey empirical versions of the normalization and moment constraints in (9.12). Their approach allows for modeling extremal dependence in settings such as Figure 9.2a, but it excludes data configurations such as Figure 9.2b. The pd-based approach of Castro et al. (2015) allows for inference to be conducted in both settings in Figure 9.2.
The main goal of this section is to describe the link between the specifications underlying the spectral density ratio model, discussed in Section 9.2.4, the proportional tails model (Einmahl et al., 2015), and the exponential families for heavy-tailed data (Fithian and Wager, 2015).
The proportional tails model is essentially an approach for modeling nonstationary extremes. Suppose that at time points $1, \dots, n$ we gather independent observations $X_1, \dots, X_n$, respectively sampled from continuous distribution functions $F_1, \dots, F_n$, all with a common right endpoint $x^\star$. Suppose further that there exists a (time-invariant) baseline distribution function $F$, also with right endpoint $x^\star$, and a continuous function $c$ on $[0, 1]$, such that
Here $c$ is the so-called scedasis density, and following Einmahl et al. (2015) I assume the normalization constraint $\int_0^1 c(s) \, \mathrm{d}s = 1$. Equation (9.13) is the key specification of the proportional tails model. Roughly speaking, the scedasis density tells us how much more (or less) mass there is in the tail $1 - F_i(x)$, relative to the baseline tail, $1 - F(x)$, for a large $x$; uniform scedasis corresponds to a constant frequency of extremes over time.
The question arises naturally: “If the scedasis density provides an indication of the ‘relative frequency’ of extremes over time, would it seem natural that such a function could be somehow connected to the intensity measure of the point process characterization of univariate extremes (Coles, 2001, Section 7.3)?” To get an idea of how the concepts relate, I sketch here a heuristic argument; I stress that the argument is heuristic, and my aim does not go beyond shedding some light on how these ideas connect. Consider the following artificial setting. Suppose that we could gather a large sample from $F$ and that at each time point $i$ we could also collect a large sample from $F_i$, for $i = 1, \dots, n$. Then, the definition of scedasis in (9.13) and arguments similar to those in Coles (2001, Section 4.2.2) suggest that, for a sufficiently large threshold,
where $\Lambda$ is the intensity measure of the limiting Poisson process for univariate extremes (cf. Coles, 2001, Theorem 7.1.1). Thus, it can be seen from (9.14) that, in this artificial setting, the scedasis density can be literally interpreted as a measure of the relative intensity of the extremes at period $i$, with respect to a (time-invariant) baseline.
Another important question is: “How can we estimate the scedasis density?” Einmahl et al. (2015) propose a kernel-based estimator
where $h > 0$ is a bandwidth and $K$ is a kernel; in addition, $X_{n,1} \leq \cdots \leq X_{n,n}$ denote the order statistics of $X_1, \dots, X_n$. Specifically, Einmahl et al. (2015) recommend $K$ to be a symmetric kernel on $[-1, 1]$. A conceptual problem with using a kernel on $[-1, 1]$ is that it allows the scedasis density to put mass outside $[0, 1]$.4 Using ideas similar to those involved in the construction of the smooth spectral density estimators in Section 9.2.2, I propose here the following estimator:
Indeed, each beta density is centered close to $i/n$, with the concentration determined by the parameter controlling the level of smoothing. My goal here is not to recommend one estimator over the other, but rather to provide a brief description of the strengths and limitations of both approaches. In Figure 9.3, I illustrate how the two estimators, (9.15) and (9.16), perform on the same data used by Einmahl et al. (2015) and on simulated data (single-run experiment).5 The data consist of daily negative returns of the Standard and Poor's index from 1988 to 2007, and I use the same value for $k$ and the same bandwidth and (biweight) kernel [$K(s) = \tfrac{15}{16}(1 - s^2)^2$, for $|s| \leq 1$] as the authors; I also follow the authors' settings for the simulated data. Finally, I fix the smoothing parameter for illustration.
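For readers who want to experiment, here is a sketch of both constructions (on simulated stationary data, so the true scedasis is flat). The kernel estimator follows the displayed form of (9.15); the beta mixture is my own variant of the idea behind (9.16), with shape parameters $1 + \nu(i/n)$ and $1 + \nu(1 - i/n)$, chosen so that both parameters stay above one and the density is finite on all of $[0, 1]$.

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import beta

def biweight(t):
    """Biweight kernel K(t) = (15/16)(1 - t^2)^2 on [-1, 1]."""
    return np.where(np.abs(t) <= 1.0, 15.0 / 16.0 * (1.0 - t ** 2) ** 2, 0.0)

def scedasis_kernel(s, x, k, h):
    """Kernel scedasis estimator: (1/(kh)) sum_i 1{X_i > X_{n,n-k}} K((s - i/n)/h)."""
    n = len(x)
    thresh = np.sort(x)[n - k - 1]       # (k+1)-th largest observation
    exceed = x > thresh                  # exactly k exceedances for continuous data
    t = np.arange(1, n + 1) / n
    s = np.atleast_1d(s)
    return np.array([np.sum(exceed * biweight((si - t) / h)) for si in s]) / (k * h)

def scedasis_beta(s, x, k, nu=50.0):
    """Beta-mixture sketch: the exceedance at time i/n contributes a beta
    density with shapes 1 + nu*(i/n) and 1 + nu*(1 - i/n), so all of the
    mass stays inside [0, 1] and the density is finite at the endpoints."""
    n = len(x)
    thresh = np.sort(x)[n - k - 1]
    t = np.arange(1, n + 1) / n
    s = np.atleast_1d(s)
    dens = np.zeros_like(s, dtype=float)
    for ti in t[x > thresh]:
        dens += beta.pdf(s, 1.0 + nu * ti, 1.0 + nu * (1.0 - ti)) / k
    return dens

rng = np.random.default_rng(7)
x = rng.pareto(3.0, size=2000)           # stationary series: true scedasis is flat
grid = np.linspace(0.0, 1.0, 501)
c_kern = scedasis_kernel(grid, x, k=100, h=0.1)
c_beta = scedasis_beta(grid, x, k=100)
print(round(float(trapezoid(c_beta, grid)), 2))   # beta mixture integrates to 1
```

On this stationary toy series both estimates hover around one in the interior, with the usual boundary effects showing up only in the kernel version.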
In the Standard and Poor's example in Figure 9.3a, it can be seen that both estimators capture similar dynamics; the gray rectangles represent contraction periods of the US economy as dated by the National Bureau of Economic Research (NBER). It is interesting to observe that the local maxima of the scedasis density are relatively close to economic contraction periods. Indeed, “turning points” (local maxima and minima) of the scedasis density seem like an interesting estimation target for many settings of applied interest.
The estimator in (9.16) has the appealing feature of putting all the mass of the scedasis density inside the unit interval, and some further numerical experiments suggest that it tends to behave similarly to (9.15), except at the boundary. However, a shortcoming of the method in (9.16) is that it may not be defined at the endpoints 0 and 1, and hence it could be inappropriate for forecasting purposes.
The proportional tails model is extremely appealing and simple to fit. A possible shortcoming is that it does not allow the tail index to change over time. For applications in which we suspect that the tail index may change over time, the generalized additive approach of Chavez-Demoulin and Davison (2005) is a sensible alternative; although the model is more challenging to implement, it can be readily fitted with the R package QRM.
A problem that seems relevant in practice is that of cluster analysis for the proportional tails model. To see this, suppose that one estimates the scedasis density and the tail index for several stocks. It seems natural to wonder: “How can we cluster stocks whose scedasis functions look most alike, or—perhaps more interestingly—how can we cluster stocks with both a similar scedasis and a similar tail index?”
Lastly, I would like to comment that it seems conceivable that Bernstein polynomials could be used for scedasis density estimation. In particular, a natural question is “Would it be possible to construct a prior over the space of all integrated scedasis functions?” Random Bernstein polynomials could seem like the way to go; see Petrone (1999) and references therein.
In this section, I sketch some basic ideas on exponential families for heavy-tailed data; I will be briefer here than in Section 9.3.1. My goal is mainly to introduce the model specification and move on; further details can be found in Fithian and Wager (2015).
The starting point for the Fithian–Wager approach is to model the conditional right tail law of a population of interest, $P_1(\cdot \mid X > u)$, as an exponential family with carrier measure $P_0(\cdot \mid X > u)$, for a sufficiently large threshold $u$. Two random samples are assumed to be available, one from $P_0$ and one from $P_1$, with sizes $n_0 \gg n_1$; hence, the applied setting of interest is one where the sample from $P_0$ is much larger than the one from $P_1$. The model specification is
where the sufficient statistic takes a specific parametric form; this functional form is motivated by the case where $P_0$ and $P_1$ are generalized Pareto distributions (cf. Fithian and Wager, 2015, p. 487).
In common with the spectral density ratio model, the Fithian–Wager model is motivated by the gains from borrowing strength across samples. Fithian and Wager are not, however, concerned with spectral measures, but rather with estimating a (small-sample) mean of a heavy-tailed distribution by borrowing information from a much larger sample from a related population with the same tail behavior. More concretely, the authors propose a semiparametric method for estimating the mean of $P_1$, using the decomposition $\mu = P(X \leq u) \, E(X \mid X \leq u) + P(X > u) \, \{u + M(u)\}$, where $M(u) = E(X - u \mid X > u)$ is the mean residual life. The Fithian–Wager estimator for the (small-sample) mean can be written as
where the plug-in components are computed from the data, for a large threshold $u$. Here, the tilt can be computed through a logistic regression with an intercept and the sufficient statistic as predictor, as a consequence of results on imbalanced logistic regression (Owen, 2007). As can be observed from (9.18), the main trick in the estimation of the mean is the exponential tilt-based estimator of the mean residual lifetime $M(u)$.
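The Fithian–Wager tilt itself requires the second, larger sample, but the decomposition that (9.18) plugs into is easy to sketch on a single sample. Below, the mean residual life is estimated by a plain GPD fit to the exceedances, $M(u) \approx \hat{\sigma}/(1 - \hat{\xi})$ for $\hat{\xi} < 1$; this illustrates the decomposition only, not the exponential tilting step.

```python
import numpy as np
from scipy.stats import genpareto

def tail_decomposed_mean(x, u):
    """Mean via mu = P(X<=u) E[X | X<=u] + P(X>u) (u + M(u)), with the
    mean residual life M(u) = sigma/(1 - xi) taken from a GPD fit to the
    exceedances over u (valid for xi < 1)."""
    below = x[x <= u]
    exc = x[x > u] - u
    p_tail = exc.size / x.size
    xi_hat, _, sigma_hat = genpareto.fit(exc, floc=0)
    m_u = sigma_hat / (1.0 - xi_hat)     # GPD mean residual life
    return (1.0 - p_tail) * below.mean() + p_tail * (u + m_u)

rng = np.random.default_rng(3)
x = genpareto.rvs(c=0.2, scale=1.0, size=5000, random_state=rng)
u = np.quantile(x, 0.9)
print(tail_decomposed_mean(x, u))   # should be close to the true mean 1/(1 - 0.2)
```

Replacing the single-sample GPD fit of $M(u)$ by the exponential tilt-based estimator—fitted on the pooled exceedances—is precisely where the Fithian–Wager approach gains efficiency.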
From the previous sections, it may have become obvious that the common link underlying the specifications of the spectral density ratio model, the proportional tails model, and the exponential families for heavy-tailed data is the assumption that all members of a family of interest are obtained through a suitable “distortion” of a certain baseline measure. In this section, I make this link more precise.
Some examples are presented in the succeeding text.
Here, I comment on the need for further developing models compatible with both asymptotic dependence and asymptotic independence. In two influential papers, Poon et al. (2003, 2004) put forward that asymptotic independence is observed in many pairs of stock market returns. This had important consequences in finance, mostly because inferences in a seminal paper (Longin and Solnik, 2001) had been based on the assumption of asymptotic dependence, and hence risk had perhaps been overestimated earlier. However, an important question is: “What if pairs of financial losses can move over time from asymptotic independence to asymptotic dependence, and the other way around?” Some markets are believed to be more integrated these days than in the past, so for such markets it is relevant to ask whether they could have entered an “asymptotic dependence regime.” An accurate answer to this question would, however, require models able to allow for smooth transitions from asymptotic independence to asymptotic dependence, and vice versa; but, as already mentioned in Section 9.2.3, there is at the moment a shortage of models for nonstationary extremal dependence structures. Wadsworth et al. (2016) present an interesting approach for modeling asymptotic (in)dependence.
An important reference here is Genton et al. (2015), but there is a wealth of problems to work on in this direction, so I stop my comment here.
Is there a way to reduce dimension in such a way that the interesting features of the data—in terms of tails of multivariate distributions—are preserved?6 I think it is fair to say that, apart from some remarkable exceptions, most models for multivariate extremes have been applied only to low-dimensional settings. I remember that at a seminal workshop on high-dimensional extremes, organized by Anthony Davison at the Ecole Polytechnique Fédérale de Lausanne (September 14–18, 2009), “high dimensional” actually meant “two dimensional” for most talks—and all speakers were top scientists in the field.
Principal component analysis (PCA) itself would seem inappropriate, since principal axes are constructed so as to find the directions that account for most variation, and for our axes of interest (whatever they are…), variation does not seem to be the most reasonable objective. A naive approach could be to use PCA for compositional data (Jolliffe, 2002, Section 13.3) and apply it to the pseudo-angles themselves. Such an approach could perhaps provide a simple way to disentangle dependence into components of practical interest.
Theory and methods are the backbone of our field; without regular variation, we would not have gone far anyway. But, beyond theory, should our community be investing even more than it already is in modeling and applications? As put simply by Box (1979), “all models are wrong, but some are useful.” However, while most of us agree that models only provide an approximation to reality, we seem to be very demanding about the way we develop theory for such—wrong yet useful—models. Some models entail ingenious approximations to reality and yet are very successful in practice. Should we venture more in this direction in the future? Applied work can also motivate new, and useful, theories. Should we venture more into collaborating with researchers from other fields, or into creating more conferences such as the ESSEC Conference on Extreme Events in Finance, where one has the opportunity to regard risk and extremes from a broader perspective, so as to think out of the box? Should the journal Extremes include an Applications and Case Studies section?
What has our community been supplying in terms of communication of risk and extremes? Silence, for the most part. There have certainly been some noteworthy initiatives, but perhaps mostly from people outside our field, such as David Spiegelhalter and David Hand. My own view is that it would be excellent if, in the near future, leading scientists in our field could be more involved in communicating risk and extremes to the general public, either by writing newspaper and magazine articles or by promoting the popularization of science. Our community is becoming more and more aware of this need, I think; I was happy to see Paul Embrechts recently showing his concern about this matter at EVA 2015 in Ann Arbor.
How can we accurately elicit prior information when modeling extreme events in finance, in cases where a conflict of interest may exist? Suppose that a regulator requires a bank to report an estimate. If prior information is gathered from a bank expert—and if the bank is better off by misreporting—then how can we trust the accuracy of the inferences? In such cases, I think the only Bayesian analysis a regulator should be willing to accept is an objective Bayes-based analysis; see Berger (2006) for a review of objective Bayes.