Chapter 8

Competing Risks Models

8.1 Introduction

Classical competing risks models will be discussed, as well as their modern interpretations. The basic problem is that we want to consider more than one type of event, but where exactly one will occur. For instance, consider different causes of death.

Figure 8.1

Figure showing competing risks: causes of death.

Competing risks: causes of death.

The first problem is: How can the intensities

( (α1(t),α2(t),α3(t)),t>0

be nonparametrically estimated? It turns out that this is quite simple. The trick is to take one cause at at time and estimate its intensity as if the rest of the causes (events) are censorings. The real problem starts when these intensities are turned into probabilities.

8.2 Some Mathematics

First we need some strict definitions, so that the correct questions are asked, and the correct answers are obtained.

The cause-specific cumulative hazard functions are

Γk(t)=0tαk(s)ds,t>0,k=1,2,3.

The total mortality is

λ(t)=k=13αk(t)t>0

Λ(t)=k=13Γk(t),t>0

and total survival is

S(t)=exp{ Λ(t) }

So far, this is not controversial. But asking for “cause-specific survivor functions” is.

8.3 Estimation

The quantities Γk, k = 1, 2, 3, Λ, and S can be estimated in the usual way. Γ1, the cumulative hazard function for cause 1 is estimated by regarding all other causes (2 and 3) as censorings. The total survivor function S is estimated by the method of Kaplan–Meier, regarding all causes as the same cause (just death).

Is it meaningful to estimate (calculate) Sk(t) = exp {-Γk(t)}, k = 1, 2, 3? The answer is “No.” The reason is that it is difficult to define what these probabilities mean in the presence of other causes of death. For instance, what would happen if one cause was eradicated?

8.4 Meaningful Probabilities

It is meaningful to estimate (and calculate)

Pk(t)=0tS(s)αk(s)ds,t>0,k=1,2,3,

the probability to die from cause k before time t. Note that

S(t)+P1(t)+P2(t)+P3(t)=1 for all  t>0.

Now, estimation is straightforward with the following estimators:

P^k(t)=i:titS^(ti)di(k)ni,k=1,2,3,

and S^(t), t > 0 is the ordinary Kaplan–Meier estimator for total mortality.

See the function comp in Section 8.6 for how to write code for this estimation.

8.5 Regression

It is also possible to include covariates in the estimating procedure.

Pk(t)=1exp{ Γk(t)exp(Xβ(k)) },t>0,k=1,2,3.

These equations are estimated separately. In R, this is done with the package cmprsk (Fine & Gray 1999, Gray 2011).

Example 29 Mortality and Emigration

People born in Skellefteå in northern Sweden around the year 1800 are followed over time from 15 years of age to outmigration or death. Right censoring occurs for those who are still present on January 1, 1870. At exit the cause is noted; see Figure 8.2.

Figure 8.2

Figure showing death and emigration, skellefteå, 1800–1870.

Death and emigration, Skellefteå, 1800–1870.

The data at hand looks like this:

> load(“zz.RData”)
> head(zz)
   Pnr  Mpnr  Fpnr birthdate enter exit event
1 801000758 775000541 771000493 1801.023 15 39.47241 2
2 801000759 758000309 753000242 1801.028 15 50.00000 0
3 801000760 770000458 765000452 1801.031 15 50.00000 0
4 801000763 777000494 775000404 1801.039 15 26.10868 2
5 801000768 780000572 774000453 1801.061 15 16.43292 2
6 801000770 774000442 776000421 1801.072 15 48.59237 1
 parity ab
1  3 leg
2  4 leg
3 10 leg
4  3 leg
5  2 leg
6  2 leg
> summary(zz[, -(1:3)])
  birthdate   enter  exit   event
 Min.  :1801  Min.  :15  Min.  :15.00  Min.  :0.0000
 1st Qu.:1835  1st Qu.:15  1st Qu.:20.47  1st Qu.:0.0000
 Median :1856  Median :15  Median :28.34  Median :0.0000
 Mean  :1852  Mean  :15  Mean  :31.82  Mean  :0.7138
 3rd Qu.:1872  3rd Qu.:15  3rd Qu.:44.82  3rd Qu.:2.0000
 Max.  :1886  Max.  :15  Max.  :50.00  Max.  :4.0000
 parity   ab
Min. : 1.000  leg :16423
1st Qu. : 2.000  illeg: 850
Median : 3.000
Mean : 3.916
3rd Qu. : 6.000
Max. :19.000

The variable event above is treated as a continuous variable, which is not very useful, so we make a table instead.

> table(zz$event)
 0 1 2 3 4
10551 2102 4097 59  464

The variable event is coded as shown in Table 8.1.

Table 8.1

Competing risks in Skellefteå

Code

Meaning

Frequency

0

Censored

10543

1

Dead

2 102

3

Moved to other parish in Sweden

4 097

4

Moved to other Scandinavian country

59

5

Moved outside the Scandinavian countries

464

Without covariates we get the picture shown in Figure 8.3.

> source(“comp.R”)
> comp(zz$enter, zz$exit, zz$event, start.age = 15)

Figure 8.3

Figure showing cause-specific exit probabilities; migration from skellefteå to various other places.

Cause-specific exit probabilities; migration from Skellefteå to various other places.

The code for the function comp is displayed at the end of this chapter. It is not necessary to understand before reading on. It may be useful as a template for analyzing competing risks. And with covariates and cmprsk; the syntax to get it done is a little bit nonstandard. Here, moving to other parish in Sweden is analysed (failcode = 2).

> library(cmprsk)
> xx <- model.matrix(˜ -1 + parity, zz)
> fit <- crr(zz$exit, zz$event, xx, failcode = 3)
> fit
convergence: TRUE
coefficients:
 parity
0.03927
standard errors:
[1] 0.04489
two-sided p-values:
parity
 0.38

Note that this is not Cox regression (but close!). The corresponding Cox regression is

> coxreg(Surv(enter, exit, event == 3) ˜ parity, data = zz)
Call:
coxreg(formula = Surv(enter, exit, event == 3) ˜ parity, data = zz)
Covariate   Mean  Coef Rel.Risk  S.E. Wald p
parity   3.890 0.045 1.046 0.048 0.355

Events    59
Total time at risk  290569
Max. log. likelihood -548.83
LR test statistic   0.83
Degreesoffreedom  1
Overall p-value  0.363362

8.6 R Code for Competing Risks

The code that produced Figure 8.3 is shown here.

> comp
function (enter, exit, event, start.age = 0)
{
 require(eha)
 n <- max(event)
 rs.tot <- risksets(Surv(enter, exit, event > 0.5))
 haz.tot <- rs.tot$n.events/rs.tot$size
 n.times <- length(haz.tot) + 1
 S <- numeric(n.times)
 S[1] <- 1
 for (i in 2:n.times) S[i] <- S[i - 1] * (1 - haz.tot[i - 1])
 haz <- matrix(0, nrow = n, ncol = length(haz.tot))
 P <- matrix(0, nrow = n, ncol = length(haz.tot) + 1)
 for (row in 1:n) {
   rs <- risksets(Surv(enter, exit, event == row))
   haz.row <- rs$n.events/rs$size
   tmp <- 0
   cols <- which(rs.tot$risktimes %in% rs$risktimes)
   haz[row, cols] <- haz.row
   P[row, 2:NCOL(P)] <- cumsum(S[1:(n.times - 1)] * haz[row,
  ])
}
 plot(c(start.age, rs.tot$risktimes), S, ylim = c(0, 1),
  xlim = c(start.age,
  max(rs.tot$risktimes)), xlab = “Age”, type = “s”,
  ylab = “Probability”)
 for (i in 1:n) lines(c(start.age, rs.tot$risktimes), P[i,
  ], lty = i + 1, type = “s”)
 abline(h = 0)
 legend(35, 0.95, lty = 1:(n + 1), legend = c(“Survival”,
   “Dead”, “Other parish”, “Scandinavia”,
   “Outside Scandinavia”))
 invisible(list(P = P, S = S))
}

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset