Appendix G
Survival Analysis – An Introduction
(Sources: Greene [1], Cameron and Trivedi [2], and Le [3])
Suppose that the random variable $T$ has a continuous probability distribution $f(t)$, where $t$ is a realization of $T$. The cumulative probability is
$$F(t) = \int_0^t f(s)\,ds = \Pr(T \le t). \tag{G.1}$$
We will usually be more interested in the probability that the spell is of length at least $t$, which is given by the survival function
$$S(t) = 1 - F(t) = \Pr(T \ge t). \tag{G.2}$$
Consider the question: ‘Given that the spell has lasted until time $t$, what is the probability that it will end in the next short interval of time, say $\Delta t$?’ This is
$$l(t, \Delta t) = \Pr(t \le T \le t + \Delta t \mid T \ge t). \tag{G.3}$$
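A useful function for characterizing this aspect of the distribution is the hazard rate,
$$\lambda(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T \le t + \Delta t \mid T \ge t)}{\Delta t} = \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t \, S(t)}. \tag{G.4}$$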
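The limit that defines the hazard rate can be checked numerically. The sketch below is only an illustration: it assumes a Weibull distribution with unit scale and shape `K = 1.5` (values chosen here, not taken from the text) and compares the conditional probability of ending in a short interval, divided by the interval length, with the ratio of density to survival function. The function names are hypothetical.

```python
import math

# Assumed illustration: Weibull distribution with shape K and unit scale.
# These parameter values are not from the text.
K = 1.5


def survival(t, k=K):
    """S(t) = exp(-t**k): probability that the spell lasts at least t."""
    return math.exp(-(t ** k))


def density(t, k=K):
    """f(t) = k * t**(k - 1) * exp(-t**k)."""
    return k * t ** (k - 1) * math.exp(-(t ** k))


def hazard_exact(t, k=K):
    """Hazard computed as the ratio f(t) / S(t)."""
    return density(t, k) / survival(t, k)


def hazard_finite(t, dt, k=K):
    """Prob(t <= T <= t + dt | T >= t) / dt for a small but finite dt."""
    cond_prob = (survival(t, k) - survival(t + dt, k)) / survival(t, k)
    return cond_prob / dt


if __name__ == "__main__":
    t = 2.0
    for dt in (0.1, 0.01, 0.001):
        # The finite-interval value should approach the exact hazard as dt shrinks.
        print(dt, hazard_finite(t, dt), hazard_exact(t))
```

As the interval shrinks, the finite-interval value should move toward the exact ratio, which is what the limit expresses.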
The hazard rate is the rate at which spells are completed after duration $t$, given that they last at least until $t$. The hazard function can also be expressed in terms of the survival function, the density, or the distribution function:
$$\lambda(t) = \frac{f(t)}{S(t)} = \frac{f(t)}{1 - F(t)} \tag{G.5}$$
and
$$\lambda(t) = -\frac{d \ln S(t)}{dt}. \tag{G.6}$$
The hazard function specifies the distribution of $T$. In particular, integrating $\lambda(t)$ and using $S(0) = 1$ shows that
$$S(t) = \exp\left(-\int_0^t \lambda(u)\,du\right). \tag{G.7}$$
A final related function is the cumulative hazard function or integrated hazard function
$$\Lambda(t) = \int_0^t \lambda(s)\,ds = -\ln S(t). \tag{G.8}$$
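The link between the hazard, the cumulative hazard, and the survival function can be illustrated with a short numerical sketch. It assumes, purely for illustration, a Weibull hazard with unit scale and shape 1.5, for which the cumulative hazard is known in closed form as $t^{1.5}$; the function names and the trapezoidal integration are choices made here, not part of the source.

```python
import math


def hazard(t, k=1.5):
    """Assumed Weibull hazard with unit scale: k * t**(k - 1), for k >= 1."""
    return k * t ** (k - 1)


def cumulative_hazard(t, k=1.5, steps=10_000):
    """Trapezoidal approximation of the integral of the hazard from 0 to t."""
    h = t / steps
    total = 0.5 * (hazard(0.0, k) + hazard(t, k))
    for i in range(1, steps):
        total += hazard(i * h, k)
    return total * h


def survival_from_hazard(t, k=1.5):
    """Survival function recovered as exp(-cumulative hazard)."""
    return math.exp(-cumulative_hazard(t, k))


if __name__ == "__main__":
    t = 2.0
    # Compare the numerical results with the closed forms t**k and exp(-t**k).
    print(cumulative_hazard(t), t ** 1.5)
    print(survival_from_hazard(t), math.exp(-(t ** 1.5)))
```

The numerically integrated hazard should agree closely with the closed form, and exponentiating its negative should recover the survival function.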
Estimation of the survival function can be done by maximum likelihood, the procedure provided in Appendix. Here, we provide an example of the estimation of the exponential distribution from Le [3].
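Suppose that we can assume a parametric model for survival times, for example, an exponential model with density $f(t; \theta) = \theta e^{-\theta t}$, with $t \ge 0$ and $\theta > 0$. Given a random sample of survival times
$$\{t_1, t_2, \ldots, t_n\} \tag{G.9}$$
and a density function, denoted $f(t; \theta)$, the likelihood function of $\theta$ is
$$L(\theta) = \prod_{i=1}^{n} f(t_i; \theta) = \theta^{n} \exp\left(-\theta \sum_{i=1}^{n} t_i\right). \tag{G.10}$$
This leads to
$$\ln L(\theta) = n \ln \theta - \theta \sum_{i=1}^{n} t_i. \tag{G.11}$$
The first-order derivative gives
$$\frac{d \ln L(\theta)}{d\theta} = \frac{n}{\theta} - \sum_{i=1}^{n} t_i = 0 \quad\Rightarrow\quad \hat{\theta} = \frac{n}{\sum_{i=1}^{n} t_i}. \tag{G.12}$$
From the second-order derivative, we get
$$\frac{d^{2} \ln L(\theta)}{d\theta^{2}} = -\frac{n}{\theta^{2}}, \quad\text{so that}\quad \operatorname{Var}(\hat{\theta}) \approx \frac{\hat{\theta}^{2}}{n}, \tag{G.13}$$
and the standard error (SE) of $\hat{\theta}$ is given by
$$\operatorname{SE}(\hat{\theta}) = \frac{\hat{\theta}}{\sqrt{n}}. \tag{G.14}$$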
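The maximum likelihood steps for the exponential model can be sketched in a few lines of code. This is a minimal illustration that assumes an uncensored random sample and writes the rate parameter as `theta` (a symbol chosen here); the sample values are hypothetical and the function names are not from the source.

```python
import math


def log_likelihood(theta, times):
    """Exponential log-likelihood: n * ln(theta) - theta * sum(times)."""
    return len(times) * math.log(theta) - theta * sum(times)


def exponential_mle(times):
    """Return (theta_hat, se) for an uncensored exponential sample.

    theta_hat = n / sum(times) solves the first-order condition;
    se = theta_hat / sqrt(n) follows from the second derivative -n / theta**2.
    """
    n = len(times)
    total = sum(times)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive survival time")
    theta_hat = n / total
    se = theta_hat / math.sqrt(n)
    return theta_hat, se


if __name__ == "__main__":
    # Hypothetical survival times, for illustration only.
    sample = [2.0, 4.0, 1.0, 3.0, 5.0, 3.0]
    theta_hat, se = exponential_mle(sample)
    print("theta_hat:", theta_hat)
    print("SE:", se)
    print("log-likelihood at theta_hat:", log_likelihood(theta_hat, sample))
```

The estimate is the reciprocal of the sample mean, so for this hypothetical sample (mean 3) it equals 1/3, and the log-likelihood is lower at any other value of `theta`.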