3.4 Duration/Time

In contractual settings, the time customers spend with a company is critical to the maintenance and development of a business. Lewis [4] argued that customer acquisition promotions influence the duration of customer–company relationships. In the newspaper subscription industry, the author used survival analysis to model the relationship between discount depth and the length of time as a subscriber. The author adopted accelerated failure time models (Kalbfleisch and Prentice, 1980) estimated as shown in the following specification:

(3.22) $\ln(t_i) = \beta_0 + \beta_1 D_i + \beta_2 D_i^2 + \sigma \varepsilon_i$

where $\varepsilon_i$ is a random disturbance term, $D_i$ and $D_i^2$ are the discount and discount squared, and $\beta$ and $\sigma$ are parameters to be estimated. The author varied the restrictions on $\sigma$ and $\varepsilon_i$ and specified various forms of the hazard function. The author used the exponential model as a baseline model, which assumes that the error term $\varepsilon_i$ has a standard extreme-value distribution and that $\sigma$ is equal to 1. The author also used two generalized gamma models: one included only the discount as an independent variable, and the other also included the quadratic form of discount.

Kalbfleisch and Prentice (1980) gave a brief introduction to the accelerated failure time model, as follows. Suppose that $Y = \ln T$ is related to the covariate $z$ via a linear model $Y = z'\beta + W$, where $W$ is an error variable with density $f$. Exponentiation gives $T = \exp(z'\beta)\,T_0$, where $T_0 = \exp(W) > 0$ has hazard function $\lambda_0(t_0)$, say, that is independent of $\beta$. It follows that the hazard function for $T$ can be written in terms of this baseline hazard $\lambda_0(\cdot)$ according to

(3.23) $\lambda(t; z) = \lambda_0\left(t e^{-z'\beta}\right) e^{-z'\beta}$

The survival function is

(3.24) $S(t; z) = \exp\left[-\int_0^{t e^{-z'\beta}} \lambda_0(u)\,du\right]$

and the density function is the product of Equations 3.23 and 3.24. To estimate the accelerated failure time model, maximum likelihood estimation (MLE) is usually used. Franses and Paap (2001) introduced the estimation procedure of the accelerated failure time model for the Weibull distribution. We provide the reference in Appendix D.
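As a rough illustration of the relationship between a baseline hazard and the accelerated failure time hazard, survival, and density functions, the following Python sketch assumes a Weibull baseline with hazard shape * s**(shape - 1). The function names and the choice of baseline are ours, for illustration only; they are not taken from Kalbfleisch and Prentice (1980).

```python
import math

def baseline_hazard(s, shape):
    """Assumed Weibull baseline hazard: shape * s**(shape - 1)."""
    return shape * s ** (shape - 1)

def baseline_cum_hazard(s, shape):
    """Integral of the assumed baseline hazard from 0 to s, i.e. s**shape."""
    return s ** shape

def aft_hazard(t, linear_predictor, shape):
    """Hazard of T: baseline hazard at rescaled time, times the rescaling factor."""
    accel = math.exp(-linear_predictor)  # exp(-z'beta)
    return baseline_hazard(t * accel, shape) * accel

def aft_survival(t, linear_predictor, shape):
    """Survival of T: exp(-cumulative baseline hazard up to t * exp(-z'beta))."""
    accel = math.exp(-linear_predictor)
    return math.exp(-baseline_cum_hazard(t * accel, shape))

def aft_density(t, linear_predictor, shape):
    """Density of T as the product of the hazard and the survival function."""
    return aft_hazard(t, linear_predictor, shape) * aft_survival(t, linear_predictor, shape)

def aft_median(linear_predictor, shape):
    """Time at which survival equals 0.5 under the assumed Weibull baseline."""
    return math.exp(linear_predictor) * math.log(2.0) ** (1.0 / shape)

# Raising the linear predictor by 1 stretches the time scale by a factor of e.
ratio = aft_median(1.0, 1.5) / aft_median(0.0, 1.5)
print(round(ratio, 4))
```

The sketch makes the "accelerated" interpretation concrete: covariates do not change the shape of the baseline hazard, they only rescale time by exp(z'β).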

Schweidel et al. [12] argued that time until acquisition, which is defined as the time that elapses before a prospective customer acquires a particular service, influences the duration that customers stay with the company. Using data from telecommunication services, the authors used a customer-level bivariate timing model for the time until acquisition of customers who eventually acquired service in the observation period. The authors assumed a parametric distribution for the probability of acquiring service at time img:

(3.25) equation

where img is the survival function of the parametric distribution, img, and img is the length of the observation period. In the empirical studies, the authors considered three sets of possible baseline hazard specifications for the acquisition processes: the Weibull, log-logistic and expo-power distribution. The hazard function and survival function of these three baseline distributions are listed in Table 3.4. We provide an introduction to survival analysis in Appendix G.

Table 3.4 Hazard function and survival function of three baseline distributions.

Source: Schweidel, Fader, and Bradlow (2008).

Baseline distribution Hazard function Survival function
Weibull img img
Log-logistic img img
Expo-power img img

3.4.1 Empirical Example: Duration/Time

Customer acquisition is of key importance to many firms. However, many firms also want to look beyond the point of acquisition and answer questions such as, ‘How long will a newly acquired customer remain a customer?’ In this example we will try to uncover the drivers of new customer duration. We expect that customer duration will be a function of the customer's exchange characteristics and firmographic information. We also want to see whether the acquisition effort by the firm affects each new customer's duration with the firm. At the end of this example we should be able to:

1. Determine the drivers of new customer duration.
2. Predict the duration of each new customer.

The information we need for this model includes the following list of variables:

Dependent variable
  Duration         The time in days that the acquired prospect has been or was a customer, right-censored at 730 days

Independent variables
  Acq_Expense      Dollars spent on marketing efforts to try and acquire that prospect
  Acq_Expense_SQ   Square of dollars spent on marketing efforts to try and acquire that prospect
  Ret_Expense      Dollars spent on marketing efforts to try and retain that customer
  Ret_Expense_SQ   Square of dollars spent on marketing efforts to try and retain that customer
  Crossbuy         The number of categories the customer has purchased
  Frequency        The number of times the customer purchased during the observation window
  Frequency_SQ     The square of the number of times the customer purchased during the observation window
  Industry         1 if the prospect is in the B2B industry, 0 otherwise
  Revenue          Annual sales revenue of the prospect's firm (in millions of dollars)
  Employees        Number of employees in the prospect's firm

Censoring indicator
  Censor           1 if the customer was still a customer at the end of the observation window, 0 otherwise

In this case the sample data are from a B2B firm that actually observes when a customer churns, that is, a contractual setting. We see from the list of variables that we have one dependent variable (Duration) which is right-censored. Our observation window for the data is only two years (730 days) after the prospects were acquired. Thus, we only observe a customer leaving the firm if it happens before the end of the second year. As a result, the data for Duration is right-censored at 730 days – meaning that any customer whose Duration value is 730 in the data table has yet to leave the firm at the end of the observation window. We also have 10 independent variables which we hope will explain the variation in each new customer's duration with the firm.

These independent variables include the money the firm spent on acquisition (Acq_Expense and Acq_Expense_SQ), the money the firm spent on customer retention (Ret_Expense and Ret_Expense_SQ), customer exchange characteristics (Crossbuy, Frequency, and Frequency_SQ), and firmographic variables (Industry, Revenue, and Employees).

To determine the drivers of customer duration we use the accelerated failure time model similar to Equation 3.22 earlier in this chapter. In this case we have the following equation:

$\ln(\text{Duration}) = X\beta + \sigma\varepsilon$

where ln(Duration) is the natural logarithm of the time the customer was active with the firm, X is a matrix of the 10 independent variables, β is a vector of coefficients, σ is the scale parameter, and ε is the error term. We estimate the model by assuming that Duration follows a Weibull distribution (equivalently, that ε follows a standard extreme-value distribution), a common choice among accelerated failure time models. We estimate the model only with the 292 prospects that were acquired as customers. We get the following results from the estimation:

Variable          Estimate        p-value
Intercept         2.837           <0.0001
Acq_Expense       0.007           <0.0001
Acq_Expense_SQ    −0.00001        <0.0001
Ret_Expense       0.001           <0.0001
Ret_Expense_SQ    −0.00000004     0.017
Crossbuy          0.098           <0.0001
Frequency         0.111           <0.0001
Frequency_SQ      −0.001*         0.173
Industry          0.524           <0.0001
Revenue           0.012           <0.0001
Employees         0.0001          <0.0001
Scale             0.138
Shape             7.252
* Denotes not significant at p < 0.05.

We see that each of the variables in the model is significant at the 5% level or better, with the exception of Frequency_SQ. In addition, we see that for both Acq_Expense and Ret_Expense there is a positive, but diminishing, effect on Duration. This suggests that the more the firm spends, up to a threshold, on both acquisition and retention, the longer the customer is likely to stay with the firm. While we expect that spending on retention efforts is directly linked to the duration of a customer, we see here that spending on acquisition also helps determine the duration of a customer. We see for Crossbuy that the more categories the customer has purchased, the longer the expected Duration. We see a positive effect for Frequency with a negative, but not significant, squared term. This suggests that customers who do not purchase very often are more likely to churn than customers who purchase at a moderate to high frequency. We see that customers that are B2B, have higher Revenue, and have more Employees are more likely to have a longer Duration. Finally, we obtain both a Scale and a Shape parameter, which will help us with the prediction of Duration for each of the customers.
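To make the "up to a threshold" statement concrete, the spending level at which a positive linear term and a negative squared term balance out is −b1 / (2·b2). The short Python sketch below applies this to the rounded estimates reported above; the function name is ours, and because the published coefficients are rounded, the resulting thresholds should be read as rough orders of magnitude rather than precise figures.

```python
def turning_point(linear_coef, squared_coef):
    """Value of x at which linear_coef * x + squared_coef * x**2 peaks.

    Only meaningful for an inverted-U effect, i.e. a negative squared term.
    """
    if squared_coef >= 0:
        raise ValueError("a peak requires a negative squared coefficient")
    return -linear_coef / (2.0 * squared_coef)

# Rounded estimates from the results table (assumed exact for illustration).
acq_threshold = turning_point(0.007, -0.00001)      # roughly 350 dollars
ret_threshold = turning_point(0.001, -0.00000004)   # roughly 12,500 dollars

print(f"Acquisition spend threshold: about {acq_threshold:,.0f} dollars")
print(f"Retention spend threshold: about {ret_threshold:,.0f} dollars")
```

Beyond these approximate levels, the estimated model implies that additional spending is associated with a shorter, not longer, expected Duration.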

Next, we want to see how well we are predicting Duration for the customers. Since we do not actually observe Duration for the customers who have yet to churn, we cannot validate our results on those customers. We can predict the time until expected churn for the customers who have yet to churn, but we can only test how accurately our model predicts Duration for the customers who have already churned (unless we used a holdout sample where we actually knew when the customers churned beyond two years).

So, in order to test the predictive accuracy of the model on the 157 non-censored customers who have already churned, we need to have an equation to predict the value of Duration. We start by referring back to the original equation we estimated:

$\ln(\text{Duration}) = X\beta + \sigma\varepsilon$

We just need to recognize that σ is the scale parameter (0.138) and that ε is set to the 50th percentile (the median) of the standard extreme-value distribution implied by the Weibull model, that is, ln(−ln(1 − p)) = ln(−ln(1 − 0.5)) = −0.367. Then, to compute Duration, we just need to exponentiate the right-hand side. We get the following:

$\text{Duration} = \exp(X\beta + \sigma\varepsilon) = \exp\left(X\beta + 0.138 \times (-0.367)\right)$
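The prediction step can be sketched in a few lines of Python. The coefficients below are the rounded estimates from the results table (the second Ret_Expense row is assumed to be the squared term), and the example customer is entirely hypothetical, so the output only illustrates the mechanics and is not a result from the data set.

```python
import math

# Rounded estimates copied from the results table; the second "Ret_Expense"
# row is assumed to be Ret_Expense_SQ.
COEFFICIENTS = {
    "Intercept": 2.837,
    "Acq_Expense": 0.007,
    "Acq_Expense_SQ": -0.00001,
    "Ret_Expense": 0.001,
    "Ret_Expense_SQ": -0.00000004,
    "Crossbuy": 0.098,
    "Frequency": 0.111,
    "Frequency_SQ": -0.001,
    "Industry": 0.524,
    "Revenue": 0.012,
    "Employees": 0.0001,
}
SCALE = 0.138

def extreme_value_quantile(p):
    """p-th quantile of the standard extreme-value error: ln(-ln(1 - p))."""
    return math.log(-math.log(1.0 - p))

def linear_predictor(customer):
    """X*beta for one customer; `customer` maps variable names to values."""
    xb = COEFFICIENTS["Intercept"]
    for name, value in customer.items():
        xb += COEFFICIENTS[name] * value
    return xb

def predict_duration(customer, p=0.5):
    """Predicted p-th quantile of Duration in days (p = 0.5 gives the median)."""
    return math.exp(linear_predictor(customer) + SCALE * extreme_value_quantile(p))

# Hypothetical customer, for illustration only (not taken from the data).
example = {
    "Acq_Expense": 300, "Acq_Expense_SQ": 300 ** 2,
    "Ret_Expense": 500, "Ret_Expense_SQ": 500 ** 2,
    "Crossbuy": 3, "Frequency": 5, "Frequency_SQ": 5 ** 2,
    "Industry": 1, "Revenue": 30, "Employees": 500,
}
print(f"Predicted median Duration: {predict_duration(example):.0f} days")
```

Passing a different value of p gives other quantiles of the predicted Duration distribution, which is one way to express uncertainty around the median prediction.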

Now we need to compare the actual Duration values for the 157 customers who churned during the observation window to the predicted values of Duration. We find a mean absolute deviation (MAD) of 45.97 days and a mean absolute percentage error (MAPE) of 13.88%. We can compare this to the benchmark case where our prediction of Duration is the mean of the non-censored values, which is 333 days. For the benchmark we get a MAD of 170.77 days and a MAPE of 171.92%. We see that our model does substantially better than the benchmark model in predicting the Duration of customers in our database.
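For readers who want to reproduce this kind of comparison, the two accuracy measures can be computed as in the following Python sketch. The function names and the three-customer data set are hypothetical and serve only to show the arithmetic; they are not the values behind the figures reported above.

```python
def mean_absolute_deviation(actual, predicted):
    """Average of |actual - predicted|, in the units of the data (here, days)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / actual, expressed in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical durations in days for three churned customers.
actual = [100, 200, 400]
model_predictions = [110, 180, 400]

# Benchmark: predict the sample mean for every customer.
mean_duration = sum(actual) / len(actual)
benchmark_predictions = [mean_duration] * len(actual)

print(mean_absolute_deviation(actual, model_predictions))
print(mean_absolute_percentage_error(actual, model_predictions))
print(mean_absolute_deviation(actual, benchmark_predictions))
print(mean_absolute_percentage_error(actual, benchmark_predictions))
```

A MAPE above 100% is possible for a mean-based benchmark because customers with very short actual durations produce percentage errors that are many times their own duration.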

3.4.2 How Do You Implement It?

We implemented this model using the PROC LIFEREG procedure in SAS, where the dependent variable was right-censored. It is also possible to estimate this model and other accelerated failure time models using other programs such as Stata and R, among others.
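Whatever software is used, the estimation rests on a likelihood that treats churned and censored customers differently. The Python sketch below writes out the log-likelihood of a Weibull accelerated failure time model with right-censoring, under the assumption that ln(Duration) = Xβ + σε with ε following a standard extreme-value distribution. It illustrates the principle only and is not a description of how SAS or any other package implements its estimation routine; the function name and argument layout are ours.

```python
import math

def weibull_aft_loglik(durations, censored, linear_predictors, scale):
    """Log-likelihood of a Weibull AFT model with right-censored durations.

    durations         -- observed times (days), all > 0
    censored          -- 1 if the customer was still active at the end of the
                         window (right-censored), 0 if churn was observed
    linear_predictors -- X*beta for each customer
    scale             -- sigma, the scale parameter (> 0)
    """
    total = 0.0
    for t, is_censored, xb in zip(durations, censored, linear_predictors):
        w = (math.log(t) - xb) / scale  # standardized residual
        if is_censored:
            # Only know that Duration exceeds t: add the log survival probability.
            total += -math.exp(w)
        else:
            # Churn observed at t: add the log density of Duration at t.
            total += w - math.exp(w) - math.log(scale) - math.log(t)
    return total

# With scale = 1 and X*beta = 0 the model reduces to a unit exponential,
# so an observed churn at t = 2 contributes log(exp(-2)) = -2.
print(weibull_aft_loglik([2.0], [0], [0.0], 1.0))
```

Maximizing this function over β and σ, for example with a general-purpose numerical optimizer, yields the maximum likelihood estimates; censored customers contribute through the survival function rather than being dropped or treated as if they churned on day 730.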
