Fortunately, the ideas are simpler than the mathematics behind his approach.
Instead of focusing on hazards, he introduces the idea of partial likelihood.
Assuming that only one customer stops at a given time t, the partial likelihood at t is the likelihood that exactly that particular customer stopped.
470643 c12.qxd 3/8/04 11:17 AM Page 411
Hazard Functions and Survival Analysis in Marketing 411
The calculation for the partial likelihood divides whatever function or value represents the hazard for the specific customer that stopped by the sum of all the hazards for all the customers who might have stopped at that time. If all customers had the same hazard rates, then this ratio would be constant (one divided by the population at that point in time). However, the hazards are not constant and hopefully are some function of the initial conditions.
Cox made an assumption that the initial conditions have a constant effect on all hazards, regardless of the time of the hazard. The partial likelihood is a ratio, and the proportionality assumption means that the hazards, whatever they are, appear in both the numerator and denominator multiplied by a complicated expression based on the initial conditions. What is left is a complicated mathematical formula containing the initial conditions. The hazards themselves have disappeared from the partial likelihood; they simply cancel each other out.
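In symbols, the cancellation can be sketched as follows. The notation is not from the text and is assumed here for illustration: h_0(t) is the unknown hazard shared by everyone at tenure t, x_i holds the initial conditions for customer i, beta holds their effects, and R(t) is the set of customers who might have stopped at t. The exponential form is the one conventionally used for Cox regression.

```latex
% Proportionality: each customer's hazard is the shared hazard
% scaled by a factor that depends only on the initial conditions.
h_i(t) = h_0(t)\, e^{\beta \cdot x_i}

% Partial likelihood that customer i is the one who stopped at t.
% The shared hazard h_0(t) appears in every term and cancels.
L_i(\beta)
  = \frac{h_0(t)\, e^{\beta \cdot x_i}}{\sum_{j \in R(t)} h_0(t)\, e^{\beta \cdot x_j}}
  = \frac{e^{\beta \cdot x_i}}{\sum_{j \in R(t)} e^{\beta \cdot x_j}}
```

When beta is zero, every customer has the same hazard and the ratio reduces to one divided by the number of customers in R(t), which matches the constant-hazard case described earlier.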
The next step is to combine the partial likelihoods of all customers who stop to get the overall likelihood of those particular customers stopping. The product of all these partial likelihoods is an expression that gives the likelihood of seeing exactly that set of customers stop when they did.
Conveniently, this likelihood is also expressed only in terms of the initial conditions and not in terms of the hazards, which may not be known.
Fortunately, there is an area of statistics called maximum likelihood estimation which, given a complicated expression such as this one, finds the parameter values that make the observed result most likely. These parameter values conveniently represent the effect of the initial conditions on the hazards. As an added bonus, the technique works with both continuous and categorical values, whereas the stratification approach works only with categorical values.
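The following is a minimal sketch of that estimation step, not the implementation used by any particular statistics package. It assumes a single initial condition, no two customers stopping at the same tenure, and a made-up set of eight customers. The names CUSTOMERS, partial_log_likelihood, and fit_beta are hypothetical, and a simple ternary search stands in for the more sophisticated optimizers that real software uses.

```python
import math

# Hypothetical toy data: (tenure_in_months, stopped, x).
# stopped is 1 if the customer stopped and 0 if still active (censored).
# x is one initial condition, for example 1 = acquired by telemarketing.
CUSTOMERS = [
    (1, 1, 1), (2, 1, 0), (3, 1, 1), (4, 0, 0),
    (5, 1, 1), (6, 1, 0), (7, 0, 1), (8, 1, 0),
]


def partial_log_likelihood(beta, customers):
    """Sum of the log partial likelihoods over all customers who stopped."""
    total = 0.0
    for tenure, stopped, x in customers:
        if not stopped:
            continue  # censored customers only appear in the denominators
        # Everyone whose tenure is at least this long might have stopped here.
        risk_set = [xj for tj, _, xj in customers if tj >= tenure]
        denominator = sum(math.exp(beta * xj) for xj in risk_set)
        total += beta * x - math.log(denominator)
    return total


def fit_beta(customers, lo=-5.0, hi=5.0, iterations=200):
    """Find the beta that maximizes the partial log likelihood.

    Ternary search is used only because it is short; it relies on the
    log likelihood having a single peak between lo and hi.
    """
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if partial_log_likelihood(m1, customers) < partial_log_likelihood(m2, customers):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


beta_hat = fit_beta(CUSTOMERS)
hazard_ratio = math.exp(beta_hat)  # multiplier on the hazard when x = 1
print(f"estimated effect: {beta_hat:.3f}, hazard ratio: {hazard_ratio:.2f}")
```

For this toy data the estimated effect is positive, meaning customers with x = 1 tend to stop sooner. Note that the shared hazard never appears anywhere in the calculation.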
Limitations of Proportional Hazards
Cox proportional hazards regression is very powerful and very clever.
However, it has its limitations. In order for all this to work, Cox had to make many assumptions. He designed his approach around continuous time hazards and also made the assumption that only one customer stops at any given time. With some tweaking, implementations of proportional hazards regression usually work for discrete time hazards and handle multiple stops at the same time.
WARNING Cox proportional hazards regression ranks and quantifies the effects of initial conditions on the overall hazard function. However, the results are highly dependent on the often dubious assumption that the initial conditions have a constant effect on the hazards over time. Use it carefully.
412 Chapter 12
The biggest assumption in the proportional hazards model is the assumption of proportionality itself: the effect of the initial conditions on hazards is assumed to have no time component. In practice, this is simply not true. It is rarely, if ever, true that initial conditions have such perfect proportionality, even in the scientific world. In the world of marketing, this is even less likely.
Marketing is not a controlled experiment. Things are constantly changing; new programs, pricing, and competition are always arising.
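One way to see what proportionality demands is to compare two customers, a and b. The notation below is assumed for illustration and does not come from the text: h_0(t) is the shared hazard at tenure t, x_a and x_b are the customers' initial conditions, and beta is their effect.

```latex
% Under proportionality the ratio of the two hazards has no t in it:
\frac{h_a(t)}{h_b(t)}
  = \frac{h_0(t)\, e^{\beta \cdot x_a}}{h_0(t)\, e^{\beta \cdot x_b}}
  = e^{\beta \cdot (x_a - x_b)}

% If the effect changed with tenure, beta would become beta(t),
% and the ratio would change with tenure as well:
\frac{h_a(t)}{h_b(t)} = e^{\beta(t) \cdot (x_a - x_b)}
```

The first ratio says that if one customer is twice as likely to stop as another in the first month, the same must hold in every later month. The second form is what a new program, a price change, or a competitor's offer tends to produce in practice.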
The bad news is that there is no simple algorithm that explains initial conditions, taking into account different effects over time. The good news is that it often does not make a difference. Even with the assumption of proportionality, Cox regression does a good job of determining which covariates have a big impact on the hazards. In other words, it does a good job of explaining what initial conditions are correlated with customers leaving.
Cox’s approach was designed only for time-zero covariates, as statisticians call initial values. The approach has been extended to handle events that occur during a customer’s lifetime, such as whether they upgrade their product or make a complaint. In the language of statistics, these are time-dependent covariates, meaning that the additional factors can occur at any point during the customer’s tenure, not only at the beginning of the relationship. Such factors might be a customer’s response to a retention campaign or a complaint. Since Cox’s original work, he and other statisticians have extended this technique to include these types of factors.
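A common way to prepare data for such extensions is to split each customer's tenure into intervals during which every factor is constant. The sketch below illustrates that layout for a single factor, a complaint. The function name split_at_event, the row layout, and the example customers are all hypothetical and are not taken from the text.

```python
def split_at_event(customer_id, tenure, stopped, complaint_day=None):
    """Return rows of (id, start_day, end_day, complained, stopped).

    Each row covers a stretch of tenure during which the complained flag
    is constant. Only the final row can carry stopped = 1, because the
    customer did not stop during any earlier interval.
    """
    # No complaint during the observed tenure: one interval, flag off.
    if complaint_day is None or complaint_day >= tenure:
        return [(customer_id, 0, tenure, 0, stopped)]
    # Complaint on or before day zero: one interval, flag on throughout.
    if complaint_day <= 0:
        return [(customer_id, 0, tenure, 1, stopped)]
    # Otherwise split the tenure at the day of the complaint.
    return [
        (customer_id, 0, complaint_day, 0, 0),
        (customer_id, complaint_day, tenure, 1, stopped),
    ]


# Customer A complained on day 120 and stopped on day 300.
rows_a = split_at_event("A", 300, 1, complaint_day=120)
# Customer B never complained and is still active after 200 days.
rows_b = split_at_event("B", 200, 0)
print(rows_a)
print(rows_b)
```

Splitting the record this way keeps the analysis from using knowledge of the complaint before it happened: for the first 120 days, customer A is treated like any other customer who has not complained.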
Survival Analysis in Practice
Survival analysis has proven to be very valuable for understanding customers and quantifying marketing efforts in terms of customer retention. It provides a way of estimating how long it will be until something occurs. This section gives some particular examples of survival analysis.
Handling Different Types of Attrition
Businesses that deal with customers have to deal with customers leaving for a variety of reasons. Earlier, this chapter described hazard probabilities and explained how hazards illustrate aspects of the business that affect the customer life cycle. In particular, peaks in hazards coincided with business processes that forced out customers who were not paying their bills.
Since these customers are treated differently, it is tempting to remove them entirely from the hazard calculation. This is the wrong approach. The problem is that which customers to remove is known only after the customers have been
forced to stop. As mentioned earlier, it is not a good idea to use such knowledge, gained at the end of the customer relationship, to filter customers for analysis.
The right approach is to break this into two problems. What are the hazards for voluntary attrition? What are the hazards for forced attrition? Each of these uses all the customers, censoring the customers who leave due to other factors.
When calculating the hazards for voluntary attrition, whenever a customer is forced to leave, the customer is included in the analysis until he or she leaves—
at that point, the customer is censored. This makes sense. Up to the point when the customer was forced to leave, the customer did not leave voluntarily.
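The two calculations can be sketched as follows. The data and the name hazards_for are hypothetical. The sketch also assumes one particular convention, that a customer counts in the population at risk at every tenure up to and including the last one observed; other conventions for handling the final period exist.

```python
# Hypothetical records: (tenure_in_months, outcome), where outcome is
# "voluntary", "forced", or "active" (still a customer today).
CUSTOMERS = [
    (1, "voluntary"), (1, "forced"), (2, "active"), (2, "voluntary"),
    (3, "forced"), (3, "active"), (3, "voluntary"), (4, "active"),
]


def hazards_for(stop_type, customers):
    """Hazard probability at each tenure for one type of stop.

    Every customer is used. Customers who leave for any other reason, or
    who are still active, count in the population at risk through their
    final tenure and are censored after that.
    """
    max_tenure = max(tenure for tenure, _ in customers)
    hazards = {}
    for t in range(1, max_tenure + 1):
        at_risk = sum(1 for tenure, _ in customers if tenure >= t)
        stops = sum(
            1 for tenure, outcome in customers
            if tenure == t and outcome == stop_type
        )
        hazards[t] = stops / at_risk if at_risk else 0.0
    return hazards


voluntary = hazards_for("voluntary", CUSTOMERS)
forced = hazards_for("forced", CUSTOMERS)
print("voluntary:", voluntary)
print("forced:   ", forced)
```

Both calls see all eight customers. The customer forced out in month 3 still counts toward the population at risk for voluntary attrition in months 1 through 3, because up to that point he or she had not left voluntarily.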
This approach can be extended for other purposes. Once upon a time, the authors were trying to understand different groups of customers at a newspaper, in particular, how survival by acquisition channel was or was not changing over time. Unfortunately, during one of the time periods, there was a boycott of the newspaper, raising the overall stop levels during that period.
Not surprisingly, the hazards went up and survival decreased during this time period.
Is there a way to take into account these particular stops? The answer is
“yes,” because the company did a pretty good job of recording the reasons why customers stopped. The customers who boycotted the paper were simply censored on the day they stopped—as they say in the medical world, these customers were lost to follow-up. By censoring, it was possible to get an accurate estimate of the overall hazards without the boycott.
When Will a Customer Come Back?
So far, the discussion of survival analysis has focused on the end of the customer relationship. Survival analysis can be used for many things besides predicting the probability of bad things happening. For instance, survival analysis can be used to estimate when customers will return after having stopped.
Figure 12.12 shows a survival curve and hazards for reactivation of customers after they deactivate their mobile telephone service. In this case, the hazard is the probability that a customer returns a given number of days after the deactivation.
There are several interesting features in these curves. First, the initial reactivation rate is very high. In the first week, more than a third of customers reactivate. Business rules explain this phenomenon. Many deactivations are due to customers not paying their bills. Many of these customers are just holding out until the last minute—they actually intend to keep their phones; they just don’t like paying the bill. However, once the phone stops working, they quickly pay up.
[Chart: horizontal axis is Days after Deactivation, from 0 to 360. Survival (Remain Deactivated) is plotted on a 0% to 100% scale; Hazard Probability ("Risk" of Reactivating) is plotted on a 0% to 10% scale.]
Figure 12.12 Survival curve (upper curve) and hazards (lower curve) for reactivation of mobile telephone customers.
After 90 days, the hazards are practically zero—customers do not reactivate.
Once again, the business processes provide guidance. Telephone numbers are reserved for 90 days after customers leave. Normally, when customers reactivate, they want to keep the same telephone number. After 90 days, the number may have been reassigned, and the customer would have to get a new telephone number.
This discussion has glossed over the question of how new (reactivated) customers were associated with the expired accounts. In this case, the analysis used the telephone numbers in conjunction with an account ID. This pretty much guaranteed that the match was accurate, since reactivated customers retained their telephone numbers and billing information. This is very conservative but works for finding reactivations. It does not work for finding other types of winback, such as customers who are willing to cycle through telephone numbers in order to get introductory discounts.