Customers sign up for telephone service, a banking account, ISP service, cable service, an insurance policy, or electricity service on a particular date and cancel on another date. In all of these cases, the beginning and end of the relationship is well defined.
Other businesses do not have such a continuous relationship. This is particularly true of transactional businesses, such as retailing, Web portals, and catalogers, where each customer’s purchases (or visits) are spread out over time—or
may be one-time only. The beginning of the relationship is clear—usually the first purchase or visit to a Web site. The end is more difficult but is sometimes created through business rules. For instance, a customer who has not made a purchase in the previous 12 months may be considered lapsed. Customer retention analysis can produce useful results based on these definitions. A similar area of application is determining the point in time after which a customer is no longer likely to return (there is an example of this later in the chapter).
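A business rule of this kind is easy to state in code. The following is a minimal sketch, assuming a 12-month lapse window as in the example above; the function names `months_between` and `is_lapsed` and the sample dates are illustrative, not part of any particular system.

```python
from datetime import date

def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months

def is_lapsed(last_purchase: date, as_of: date, lapse_months: int = 12) -> bool:
    """Business rule: no purchase in the previous `lapse_months` months."""
    return months_between(last_purchase, as_of) >= lapse_months

as_of = date(2004, 3, 8)  # illustrative analysis date
print(is_lapsed(date(2003, 1, 15), as_of))  # last purchase more than a year ago
print(is_lapsed(date(2003, 9, 30), as_of))  # last purchase about five months ago
```

Under this rule, the end of the relationship can be taken as the date of the last purchase (or that date plus the lapse window), which gives each lapsed customer a well-defined tenure.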
The technical side can be more challenging. Consider magazine subscriptions. Do customers start on the date when they sign up for the subscription? Do customers start when the magazine first arrives, which may be several weeks later? Or do they start when the promotional period is over and they start paying?
Although all three questions are interesting aspects of the customer relationship, the focus is usually on the economic aspects of the relationship. Costs and/or revenue begin when the account starts being used (that is, on the issue date of the magazine) and end when the account stops. For understanding customers, it is definitely interesting to have the original contact date and time in addition to the first issue date (are customers who sign up on weekdays different from customers who sign up on weekends?), but the contact date is not the beginning of the economic relationship. As for the end of the promotional period, this is really an initial condition or time-zero covariate on the customer relationship. When the customer signs up, the initial promotional period is known. Survival analysis can take advantage of such initial conditions for refining models.
What a Retention Curve Reveals
Once tenures can be calculated, they can be plotted on a retention curve, which shows the proportion of customers that are retained for a particular period of time. This is actually a cumulative histogram, because customers who have tenures of 3 months are included in the proportions for 1 month and 2 months.
Hence, a retention curve always starts at 100 percent.
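The cumulative nature of the curve is easy to see in a small calculation. The sketch below assumes that all customers started at the same time and that tenures are recorded in whole months; the tenure values and the name `retention_curve` are hypothetical, chosen only for illustration.

```python
from typing import List

def retention_curve(tenures_months: List[int], horizon: int) -> List[float]:
    """Proportion of customers whose tenure is at least t months, for t = 0..horizon."""
    n = len(tenures_months)
    return [sum(1 for x in tenures_months if x >= t) / n
            for t in range(horizon + 1)]

# Hypothetical tenures, in months, for ten customers who started together
tenures = [1, 3, 3, 7, 12, 12, 20, 24, 24, 24]
curve = retention_curve(tenures, horizon=24)

for t in (0, 3, 12, 24):
    print(f"retained for at least {t:2d} months: {curve[t]:.0%}")
```

Because a customer with a tenure of 3 months is counted at months 0, 1, 2, and 3, the proportions can only stay level or fall as tenure increases.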
For now, let’s assume that all customers start at the same time. Figure 12.1, for instance, compares the retention of two groups of customers who started at about the same point in time 10 years ago. The points on the curve show the proportion of customers who were retained for 1 year, for 2 years, and so on. Such a curve starts at 100 percent and gradually slopes downward. When a retention curve represents customers who all started at about the same time, as in this case, it is a close approximation to the survival curve.
Differences in retention among different groups are clearly visible in the chart. These differences can be quantified. The simplest measure is to look at retention at particular points in time. Only about a third of the regular customers make it to 5 years, for instance, and after 10 years just 24 percent are still around. Premium (high-end) customers do much better: over half make it to 5 years, and 42 percent have a customer lifetime of at least 10 years.
Hazard Functions and Survival Analysis in Marketing 387
[Figure 12.1: line chart of Percent Survived (0% to 100%) against Tenure (Months after Start, 0 to 120) for two groups of customers, High End and Regular.]
Figure 12.1 Retention curves show that high-end customers stay around longer.
Another way to compare the different groups is by asking how long it takes for half the customers to leave. This is the customer half-life (the statistical term is the median customer lifetime). The median is a useful measure because the few customers who have very long or very short lifetimes do not affect it. In general, medians are not sensitive to a few outliers.
Figure 12.2 illustrates how to find the customer half-life using a retention curve. This is the point where exactly 50 percent of the customers remain, which is where the 50 percent horizontal grid line intersects the retention curve. The customer half-life for the two groups shows a much starker difference than the 10-year survival: the premium customers have a median lifetime of close to 7 years, whereas the regular customers have a median of roughly 2 years.
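Reading the half-life off a chart amounts to finding where the curve crosses 50 percent. The following is a minimal sketch, assuming the curve is available as a list of (time, retention) points and that linear interpolation between neighboring points is acceptable; the function name `customer_half_life` and the yearly values are hypothetical and are not the values behind Figure 12.2.

```python
from typing import Optional, Sequence

def customer_half_life(times: Sequence[float],
                       retention: Sequence[float]) -> Optional[float]:
    """Time at which the retention curve first reaches 50 percent.

    Interpolates linearly between the two points that straddle 50 percent.
    Returns None when more than half the customers remain at the last point.
    """
    if retention[0] <= 0.5:
        return times[0]
    for i in range(1, len(times)):
        prev, curr = retention[i - 1], retention[i]
        if prev > 0.5 >= curr:
            fraction = (prev - 0.5) / (prev - curr)
            return times[i - 1] + fraction * (times[i] - times[i - 1])
    return None

# Hypothetical yearly retention points
years = [0, 1, 2, 3, 4, 5]
retained = [1.00, 0.80, 0.60, 0.40, 0.30, 0.25]
print(customer_half_life(years, retained))  # crossing lies between years 2 and 3
```

Returning `None` when the curve never reaches 50 percent reflects a real limitation: if more than half of the customers are still active at the end of the observation period, the median lifetime is not yet known.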
Finding the Average Tenure from a Retention Curve
The customer half-life is useful for comparisons and easy to calculate, so it is a valuable tool. It does not, however, answer an important question: “How much, on average, were customers worth during this period of time?” Answering this question requires having an average customer worth per time and an average retention for all the customers. The median cannot provide this information because the median only describes what happens to the one customer in the middle, the customer at exactly the 50 percent rank. A question about average customer worth requires an estimate of the average lifetime for all customers.
There is an easy way to find this average: the average customer lifetime during the period is the area under the retention curve. There is a clever way of visualizing this calculation, which Figure 12.3 walks through.
[Figure 12.2: the same chart of Percent Survived (0% to 100%) against Tenure (Months after Start, 0 to 120) for High End and Regular customers, with the 50 percent line marking each group's median lifetime.]
Figure 12.2 The median customer lifetime is where the retention curve crosses the 50 percent point.
First, imagine that the customers all lie down with their feet lined up on the left. Their heads represent their tenure, so there are customers of all different heights (or widths, because they are horizontal) for customers of all different tenures. For the sake of visualization, the longer tenured customers lie at the bottom holding up the shorter tenured ones. The line that connects their noses counts the number of customers who are retained for a particular period of time (remember the assumption that all customers started at about the same point in time). The area under this curve is the sum of all the customers’ tenures, since every customer lying horizontally is being counted.
Dividing the vertical axis by the total count produces a retention curve: instead of a count, the axis shows a percentage. The area under the curve is then the total tenure divided by the count of customers, which is the average customer tenure during the period of time covered by the chart.
TIP The area under the customer retention curve is the average customer lifetime for the period of time in the curve. For instance, for a retention curve that has 2 years of data, the area under the curve represents the two-year average tenure.
This simple observation explains how to obtain an estimate of the average customer lifetime. There is one caveat when some customers are still active. The average is really an average for the period of time under the retention curve. Consider the retention curves shown earlier in this chapter. These curves cover 10 years, so the area under each curve is an estimate of the average customer lifetime during the first 10 years of the relationship. For customers who are still active at 10 years, there is no way of knowing whether they will all leave at 10 years plus one day or whether they will all stick around for another century. For this reason, it is not possible to determine the real average until all customers have left.
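The equivalence between the area under the curve and the average tenure can be checked numerically. The sketch below assumes tenures in whole months and customers who all started together; the tenure values, the 12-month horizon, and the function names are hypothetical. Tenures longer than the horizon are counted only up to the horizon, which is the caveat just described.

```python
from typing import List

def average_tenure_from_curve(tenures: List[int], horizon: int) -> float:
    """Area under the retention curve over the first `horizon` months."""
    n = len(tenures)
    # Height of the curve during month t is the share of customers with tenure >= t
    heights = [sum(1 for x in tenures if x >= t) / n
               for t in range(1, horizon + 1)]
    # Each month is a bar of width 1, so the area is the sum of the heights
    return sum(heights)

def average_tenure_direct(tenures: List[int], horizon: int) -> float:
    """Average of the tenures, each capped at the horizon."""
    return sum(min(x, horizon) for x in tenures) / len(tenures)

# Hypothetical tenures in months; four customers outlast the 12-month horizon
tenures = [1, 3, 3, 7, 12, 12, 20, 24, 24, 24]
print(average_tenure_from_curve(tenures, horizon=12))
print(average_tenure_direct(tenures, horizon=12))
```

Both calculations give the same 12-month average even though four of the customers have tenures well beyond 12 months; how long those customers eventually stay has no effect on the result.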
[Figure 12.3: four-panel diagram. (1) A group of customers with different tenures are stacked on top of each other; each bar represents one customer (horizontal axis: time). (2) At each point in time, the edges count the number of customers active at that time (vertical axis: Number of Customers). (3) The sum of all the areas is the sum of all the customer tenures. (4) Making the vertical axis a proportion instead of a count produces a curve that looks the same; this is a retention curve (vertical axis: Proportion of Customers), and the area under it is the average customer tenure.]
Figure 12.3 Average customer tenure is calculated from the area under the retention curve.
This value, called truncated mean lifetime by statisticians, is very useful. As shown in Figure 12.4, the better customers have an average 10-year lifetime of 6.1 years; the other group has an average of 3.7 years. If, on average, a customer is worth, say, $100 per year, then the premium customers are worth $610 – $370 = $240 more than the regular customers during the 10 years after they start, or about $24 per year. This $24 might represent the return on a retention program designed specifically for the premium customers, or it might give an upper limit of how much to budget for such retention programs.
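The arithmetic in this comparison can be laid out step by step. The sketch below uses the figures quoted above (average 10-year lifetimes of 6.1 and 3.7 years, and an assumed worth of $100 per customer per year); the variable names are illustrative only.

```python
value_per_year = 100.0     # assumed average worth of a customer, dollars per year
premium_avg_years = 6.1    # truncated mean lifetime of premium customers (10-year window)
regular_avg_years = 3.7    # truncated mean lifetime of regular customers (10-year window)
window_years = 10

premium_worth = premium_avg_years * value_per_year   # about $610
regular_worth = regular_avg_years * value_per_year   # about $370
difference = premium_worth - regular_worth           # about $240 over 10 years
per_year = difference / window_years                 # about $24 per year

print(f"Premium: ${premium_worth:.0f}, regular: ${regular_worth:.0f}")
print(f"Difference: ${difference:.0f} over {window_years} years, ${per_year:.0f} per year")
```

The result depends directly on the assumed worth per year; a different figure scales the difference proportionally.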