Berry M.J.A. – Data Mining Techniques For Marketing, Sales & Customer Relationship Management – Page 94 – Library. Read online. Free books read online. Read books without registering

except for Waltham and Watertown which are in Cluster 1B. Swapping Brookline into West 1 and Watertown and Waltham into City would make it possible for both editorial zones to be pure in the sense that all the towns in each zone would share the same cluster assignment. The new West 1 would be all Cluster 2, and the new City would be all Cluster 1B. As can be seen in the map in Figure 11.12, the new zones are still geographically contiguous.

Having editorial zones composed of similar towns makes it easier for the Globe to provide sharper editorial focus in its localized content, which should lead to higher circulation and better advertising sales.

Table 11.1 Towns in the City and West 1 Editorial Zones TOWN EDITORIAL

ZONE

CLUSTER

ASSIGNMENT

Brookline

City

Boston

City

Cambridge

City

Somerville

City

Needham

West 1

Newton

West 1

Wellesley

West 1

Waltham

West 1

Weston

West 1

Watertown

West 1

470643 c11.qxd 3/8/04 11:17 AM Page 381

Automatic Cluster Detection 381

Lessons Learned

Automatic cluster detection is an undirected data mining technique that can be used to learn about the structure of complex databases. By breaking complex datasets into simpler clusters, automatic clustering can be used to improve the performance of more directed techniques. By choosing different distance measures, automatic clustering can be applied to almost any kind of data. It is as easy to find clusters in collections of news stories or insurance claims as in astronomical or financial data.

Clustering algorithms rely on a similarity metric of some kind to indicate whether two records are close or distant. Often, a geometric interpretation of distance is used, but there are other possibilities, some of which are more appropriate when the records to be clustered contain non-numeric data.

One of the most popular algorithms for automatic cluster detection is K-means. The K-means algorithm is an iterative approach to finding K clusters based on distance. The chapter also introduced several other clustering algorithms. Gaussian mixture models, are a variation on the K-means idea that allows for overlapping clusters. Divisive clustering builds a tree of clusters by successively dividing an initial large cluster. Agglomerative clustering starts with many small clusters and gradually combines them until there is only one cluster left. Divisive and agglomerative approaches allow the data miner to use external criteria to decide which level of the resulting cluster tree is most useful for a particular application.

This chapter introduced some technical measures for cluster fitness, but the most important measure for clustering is how useful the clusters turn out to be for furthering some business goal.

470643 c11.qxd 3/8/04 11:17 AM Page 382

TEAMFLY

Team-Fly®

470643 c12.qxd 3/8/04 11:17 AM Page 383

C H A P T E R

Knowing When to Worry:

Hazard Functions and Survival

Analysis in Marketing

Hazards. Survival. These very terms conjure up scary images, whether a shimmering-blue, ball-eating golf hazard or something a bit more frightful from a Stephen King novel, a hatchet movie, or some reality television show.

Perhaps such dire associations explain why these techniques are not frequently associated with marketing.

If so, this is a shame. Survival analysis, which is also called time-to-event analysis, is nothing to worry about. Exactly the opposite: survival analysis is very valuable for understanding customers. Although the roots and terminology come from medical research and failure analysis in manufacturing, the concepts are tailor made for marketing. Survival tells us when to start worrying about customers doing something important, such as ending their relationship. It tells us which factors are most correlated with the event. Hazards and survival curves also provide snapshots of customers and their life cycles, answering questions such as: “How much should we worry that this customer is going to leave in the near future?” or “This customer has not made a purchase recently; is it time to start worrying that the customer will not return?”

The survival approach is centered on the most important facet of customer behavior: tenure. How long customers have been around provides a wealth of information, especially when tied to particular business problems. How long customers will remain customers in the future is a mystery, but a mystery that past customer behavior can help illuminate. Almost every business recognizes the value of customer loyalty. As we see later in this chapter, a guiding principle 383

470643 c12.qxd 3/8/04 11:17 AM Page 384

384 Chapter 12

of loyalty—that the longer customers stay around, the less likely they are to stop at any particular point in time—is really a statement about hazards.

The world of marketing is a bit different from the world of medical research.

For one thing, the consequences of our actions are much less dire: a patient may die from poor treatment, whereas the consequences in marketing are merely measured in dollars and cents. Another important difference is the volume of data. The largest medical studies have a few tens of thousands of participants, and many draw conclusions from a just a few hundred. When trying to determine mean time between failure (MTBF) or mean time to failure (MTTF)—manufacturing lingo for how long to wait until an expensive piece of machinery breaks down—conclusions are often based on no more than a few dozen failures.

In the world of customers, tens of thousands is the lower limit, since customer databases often contain data on millions of customers and former customers. Much of the statistical background of survival analysis is focused on extracting every last bit of information out of a few hundred data points. In data mining applications, the volumes of data are so large that statistical concerns about confidence and accuracy are replaced by concerns about managing large volumes of data.

The importance of survival analysis is that it provides a way of understanding time-to-event characteristics, such as:

■■

When a customer is likely to leave

■■

The next time a customer is likely to migrate to a new customer segment

■■

The next time a customer is likely to broaden or narrow the customer relationship

■■

The factors in the customer relationship that increase or decrease likely tenure

■■

The quantitative effect of various factors on customer tenure

These insights into customers feed directly into the marketing process. They make it possible to understand how long different groups of customers are likely to be around—and hence how profitable these segments are likely to be.

They make it possible to forecast numbers of customers, taking into account both new acquisition and the decline of the current base. Survival analysis also makes it possible to determine which factors, both those at the beginning of customers’ relationships as well as later experiences, have the biggest effect on customers’ staying around the longest. And, the analysis can be applied to things other then the end of the customer tenure, making it possible to determine when another event—such as a customer returning to a Web site—is no longer likely to occur.

A good place to start with survival is with visualizing customer retention, which is a rough approximation of survival. After this discussion, we move on to hazards, the building blocks of survival. These are in turn combined into

470643 c12.qxd 3/8/04 11:17 AM Page 385

Hazard Functions and Survival Analysis in Marketing 385

survival curves, which are similar to retention curves but more useful. The chapter ends with a discussion of Cox Proportional Hazard Regression and other applications of survival analysis. Along the way, the chapter provides particular applications of survival in the business context. As with all statistical methods, there is a depth to survival that goes far beyond this introductory chapter, which is consciously trying to avoid the complex mathematics underlying these techniques.

Customer Retention

Customer retention is a concept familiar to most businesses that are concerned about their customers, so it is a good place to start. Retention is actually a close approximation to survival, especially when considering a group of customers who all start at about the same time. Retention provides a familiar framework to introduce some key concepts of survival analysis such as customer half-life and average truncated customer tenure.

Calculating Retention

How long do customers stay around? This seemingly simple question becomes more complicated when applied to the real world. Understanding customer retention requires two pieces of information:

■■

When each customer started

■■

When each customer stopped

The difference between these two values is the customer tenure, a good measurement of customer retention.

Any reasonable database that purports to be about customers should have this data readily accessible. Of course, marketing databases are rarely simple.

There are two challenges with these concepts. The first challenge is deciding on what is a start and stop, a decision that often depends on the type of business and available data. The second challenge is technical: finding these start and stop dates in available data may be less obvious than it first appears.

For subscription and account-based businesses, start and stop dates are well understood. Customers start magazine subscriptions at a particular point in time and end them when they no longer want to pay for the magazine.