When your order, with its item number, size, and color, goes into the cataloger’s order entry system, it spawns still more records in the billing system and the inventory control system. Within hours, your order is also generating transaction records in a computer system at UPS or FedEx where it is scanned about a dozen times between the warehouse and your home, allowing you to check the shipper’s Web site to track its progress.
These transaction records are not generated with data mining in mind; they are created to meet the operational needs of the company. Yet all contain valuable information about customers and all can be mined successfully. Phone companies have used call detail records to discover residential phone numbers whose calling patterns resemble those of a business in order to market special services to people operating businesses from their homes. Catalog companies have used order histories to decide which customers should be included in which future mailings—and, in the case of Victoria’s secret, which models produce the most sales. Federal Express used the change in its customers’
shipping patterns during a strike at UPS in order to calculate their share of their customers’ package delivery business. Supermarkets have used point-of-sale data in order to decide what coupons to print for which customers. Web retailers have used past purchases in order to determine what to display when customers return to the site.
These transaction systems are the customer touch points where information about customer behavior first enters the enterprise. As such, they are the eyes and ears (and perhaps the nose, tongue, and fingers) of the enterprise.
The Role of Data Warehousing
The customer-focused enterprise regards every record of an interaction with a client or prospect—each call to customer support, each point-of-sale transaction, each catalog order, each visit to a company Web site—as a learning opportunity. But learning requires more than simply gathering data. In fact,
470643 c01.qxd 3/8/04 11:08 AM Page 5
Why and What Is Data Mining?
5
many companies gather hundreds of gigabytes or terabytes of data from and about their customers without learning anything! Data is gathered because it is needed for some operational purpose, such as inventory control or billing.
And, once it has served that purpose, it languishes on disk or tape or is discarded.
For learning to take place, data from many sources—billing records, scanner data, registration forms, applications, call records, coupon redemptions, surveys—must first be gathered together and organized in a consistent and useful way. This is called data warehousing. Data warehousing allows the enterprise to remember what it has noticed about its customers.
T I P Customer patterns become evident over time. Data warehouses need to support accurate historical data so that data mining can pick up these critical trends.
One of the most important aspects of the data warehouse is the capability to track customer behavior over time. Many of the patterns of interest for customer relationship management only become apparent over time. Is usage trending up or down? How frequently does the customer return? Which channels does the customer prefer? Which promotions does the customer respond to?
A number of years ago, a large catalog retailer discovered the importance of retaining historical customer behavior data when they first started keeping more than a year’s worth of history on their catalog mailings and the responses they generated from customers. What they discovered was a segment of customers that only ordered from the catalog at Christmas time. With knowledge of that segment, they had choices as to what to do. They could try to come up with a way to stimulate interest in placing orders the rest of the year. They could improve their overall response rate by not mailing to this segment the rest of the year. Without some further experimentation, it is not clear what the right answer is, but without historical data, they would never have known to ask the question.
A good data warehouse provides access to the information gleaned from transactional data in a format that is much friendlier than the way it is stored in the operational systems where the data originated. Ideally, data in the warehouse has been gathered from many sources, cleaned, merged, tied to particular customers, and summarized in various useful ways. Reality often falls short of this ideal, but the corporate data warehouse is still the most important source of data for analytic customer relationship management.
The Role of Data Mining
The data warehouse provides the enterprise with a memory. But, memory is of little use without intelligence. Intelligence allows us to comb through our memories, noticing patterns, devising rules, coming up with new ideas, figuring out
470643 c01.qxd 3/8/04 11:08 AM Page 6
6
Chapter 1
the right questions, and making predictions about the future. This book describes tools and techniques that add intelligence to the data warehouse.
These techniques help make it possible to exploit the vast mountains of data generated by interactions with customers and prospects in order to get to know them better.
Who is likely to remain a loyal customer and who is likely to jump ship?
What products should be marketed to which prospects? What determines whether a person will respond to a certain offer? Which telemarketing script is best for this call? Where should the next branch be located? What is the next product or service this customer will want? Answers to questions like these lie buried in corporate data. It takes powerful data mining tools to get at them.
The central idea of data mining for customer relationship management is that data from the past contains information that will be useful in the future. It works because customer behaviors captured in corporate data are not random, but reflect the differing needs, preferences, propensities, and treatments of customers. The goal of data mining is to find patterns in historical data that shed light on those needs, preferences, and propensities. The task is made difficult by the fact that the patterns are not always strong, and the signals sent by customers are noisy and confusing. Separating signal from noise—recognizing the fundamental patterns beneath seemingly random variations—is an important role of data mining.
This book covers all the most important data mining techniques and the strengths and weaknesses of each in the context of customer relationship management.
The Role of the Customer Relationship
Management Strategy
To be effective, data mining must occur within a context that allows an organization to change its behavior as a result of what it learns. It is no use knowing that wireless telephone customers who are on the wrong rate plan are likely to cancel their subscriptions if there is no one empowered to propose that they switch to a more appropriate plan as suggested in the sidebar. Data mining should be embedded in a corporate customer relationship strategy that spells out the actions to be taken as a result of what is learned through data mining.
When low-value customers are identified, how will they be treated? Are there programs in place to stimulate their usage to increase their value? Or does it make more sense to lower the cost of serving them? If some channels consistently bring in more profitable customers, how can resources be shifted to those channels?
Data mining is a tool. As with any tool, it is not sufficient to understand how it works; it is necessary to understand how it will be used.
470643 c01.qxd 3/8/04 11:08 AM Page 7
Why and What Is Data Mining?
7
DATA MINING SUGGESTS, BUSINESSES DECIDE
This sidebar explores the example from the main text in slightly more detail. An analysis of attrition at a wireless telephone service provider often reveals that people whose calling patterns do not match their rate plan are more likely to cancel their subscriptions. People who use more than the number of minutes included in their plan are charged for the extra minutes—often at a high rate.
People who do not use their full allotment of minutes are paying for minutes they do not use and are likely to be attracted to a competitor’s offer of a cheaper plan.
This result suggests doing something proactive to move customers to the right rate plan. But this is not a simple decision. As long as they don’t quit, customers on the wrong rate plan are more profitable if left alone. Further analysis may be needed. Perhaps there is a subset of these customers who are not price sensitive and can be safely left alone. Perhaps any intervention will simply hand customers an opportunity to cancel. Perhaps a small “rightsizing”
test can help resolve these issues. Data mining can help make more informed decisions. It can suggest tests to make. Ultimately, though, the business needs to make the decision.
What Is Data Mining?
Data mining, as we use the term, is the exploration and analysis of large quantities of data in order to discover meaningful patterns and rules. For the purposes of this book, we assume that the goal of data mining is to allow a corporation to improve its marketing, sales, and customer support operations through a better understanding of its customers. Keep in mind, however, that the data mining techniques and tools described here are equally applicable in fields ranging from law enforcement to radio astronomy, medicine, and industrial process control.