All three levels of market basket data are important. For instance, to understand orders, there are some basic measures:
■■
What is the average number of orders per customer?
■■
What is the average number of unique items per order?
■■
What is the average number of items per order?
■■
For a given product, what is the proportion of customers who have ever purchased the product?
470643 c09.qxd 3/8/04 11:15 AM Page 291
Market Basket Analysis and Association Rules 291
■■
For a given product, what is the average number of orders per customer that include the item?
■■
For a given product, what is the average quantity purchased in an order when the product is purchased?
These measures give broad insight into the business. In some cases, there are few repeat customers, so the average number of orders per customer is close to 1; this suggests a business opportunity to increase the number of sales per customer. Or the number of products per order may be close to 1, suggesting an opportunity for cross-selling while the order is being placed.
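The measures listed above can be sketched in a few lines of code. This is a minimal illustration, not a prescribed method: the record layout, the sample data, and the name `basic_measures` are assumptions made for the example. It also assumes that order IDs are unique across customers, that "items per order" means total quantity, and that the per-product averages are taken over customers and orders that include the product.

```python
from collections import defaultdict

# Hypothetical line-item records: (customer_id, order_id, product, quantity).
LINE_ITEMS = [
    ("c1", "o1", "tea", 2),
    ("c1", "o1", "mug", 1),
    ("c1", "o2", "tea", 1),
    ("c2", "o3", "tea", 3),
    ("c3", "o4", "mug", 1),
]


def basic_measures(line_items, product):
    """Compute the six order-level measures for a non-empty list of line items."""
    orders_by_customer = defaultdict(set)    # customer -> order ids
    products_by_order = defaultdict(set)     # order -> distinct products
    units_by_order = defaultdict(int)        # order -> total quantity
    product_buyers = set()                   # customers who ever bought `product`
    product_qty_by_order = defaultdict(int)  # order -> quantity of `product`

    for customer, order, item, qty in line_items:
        orders_by_customer[customer].add(order)
        products_by_order[order].add(item)
        units_by_order[order] += qty
        if item == product:
            product_buyers.add(customer)
            product_qty_by_order[order] += qty

    n_customers = len(orders_by_customer)
    n_orders = len(products_by_order)
    n_buyers = len(product_buyers)
    n_product_orders = len(product_qty_by_order)
    return {
        "orders_per_customer": n_orders / n_customers,
        "unique_items_per_order":
            sum(len(p) for p in products_by_order.values()) / n_orders,
        "items_per_order": sum(units_by_order.values()) / n_orders,
        "share_of_customers_buying": n_buyers / n_customers,
        # Assumption: averaged over customers who bought the product at least once.
        "product_orders_per_buyer":
            n_product_orders / n_buyers if n_buyers else 0.0,
        # Assumption: averaged over orders that contain the product.
        "product_qty_per_order":
            sum(product_qty_by_order.values()) / n_product_orders
            if n_product_orders else 0.0,
    }


measures = basic_measures(LINE_ITEMS, "tea")
```

With real data, values of `orders_per_customer` or `unique_items_per_order` near 1 would point to the repeat-purchase and cross-selling opportunities described above.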
It can be useful to compare these measures to each other. We have found that the number of orders is often a useful way of differentiating among customers; good customers clearly order more often than not-so-good customers. Figure 9.3 plots the breadth of the customer relationship (the number of unique items ever purchased) against the depth of the relationship (the number of orders) for customers who purchased more than one item. This data is from a small specialty retailer. The biggest bubble shows that many customers who purchase two products do so at the same time. There is also a surprisingly large bubble showing that a sizable number of customers purchase the same product in two orders. Better customers (at least those who returned multiple times) tend to purchase a greater diversity of goods. However, some of them are returning and buying the same thing they bought the first time. How can the retailer encourage customers to come back and buy more and different products? Market basket analysis cannot answer the question, but it can at least motivate asking it and perhaps provide hints that might help.
[Figure 9.3 plot: x-axis Num Orders (0 to 6); y-axis Num Distinct Products Across All Orders (0 to 10).]
Figure 9.3 This bubble plot shows the breadth of customer relationships by the depth of the relationship.
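The counts behind a bubble plot of this kind could be tabulated roughly as follows. This is a sketch under stated assumptions: the record layout, the sample data, and the function name `breadth_by_depth` are hypothetical, and "purchased more than one item" is read here as having more than one line-item record.

```python
from collections import Counter, defaultdict

# Hypothetical records: (customer_id, order_id, product).
PURCHASES = [
    ("c1", "o1", "tea"), ("c1", "o1", "mug"),  # two products in one order
    ("c2", "o2", "tea"), ("c2", "o3", "tea"),  # same product in two orders
    ("c3", "o4", "tea"), ("c3", "o5", "mug"),  # two products in two orders
    ("c4", "o6", "tea"),                       # single item: excluded
]


def breadth_by_depth(purchases):
    """Count customers by (number of orders, number of distinct products)."""
    orders = defaultdict(set)
    products = defaultdict(set)
    line_items = Counter()
    for customer, order, product in purchases:
        orders[customer].add(order)
        products[customer].add(product)
        line_items[customer] += 1

    bubbles = Counter()
    for customer in orders:
        # Assumption: keep only customers with more than one line item.
        if line_items[customer] > 1:
            depth = len(orders[customer])
            breadth = len(products[customer])
            bubbles[(depth, breadth)] += 1
    return bubbles


bubbles = breadth_by_depth(PURCHASES)
```

Each key of `bubbles` is a position on the chart and each value is a bubble size; the key `(2, 1)` corresponds to customers who bought the same product in two orders.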
292 Chapter 9
Order Characteristics
Customer purchases have additional interesting characteristics. For instance, the average order size varies by time and region—and it is useful to keep track of these to understand changes in the business environment. Such information is often available in reporting systems, because it is easily summarized.
Some information, though, may need to be gleaned from transaction-level data. Figure 9.4 breaks down transactions by the size of the order and the credit card used for payment—Visa, MasterCard, or American Express—for another retailer. The first thing to notice is that the larger the order, the larger the average purchase amount, regardless of the credit card being used. This is reassuring.
Also, the use of one credit card type, American Express, is consistently associated with larger orders—an interesting finding about these customers.
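A breakdown like the one in Figure 9.4 can be computed from order-level summaries. The sketch below is illustrative only: the tuple layout, the function name `average_amount`, and the amounts are invented for the example and are not taken from the figure.

```python
from collections import defaultdict

# Hypothetical order summaries: (card_type, number_of_items, order_amount).
ORDERS = [
    ("Visa", 1, 40.0),
    ("Visa", 1, 60.0),
    ("Visa", 2, 120.0),
    ("American Express", 1, 90.0),
    ("American Express", 2, 200.0),
]


def average_amount(orders):
    """Average order amount for each (card type, number of items) cell."""
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of amounts, order count]
    for card, n_items, amount in orders:
        cell = totals[(card, n_items)]
        cell[0] += amount
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


averages = average_amount(ORDERS)
```

Plotting one line per card type, with the number of items on the horizontal axis, would reproduce the general shape of such a chart.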
For Web purchases and mail-order transactions, additional information may also be gathered at the point of sale:
■■
Did the order use gift wrap?
■■
Is the order going to the same address as the billing address?
■■
Did the purchaser accept or decline a particular cross-sell offer?
Of course, gathering information at the point of sale and having it available for analysis are two different things. However, gift giving and responsiveness to cross-sell offers are two very useful things to know about customers. Finding patterns with this information requires collecting the information in the first place (at the call center or through the online interface) and then moving it to a data mining environment.
[Figure 9.4 chart: x-axis Number of Items Purchased (1 to 9); y-axis Average Order Amount ($0 to $1,500); one series each for American Express, MasterCard, and Visa.]
Figure 9.4 This chart shows the average amount spent by credit card type based on the number of items in the order for one particular retailer.
Item Popularity
What are the most popular items? This is a question that can usually be answered by looking at inventory curves, which can be generated without having to work with transaction-level data. However, knowing the sales of an individual item is only the beginning. There are related questions:
■■
What is the most common item found in a one-item order?
■■
What is the most common item found in a multi-item order?
■■
What is the most common item found among customers who are repeat purchasers?
■■
How has the popularity of particular items changed over time?
■■
How does the popularity of an item vary regionally?
The first three questions are particularly interesting because they may suggest ideas for growing customer relationships. Association rules can provide answers to these questions, particularly when used with virtual items to represent the size of the order or the number of orders a customer has made.
The last two questions bring up the dimensions of time and geography, which are very important for applications of market basket analysis. Different products have different affinities in different regions—something that retailers are very familiar with. It is also possible to use association rules to start to understand these areas, by introducing virtual items for region and seasonality.
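One way to picture virtual items is as extra entries added to each basket before any rules are generated. The following is a minimal sketch, assuming a `name=value` naming scheme for virtual items, product names that never contain `=`, and a month-to-season mapping chosen only for illustration; none of these conventions come from the text.

```python
from collections import Counter

# Assumed mapping from month number to season, for illustration only.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def with_virtual_items(products, region, month):
    """Return the basket extended with order-size, region, and season items."""
    size = "single-item" if len(set(products)) == 1 else "multi-item"
    virtual = {
        f"size={size}",
        f"region={region}",
        f"season={SEASON_BY_MONTH[month]}",
    }
    return frozenset(products) | virtual


def most_common_product(baskets, virtual_item):
    """Most frequent real product among baskets that carry `virtual_item`."""
    counts = Counter()
    for basket in baskets:
        if virtual_item in basket:
            # Real products are the entries without "=" (assumed convention).
            counts.update(item for item in basket if "=" not in item)
    return counts.most_common(1)


# Hypothetical orders: (products, region, month of sale).
RAW_ORDERS = [
    (["tea"], "east", 1),
    (["tea"], "west", 7),
    (["mug"], "east", 1),
    (["tea", "mug"], "east", 12),
]
BASKETS = [with_virtual_items(*order) for order in RAW_ORDERS]
top_single = most_common_product(BASKETS, "size=single-item")
```

Because the virtual items sit in the basket alongside real products, an association rule algorithm can treat "single-item order" or "winter" exactly like any other item.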
TIP Time and geography are two of the most important attributes of market basket data, because they often point to the exact marketing conditions at the time of the sale.
Tracking Marketing Interventions
As discussed in Chapter 5, looking at individual products over time can provide a good understanding of what is happening with the product. Including marketing interventions along with the product sales over time, as in Figure 9.5, makes it possible to see the effect of the interventions. The chart shows a sales curve for a particular product. Prior to the intervention, sales are hovering at 50 units per week. After the intervention, they peak at about seven or eight times that amount, before gently sliding down over the next six or seven weeks. Using such charts, it can be possible to measure the response to the marketing effort.
[Figure 9.5 chart: weekly sales curve for one product; y-axis 0 to 450; x-axis weekly dates from Mar 01 to Aug 02; the intervention is annotated "Mail Drop".]
Figure 9.5 Showing marketing interventions and product sales on the same chart makes it possible to see effects of marketing efforts.
Such analysis does not require looking at individual market baskets; daily or weekly summaries of product sales are sufficient. However, it does require knowing when marketing interventions take place, and sometimes getting such a calendar is the biggest challenge. One question that such a chart can help answer is what effect the intervention had. A challenge in answering it is determining whether the additional sales are incremental or are made by customers who would have purchased the product anyway at some later time.
Market basket data can start to answer this question. In addition to looking at the volume of sales after an intervention, we can also look at the number of baskets containing the item and the number of customers buying it. If the number of customers is not increasing, there is evidence that existing customers are simply stocking up on the item at a lower cost.
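The comparison described here can be sketched as a before-and-after summary of units, baskets, and distinct customers. The record layout, the dates, the 14-day window, and the function names are assumptions chosen for the example; a real analysis would also need to account for seasonality and other effects.

```python
from datetime import date, timedelta

# Hypothetical line items for one product: (sale_date, customer_id, order_id, qty).
SALES = [
    (date(2024, 4, 20), "c1", "o1", 1),
    (date(2024, 4, 27), "c2", "o2", 1),
    (date(2024, 5, 4), "c1", "o3", 4),
    (date(2024, 5, 11), "c2", "o4", 4),
]


def summarize(sales, start, end):
    """Units, baskets, and distinct customers with start <= sale_date < end."""
    rows = [row for row in sales if start <= row[0] < end]
    return {
        "units": sum(qty for _, _, _, qty in rows),
        "baskets": len({order for _, _, order, _ in rows}),
        "customers": len({customer for _, customer, _, _ in rows}),
    }


def before_and_after(sales, intervention, window_days):
    """Summaries for equal-length windows on either side of the intervention."""
    window = timedelta(days=window_days)
    before = summarize(sales, intervention - window, intervention)
    after = summarize(sales, intervention, intervention + window)
    return before, after


before, after = before_and_after(SALES, date(2024, 5, 1), 14)
```

In this made-up data, units rise fourfold while the number of customers stays at two, which is the stocking-up pattern described above.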
A related question is whether discounting results in additional sales of other products. Association rules can help answer this question by finding combinations of products that include those being promoted during the period of the promotion. Similarly, we might want to know if the average size of orders increases or decreases after an intervention. These are examples of questions where more detailed transaction-level data is important.
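Both questions can be explored with simple counts before turning to full association rules. The sketch below is a simplification under stated assumptions: each order is tagged with a hypothetical period label ("before" or "during" the promotion), order size is measured as the number of distinct products, and the function names are invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical orders: (period, set of products); "tea" is the promoted item.
PROMO_ORDERS = [
    ("before", {"tea"}),
    ("before", {"tea", "mug"}),
    ("during", {"tea", "mug", "honey"}),
    ("during", {"tea", "honey"}),
    ("during", {"mug"}),
]


def average_order_size(orders):
    """Average number of distinct products per order, for each period."""
    sizes = defaultdict(list)
    for period, products in orders:
        sizes[period].append(len(products))
    return {period: sum(values) / len(values) for period, values in sizes.items()}


def companions(orders, promoted, period="during"):
    """Count other products bought in the same orders as the promoted item."""
    counts = Counter()
    for order_period, products in orders:
        if order_period == period and promoted in products:
            counts.update(products - {promoted})
    return counts


sizes = average_order_size(PROMO_ORDERS)
partners = companions(PROMO_ORDERS, "tea")
```

Comparing `companions` for the "before" and "during" periods gives a first indication of whether the promotion pulled other products along with it.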
Clustering Products by Usage
Perhaps one of the most interesting questions is what groups of products often appear together. Such groups of products are very useful for making recommendations to customers; customers who have purchased some of the products may be interested in the rest of them (Chapter 8 talks about product recommendations in more detail). At the individual product level, association rules provide some answers in this area. In particular, this data mining technique determines which product or products in a purchase suggest the purchase of other particular products at the same time.
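As a rough illustration of the idea, the sketch below scores every ordered pair of products that appear together, using support and confidence, two measures commonly used with association rules. The definitions used here are the standard ones and are stated in the comments; this section does not define them, and the sample baskets and the name `pair_rules` are hypothetical. Real implementations also handle larger item sets and prune by minimum support.

```python
from collections import Counter
from itertools import permutations

# Hypothetical baskets, each a set of products bought together.
BASKETS = [
    {"tea", "mug"},
    {"tea", "mug", "honey"},
    {"tea"},
    {"honey"},
]


def pair_rules(baskets, min_support=0.0):
    """Score rules "lhs -> rhs" for every ordered pair of co-occurring products.

    support    = fraction of baskets containing both lhs and rhs
    confidence = fraction of baskets containing lhs that also contain rhs
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)
        item_counts.update(items)
        # Each ordered pair is counted once per basket.
        pair_counts.update(permutations(sorted(items), 2))

    rules = {}
    for (lhs, rhs), both in pair_counts.items():
        support = both / n
        if support >= min_support:
            rules[(lhs, rhs)] = {
                "support": support,
                "confidence": both / item_counts[lhs],
            }
    return rules


rules = pair_rules(BASKETS)
```

In this toy data the rule from "mug" to "tea" has higher confidence than the reverse rule, because every basket with a mug also contains tea but not every basket with tea contains a mug.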