470643 c18.qxd 3/8/04 11:31 AM Page 614
614 Chapter 18
plan allows. Since the extra minutes are charged at a high rate, these customers end up paying higher bills than they would on a more expensive rate plan with more included minutes. Moving these customers to a higher-rate plan would save them some money, while also increasing the amount of revenue from the fixed portion of their monthly bill.
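The arithmetic behind this observation can be sketched in a few lines of Python. The plan names, fees, allowances, and overage rate below are hypothetical figures chosen only to illustrate the comparison; they are not taken from the case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatePlan:
    """A simplified wireless rate plan (all figures are hypothetical)."""
    name: str
    monthly_fee: float      # fixed portion of the monthly bill
    included_minutes: int   # minutes covered by the fixed fee
    overage_rate: float     # price per minute beyond the allowance

    def monthly_bill(self, minutes_used: int) -> float:
        """Fixed fee plus any minutes used beyond the allowance."""
        extra_minutes = max(0, minutes_used - self.included_minutes)
        return self.monthly_fee + extra_minutes * self.overage_rate


# Hypothetical plans: the second has a higher fixed fee but more minutes.
basic = RatePlan("basic", monthly_fee=30.0, included_minutes=200, overage_rate=0.50)
premium = RatePlan("premium", monthly_fee=50.0, included_minutes=500, overage_rate=0.50)

# A customer who regularly uses more minutes than the basic plan allows.
minutes_used = 300
bill_on_basic = basic.monthly_bill(minutes_used)
bill_on_premium = premium.monthly_bill(minutes_used)

customer_savings = bill_on_basic - bill_on_premium
extra_fixed_revenue = premium.monthly_fee - basic.monthly_fee

print(f"Bill on {basic.name}: {bill_on_basic:.2f}")
print(f"Bill on {premium.name}: {bill_on_premium:.2f}")
print(f"Customer saves {customer_savings:.2f}; fixed revenue rises by {extra_fixed_revenue:.2f}")
```

With these made-up numbers, the customer's total bill falls while the fixed portion of the bill rises, which is the situation the text describes. The lost overage revenue is the cost of the move to the company.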
The Proof of the Pudding
Comcast was able to make a direct cost/benefit analysis of the combined data mining and telemarketing action plan. Armed with this data, Comcast was able to make an informed decision to invest in future data mining efforts. Of course, the story does not really end there; it never does.
The company was faced with a whole new set of questions based on the data that came back from the initial study. New hypotheses were formed and tested. The response data from the telemarketing effort became fodder for a new round of knowledge discovery. New product ideas and service plans were tried out. Each round of data mining started from a higher base because the company knew its customers better. That is the virtuous cycle of data mining.
Lessons Learned
In a business context, the successful introduction of data mining requires using data mining techniques to address a real business challenge. For companies that are just getting started with analytical customer relationship management, integrating data mining can be a daunting task. A proof-of-concept project is a good way to get started. The proof of concept should create a solid business case for further integration of data mining into the company’s marketing, sales, and customer-support operations. This means that the project should be in an area where it is easy to link improved understanding gained through data mining with improved profitability.
The most successful proof-of-concept projects start with a well-defined business problem, and use data related to that problem to create a plan of action. The action is then carried out in a controlled manner and the results are carefully analyzed to evaluate the effectiveness of the action taken. In other words, the proof of concept should involve one full trip around the virtuous cycle of data mining. If this initial project is successful, it will be the first of many. The primary lesson from this chapter is also an important lesson of the book as a whole: data mining techniques become useful only when applied to meaningful problems. Data mining is a technical activity that requires technical expertise, but its success is measured by its effect on the business.
470643 bindex.qxd 3/8/04 11:08 AM Page 615
Index
A
absolute values, distance function, 275
accuracy
  classification and prediction, 79
  estimation, 79–81
acquisition
  acquisition-time data, 108–110
  customer relationships, 461–464
actions
  actionable data, 516
  actionable results, 22
  actionable rules, association rules, 296
  control group response versus market research response, 38
  taking control of, 30
activation function, neural networks, 222
acuity of testing, statistical analysis, 147–148
ad hoc questions
  behavior-based variables, 585
  business opportunities, identifying, 27
  hypothesis testing, 50–51
additive facts, OLAP, 501
addresses, geographical resources, 555–556
adjusted error rates, CART algorithm, 185
advertising. See also marketing campaigns
  communication channels, prospecting, 89
  prospects, 90–94
  word-of-mouth, 283
affinity grouping
  association rules, 11
  business goals, formulating, 605
  cross-selling opportunities, 11
  data transformation, 57
  undirected data mining, 57
affordability, server platforms, 13
agglomerative clustering, automatic cluster detection, 368–370
aggregation, confusion and, 48
aggression, behavior-based variables, 18
AI (artificial intelligence), 15
algorithms, recursive, 173
alphas, decision trees, 188
American Express
  as information broker, 16
  orders, market basket analysis, 292
analysis
  differential response, 107–108
  link analysis
    acyclic graphs, 331
    authorities, 333–334
    candidates, 333
    case study, 343–346
    classification, 9
    communities of interest, graphs, 346
    cyclic graphs, 330–331
    data, as graphs, 340
    directed graphs, 330
    discussed, 321
    edges, graphs, 322
    fax machines, 337–341
    graph-coloring algorithm, 340–341
    Hamiltonian path, graphs, 328
    hubs, 332–334
    Kleinberg algorithm, 332–333
    nodes, graphs, 322
    planar graphs, 323
    root sets, 333
    search programs, 331
    stemming, 333
    traveling salesman problem, graphs, 327–329
    vertices, graphs, 322
    weighted graphs, 322, 324
  market basket
    differentiation, 289
    discussed, 287
    geographic attributes, 293
    item popularity, 293
    item sets, 289
    market basket data, 51, 289–291
    marketing interventions, tracking, 293–294
    order characteristics, 292
    products, clustering by usage, 294–295
    purchases, 289
    support, 301
    telecommunications customers, 288
    time attributes, 293
  sensitivity, 247–248
  sequential, 318–319
  statistical
    acuity of testing, marketing campaign approaches, 147–148
    business data versus scientific data, 159
    censored data, 161
    Central Limit Theorem, 129–130
    chi-square tests, 149–153
    confidence intervals, marketing campaign approaches, 146
    continuous variables, 137–138
    correlation ranges, 139
    cross-tabulations, 136
    density function, 133
    as disciplinary technique, 123
    discrete values, 127–131
    experimentation, 160–161
    field values, 128
    histograms and, 127
    mean values, 137
    median values, 137
    mode values, 137
    multiple comparisons, 148–149
    normal distribution, 130–132
    null hypothesis and, 125–126
    probabilities, 133–135
    proportion, standard error of, marketing campaign approaches, 139–141
    p-values, 126
    q-values, 126
    range values, 137
    regression ranges, 139
    sample sizes, marketing campaign approaches, 145
    sample variation, 129
    standard deviation, 132, 138
    standardized values, 129–133
    sum of values, 137–138
    time series analysis, 128–129
    truncated data, 162
    variance, 138
    z-values, 131, 138
  survival
    attrition, handling different types of, 412–413
    customer relationships, 413–415
    estimation tasks, 10
    forecasting, 415–416
  time series
    neural networks, 244–247
    non-time series data, 246
    SQL data, 572–573
    statistics, 128–129
  of variance, 124
analysts, responsibilities of, 492–493
analytic efforts, wasted time, 27
AND value, neural networks, 222
angles, between vectors, 361–362
anonymous versus identified transactions, association rules, 308
application programming interface (API), 535
architecture, data mining, 528–532
artificial intelligence (AI), 15
assessing models
  classifiers and predictors, 79
  descriptive models, 78
  directed models, 78–79
  estimators, 79–81
association rules
  actionable rules, 296
  affinity grouping, 11
  anonymous versus identified transactions, 308
  data quality, 308
  dissociation rules, 317
  effectiveness of, 299–301
  inexplicable rules, 297–298
  point-of-sale data, 288
  practical limits, overcoming, 311–313
  prediction, 70
  probabilities, calculating, 309
  products, hierarchical categories, 305
  sequential analysis, 318–319
  for store comparisons, 315–316
  trivial rules, 297
  virtual items, 307
assumptions, validation, 67
attrition
  discussed, 17
  forced, 118
  future, 49
  proof-of-concept projects, 599
  survival analysis, 412–413
audio, binary data, 557
authorities, link analysis, 333–334
automated systems
  neural networks, 213
  transaction processing systems, 3–4
automatic cluster detection
  agglomerative clustering, 368–370
  case study, 374–378
  categorical variables, 359
  centroid distance, 369
  complete linkage, 369
  data preparation, 363–365
  dimension, 352
  directed clustering, 372
  discussed, 12, 91, 351
  distance and similarity, 359–363
  divisive clustering, 371–372
  evaluation, 372–373
  Gaussian mixture model, 366–367
  geometric distance, 360–361
  hard clustering, 367
  Hertzsprung-Russell diagram, 352–354
  K-means algorithm, 354–358
  luminosity, 351
  natural association, 358
  scaling, 363–364
  single linkage, 369
  soft clustering, 367
  SOM (self-organizing map), 372
  vectors, angles between, 361–362
  weighting, 363–365
  zone boundaries, adjusting, 380
auxiliary information, 569–571
availability of data, determining, 515–516
average member technique, neural networks, 252
averages, estimation, 81

B
back propagation, feed-forward neural networks, 228–232
backfitting, defined, 170
bad customers, customer relationship management, 18
bad data formats, data transformation, 28
balance transfer programs, industry revolution, 18
balanced datasets, model sets, 68
balanced sampling, 68
bathtub hazards, 397–398
behaviors
  behavioral segments, marketing campaigns, 111–113
  behavior-based variables
    ad hoc questions, 585
    aggression, 18
    convenience users, 580, 587–589
    declining usage, 577–579
    estimated revenue, segmenting, 581–583
    ideals, comparisons to, 585–587
    potential revenue, 583–585
    purchasing frequency, 575–576
    revolvers, 580
    transactions, 580
  future customer behaviors, predicting, 10
bell-shaped distribution, 132
benefit, point of maximum, 101
Bernoulli, Jacques (binomial formula), 191
biased sampling
  confidence intervals, statistical analysis, 146
  neural networks, 227
  response, methods of, 146
  untruthful learning sources, 46–47
BILL_MASTER file, customer signatures, 559
binary churn models, 119
binary classification
  decision trees, 168
  misclassification rates, 98
binary data, 557
binning, 237, 551
binomial formula (Jacques Bernoulli), 191
biological neural networks, 211
births, household-level data, 96
bizocity scores, 112–113
Bonferroni, Carlo (Bonferroni’s correction), 149
box diagrams, as alternative to decision trees, 199–201
brainstorming meetings, 37
branching nodes, decision trees, 176
budgets, fixed, marketing campaigns, 97–100
building models, data mining, 8, 77
Building the Data Warehouse (Bill Inmon), 474
Business Modeling and Data Mining (Dorian Pyle), 60
businesses
  challenges of, identifying, 23–24
  customer relationship management, 2–6
  customer-centric, 514–515
  forward-looking, 2
  home-based, 56
  large-business relationships, 3–4
  opportunities, identifying
    virtuous cycle, 27–28
    wireless communication industries, 34–35
  product-focused, 2
  recommendation-based, 16–17
  small-business relationships, 2
C
car ownership, household-level data,
calculations, probabilities, 133–135
96
call detail databases, 37
CART (Classification and Regression
call-center records, useful data
Trees) algorithm, decision trees,
sources, 60
185, 188–189
campaigns, marketing. See also
case studies
advertising
automatic cluster detection, 374–378
acquisition-time data, 108–110
chi-square tests, 155–158
canonical measurements, 31
decision trees, 206, 208
champion-challenger approach, 139
genetic algorithms, 440–443
credit risks, reducing exposure to,
link analysis, 343–346
113–114
MBR (memory-based reasoning),
cross-selling, 115–116
259–262
customer response, tracking, 109
neural networks, 252–254
customer segmentation, 111–113
catalogs
differential response analysis,
response models, decision trees
107–108
for, 175
discussed, 95
retailers, historical customer
fixed budgets, 97–100
behavior data, 5
loyalty programs, 111
categorical variables
new customer information,