  non-time series data, 246
  SQL data, 572–573
  statistics, 128–129
training sets
  coverage of values, 232
  MBR (memory-based reasoning), 263–264
  model sets, partitioning, 71
  optimization as, 230
  uses for, 52
transaction data, OLAP, 476–477
transaction processing systems, customer relationship management, 3–4
transactional records, 574
transactors, behavior-based variables, 580
transfer function, neural networks, 223
TRANS_MASTER file, customer signatures, 559
traveling salesman problem, graph theory, 327–329
trends, capturing, 75
triangle inequality, distance function, 272
trivial rules, association rules, 297
truncated data, statistics, 162
unsupervised learning, 57
untruthful learning sources, 44–48
UPC (uniform product code), 555
UPS, transaction processing systems, 3–4
up-selling
  customer relationships, 467
  marketing campaigns, 111, 115–116
U.S. Census Bureau Web site, 94
usage stimulation marketing campaigns, 111
user roles, data transformation, 58–60
V
validation
  assumptions, 67
  neural networks, 218
validation sets
  model sets, partitioning, 71
  test sets, partitioning, 71
  uses for, 52
value-added services, prediction tasks, 10
valued outcomes, estimation, 9
values
  comparing with descriptions, 65
  with meaning, data correction, 74
  missing, 590–591
variables
  data selection, 63–64
variable selection problems, neural networks, 233
variance
  analysis of, 124
  defined, 81
  neural networks, 199
  reduction in, splits, decision trees, 183
  standard deviation and, 138
  statistics, 138
variations, percent, 105
vectors, angles between, 361–362
vendor credibility, 537
virtual items, association rules, 307
virtuous cycle
  action tasks, 30
  business opportunities, identifying, 27–28
  data transformation, 28–30
  discussed, 28
  results, measuring, 30–32
  stages of, 26
visualization tools, data exploration, 65
voice node, fax machines, 341
voice recognition, free text resources, 556
voluntary churn, 118–119, 521
voting
  positive ratings, 284
  weighted, 281–282
W
warehouses, searching data in, 61–62
warranty claims data, useful data sources, 60
web crawlers, spiders, 331
Web pages
  classification, 9
  useful data sources, 60
Web servers
  cookies, 109
  transaction processing systems, 3
Web sites
  customer response to marketing campaigns, tracking, 109
  RuleQuest, 190
  U.S. Census Bureau, 94
weight columns, 548
weighted graphs, graph theory, 322, 324
weighted voting, 281–282
weighting, automatic cluster detection, 363–365
welcome periods, loyalty programs, 518
well-defined distance, distance function, 271
winback approach, customer relationships, 470
wireless communications industries
  business opportunities, identifying, 34–35
  MOU (minutes of use), 38
  rate plans, finding appropriate, 7
women, differential response analysis and, 107
word-of-mouth advertising, 283
Z
zip codes
  as categorical value, 239
  distance function, 276–277
zone boundaries, adjusting, using automatic cluster detection, 380
z-scores, 551
z-values, statistics, 131, 138