SCENE 1: It's late November
1999. The Celtics are struggling with their second lineup.
In a typical game, the team can be up by 14 points, and
when the second unit comes in, the lead is lost. It is time
for Frank Vogel to come into play. Vogel, the Celtics'
video coordinator, is in charge of running the game statistics
through Advanced Scout, a data analysis package developed
by IBM. Vogel's research confirms the coaches'
observations: The second unit's defense is holding
up, but the offense is failing. More important, the statistics
tell him that in situations where one of the star players,
Paul Pierce or Antoine Walker, is moved to the second unit,
there is no drop-off in the performance of the first unit,
and the production of the second unit increases. His recommendations
help the coaches formulate their new strategy.
SCENE 2: A patient has been having ulcer problems.
He goes to the doctor and then buys the prescription she
recommends at the local pharmacy. End of story? No. The
record of the transaction, including the drug bought, the location
of the purchase, the value paid, and the name of the prescriber
(none of which includes any of the patient's identifying
information), goes to IMS America, one of the largest
pharmaceutical market research companies in the world. The
transaction is added to a database of over 1.5 billion prescriptions
generated that year from over 33,000 retail pharmacies (and
medical mail order) which matches the prescriptions to over
600,000 physicians. With these data, the company can track
which physicians have changed their prescribing behavior,
and pharmaceutical companies can fine-tune their ulcer-drug
marketing campaign: which physicians should be visited by
medical reps and which should just receive an informational
package.
What do the two episodes have in common? They both include
the use of data-mining computer technology to search for
patterns in data. In the case of the drugstore prescription,
the software can look within the prescribing habits of doctors,
in particular therapeutic classes, to determine the characteristics
of doctors who tend to be "brand switchers" and
of those who tend to be "loyal" to particular
brands.
In the case of the Celtics, the statistical package allows
Vogel to determine which specific lineups are most effective
against another team's lineup, and under what circumstances
a player's potential is maximized. Although data mining
by itself is not going to get the Celtics to the playoffs,
Vogel can, for instance, run queries to find whether Antoine
Walker is more effective with Dana Barros or with Kenny
Anderson, by matching all of Walker's game minutes
with each of the point guards. The coaches can observe examples
of the findings on the game's video, since the program
lists exact times when the studied sequence was in play.
"It definitely has found some trends we hadn't
recognized," says Jay Wessel, director of technology
for the Celtics.
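
The kind of pairing query described above can be sketched in a few lines of Python. This is only an illustration, not how Advanced Scout works internally: the stint records, their layout, and every number in them are invented, and only the player names come from the article.

```python
from collections import defaultdict

# Hypothetical stint records: (players on the floor, minutes, point differential).
# The names are taken from the article; the numbers are invented for illustration.
STINTS = [
    ({"Walker", "Barros", "Pierce"}, 6.0, +4),
    ({"Walker", "Anderson", "Pierce"}, 8.0, -2),
    ({"Walker", "Barros"}, 5.0, +3),
    ({"Walker", "Anderson"}, 7.0, +1),
    ({"Pierce", "Anderson"}, 4.0, -1),
]


def pairing_summary(stints, player, partners):
    """Total minutes and point differential for `player` alongside each partner."""
    totals = defaultdict(lambda: {"minutes": 0.0, "plus_minus": 0})
    for on_floor, minutes, diff in stints:
        if player not in on_floor:
            continue  # only stints that include the studied player count
        for partner in partners:
            if partner in on_floor:
                totals[partner]["minutes"] += minutes
                totals[partner]["plus_minus"] += diff
    return dict(totals)


summary = pairing_summary(STINTS, "Walker", ["Barros", "Anderson"])
for partner, stats in summary.items():
    per_minute = stats["plus_minus"] / stats["minutes"]
    print(f"Walker + {partner}: {stats['minutes']:.0f} min, "
          f"{stats['plus_minus']:+d} points, {per_minute:+.2f} per minute")
```

A real analysis would also keep the game clock for each stint, which is what lets coaches pull up the matching video.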
One does not have to be an NBA player to have one's
actions analyzed by statistical software designed to detect
patterns. In fact, these sorts of applications have increasingly
come to be used in many more mundane transactions. As we
go about our modern lives, we leave a trail of data behind.
Supermarket purchases, bank transactions, credit card purchases,
phone calls, retail catalog orders, and each click of the
mouse can be recorded, stored, and analyzed. These data
reveal information about who we are: our habits, our preferences,
and what we are interested in. And, this information means
money. That is, if companies can figure out what to do with
it.
HOW IT WORKS
For the most part, what companies use data mining for is
not new. Assistant coaches have had the responsibility of
keeping an eye out for more effective team combinations,
just as pharmaceutical companies have been marketing their
products to doctors for decades, and both have been
analyzing data in order to make these decisions. But data-mining
programs allow them to analyze greater amounts of data faster
and potentially more efficiently. And, in some cases, the
more widespread use of data-mining techniques has meant
that data being collected for one purpose (or without the
individual's awareness) is being used for another purpose.
Present-day data-mining technology and its application
to business developed almost as an afterthought. The field,
which now has its own journal, an annual conference (this
year's is in Boston), and at least two regular newsletters,
has been cobbled together from several other domains, including
machine learning, statistics, and decision support.
According to data-mining experts Michael Berry and Gordon
Linoff, in the early 1980s, researchers in machine learning
(a subset of artificial intelligence focusing on writing
software that allows computers to learn by example)
began looking for commercial applications when funding for
artificial intelligence research dried up. Statisticians,
for their part, had been developing the theoretical underpinnings
for predictive modeling, sampling methodologies, and experimental
design.
At the same time, thanks to improvements in computers
and data storage capacity, and the development of new technologies
such as scanners, companies found themselves sitting
on top of piles of data. The NBA had been collecting game
statistics long before anyone thought of mining them, and
credit card companies habitually recorded for billing purposes
what, when, and where purchases were made. Similarly, supermarkets
had introduced scanners and bar codes to eliminate the need
to price items individually on the shelf and accelerate
the checkout process. It was just a matter of time before
the developers of data-mining technology realized that they
could use these data to do such things as keep track of
the combinations of products individual people bought, the
time of day certain products were likely to be purchased,
or the way people responded to special offers such as coupons.
Since then, firms that specialize in data-mining software
have been developing a variety of techniques, depending
on the particular problem and quality of the data. Clustering
is one example of what Gordon Linoff calls "undirected"
data mining, in which a program is designed to find possible
associations and similarities in the data without any specific
guidelines. In market basket analysis, for instance, the
program indicates affinities among certain products that
tend to be bought together, say a particular kind of golf
ball with a particular type of club. The software can also
be used to sort people into groups according to shared characteristics,
whether they be demographic facts, known political attitudes,
or past purchases (to name just a few). Thus is born the
"soccer mom" or the "conservative retiree."
Using this so-called "psychographic" information
can help companies to better target and tailor their products
and marketing messages to particular groups.
The patterns discovered are then evaluated by a human
analyst to decide whether the groupings give some useful
information. Ideally, such a program would be able to pick
out some previously unknown and perhaps even counterintuitive
correlations that have high market value. But this is not
always the case. Sometimes, the groupings can be so obvious
that they offer no new insight; at other times, the program
comes up with correlations that are hard to capitalize on for
marketing purposes and may even be spurious. (One widely cited example
is that people who buy diapers also tend to buy beer.)
In the end, it is up to the business analysts to decide
how to make use of the information: to place the golf
clubs with the balls in the same rack or to offer a coupon
for the balls with the purchase of the club, for instance.
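
As a rough sketch of how a market basket program might surface such affinities, the Python below counts how often pairs of products share a basket and computes their "lift," one common affinity measure. The article does not name the measure any vendor uses, so lift is an assumption here, and the baskets and product names are invented.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets; product names are invented for illustration.
BASKETS = [
    {"golf balls", "golf club", "tees"},
    {"golf balls", "golf club"},
    {"golf balls", "sunscreen"},
    {"golf club", "glove"},
    {"sunscreen", "towel"},
    {"golf balls", "golf club", "glove"},
]


def pair_lift(baskets):
    """Return {(a, b): lift} for every pair of items that shares a basket.

    lift = P(a and b) / (P(a) * P(b)); values above 1 mean the pair shows up
    together more often than independent purchases would predict.
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    return {
        (a, b): (count / n) / ((item_counts[a] / n) * (item_counts[b] / n))
        for (a, b), count in pair_counts.items()
    }


lifts = pair_lift(BASKETS)
for pair, lift in sorted(lifts.items(), key=lambda kv: -kv[1])[:3]:
    print(pair, round(lift, 2))
```

In this toy data the sunscreen-and-towel pair gets the highest lift even though it rests on a single basket, which is the kind of possibly spurious pattern a human analyst has to screen out.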
Directed methods of data mining are more widespread.
In this case, the algorithms track patterns associated with
very specific results: patterns associated with credit card
fraud, for example. Nestor, Inc., a company based in Providence,
Rhode Island, uses a technique called "neural networks"
to do just that for its banking clients.
The software learns to recognize customers' card-use
patterns, which allows it to automatically detect deviations
that may represent fraudulent transactions. To develop the
model, Nestor asks the client to provide transaction data
accumulated over four months, including every known fraudulent
transaction. First, the algorithms identify characteristics
and patterns that are likely indicators of fraud. Then,
the resulting model is fine-tuned by asking it to predict
fraud in a new batch of data. Finally, these results are
compared to the known instances of fraud so that the system
can learn from the differences. "It is like a very
small brain whose total knowledge is credit card fraud;
you are repeatedly exposing the network to these patterns
that are strengthening these connections," says Bernard
Chartier, director of modeling services for Nestor. A 60
to 70 percent detection rate is considered very good.
Once the model is in use, clients can load in new fraud
data to refresh it. Still, Nestor recommends a complete
retraining of client systems every 18 months. Says Nestor's
director of worldwide marketing, Tom Spillane, "Fraud
is a transient behavior that is always changing."
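
The learn-then-check cycle described above can be illustrated with a much simpler stand-in for a neural network. The Python sketch below learns each customer's usual spending from past purchases, flags new transactions that deviate sharply from it, and then measures the detection rate against known fraud. The z-score rule, the threshold of 3, and all of the data are assumptions made for illustration; this is not Nestor's model.

```python
from statistics import mean, stdev

# Hypothetical purchase amounts per customer, used to learn a "normal" pattern.
HISTORY = {
    "cust_a": [20.0, 25.0, 22.0, 30.0, 18.0],
    "cust_b": [200.0, 180.0, 220.0, 210.0, 190.0],
}

# Hypothetical new batch: (customer, amount, known_fraud) for checking the rule.
NEW_BATCH = [
    ("cust_a", 24.0, False),
    ("cust_a", 400.0, True),
    ("cust_b", 205.0, False),
    ("cust_b", 215.0, True),   # fraud that looks like normal spending
    ("cust_b", 900.0, True),
]


def fit_profiles(history):
    """Learn each customer's mean and standard deviation of purchase amounts."""
    return {cust: (mean(amounts), stdev(amounts)) for cust, amounts in history.items()}


def is_suspicious(profiles, customer, amount, z_threshold=3.0):
    """Flag an amount that deviates strongly from the customer's usual spending."""
    avg, spread = profiles[customer]
    return abs(amount - avg) / spread > z_threshold


def detection_rate(profiles, batch):
    """Share of known fraudulent transactions that the rule flags."""
    fraud = [(c, a) for c, a, known_fraud in batch if known_fraud]
    caught = sum(is_suspicious(profiles, c, a) for c, a in fraud)
    return caught / len(fraud)


profiles = fit_profiles(HISTORY)
print(f"detection rate: {detection_rate(profiles, NEW_BATCH):.0%}")
```

The fraudulent purchase that resembles ordinary spending slips through, which hints at why no model catches every case and why models are refreshed as fraud patterns change.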
UP CLOSE AND PERSONAL
Small businesses that don't have a lot of clients
don't have a need for massive data mining. Their owners
and operators often know the needs and tastes of their clients
better than a computer could. But in larger firms, particularly
in industries that naturally accumulate large amounts of
detailed transaction data, such as firms in banking, insurance,
telecommunications, catalog retail, utilities, and supermarkets,
applications of data mining are increasingly widespread.
Perhaps the most common application of data mining,
and one of the ones that has been around longest (since
the 1950s), is credit scoring, a statistical method
used to predict the probability that a loan applicant or
existing borrower will default or become delinquent. Credit
scoring is now widely used for consumer lending, particularly
with credit cards and mortgage loans, and is becoming more
common in small business lending (see sidebar).
Companies also mine their customer data to try to figure
out who their best customers are, and what products they
are likely to buy. They then use that information to buy
lists of potential customers with the identified characteristics
or to pitch products and promotions to particular segments
of their client pool. Many of the unsolicited phone calls,
letters, and e-mails that enter our lives on a daily basis
originate in this way.
The Vermont Country Store, a small family-owned business
based in Manchester Center, Vermont, whose catalogs offer
items that are now hard to find, such as Ovaltine and Olivetti
manual typewriters, has used data mining to increase the
effectiveness of its catalog mailings. Although he won't
say by how much, vice president of marketing Larry Shaw
affirms that the company's mailings have increased
in profitability since it started applying data-mining techniques
on its more established catalogs. Some good predictors of
how valuable a customer is to the company turned out to
be quite straightforward: how recently and how many times
a customer has ordered, for instance. But others were more
surprising. "[We found that] someone who buys products
from different categories [e.g., food and housewares]
is more profitable as a customer than someone who buys the
same number of products from the same category," says
Shaw. The company has four different catalogs, and data
mining has been successful in matching the best catalog
to each customer.
Such targeted marketing is only a small part of customer
relationship management (CRM), the latest trend in data
mining. Mining their customer transaction data (often
augmented with additional demographic information),
companies calculate a "lifetime value" for each
customer. Knowing the traits of their most valuable customers
not only allows firms to try to acquire more customers with
similar characteristics, but also lets them know which customers
they want to spend money and effort on retaining and which
ones they are willing to let go. It also flags cross-sell
opportunities, telling companies which of their customers
are most likely to purchase from other product lines. And,
it helps single out lower-profit-segment customers who are
likely candidates to be upgraded to a higher "platinum"
or "gold" category. (Companies can then mail them
offers for higher value products, expand their credit line,
or entice them to join frequent-buyer rewards programs,
for instance.)
In practice, this means that companies have to ensure
that they can recognize the customer through any of their
channels of contact, whether in a branch, on the web, or
through the call center's different departments (sales,
customer service, and so on), so that they can treat the
customer accordingly. For instance, according to Business
Week, Sanwa Bank segments its customers into three categories,
A, B, and C. When you call or e-mail the bank to ask
it to waive the fee on a bounced check, the customer
representative gets your score within seconds. If you are
among the business's most profitable customers (an
A), the fee is waived without questions. If you are in the
least profitable category, you are less likely to be forgiven.
If you are somewhere in the middle, you will have to negotiate
with the representative.
Sanwa Bank offers just one example of what many other
banks are doing. This type of segmentation can affect even
how long you have to wait on the phone. In some firms, automated
systems can recognize the customer's category the moment
the caller punches in an account number or any other identifying
information. If you are in the most profitable category,
you may be automatically jumped to the head of the queue,
leaving less profitable customers on hold. Or, if you have
been identified as a potential buyer of additional services,
your call may be routed to a representative with experience
in selling that product, regardless of the motive
for your call.
Companies are understandably reluctant to disclose the
effectiveness of their customer relationship management
strategy. But, according to a recent Wall Street Journal
article, Harrah's Entertainment Inc., a Las Vegas-based
casino business, has seen its revenues more than double.
Using information gathered through electronic frequent-gambler
cards, the casino learned that gamblers who spent
a relatively modest $100 to $499 per trip, about 30 percent
of gamblers, accounted for the majority of the casino's
profits. Armed with this information, the casino proceeded
to experiment on how to increase this group's loyalty
at the least cost, by testing different promotions on them.
For instance, they found that an offer of $60 in chips got
people to gamble much more than the more expensive promotional
$125 package of a free room, two meals, and $30 in chips.
And, the rate at which people responded to their mail offers
more than doubled, from 3 to 8 percent.
Fraud detection, credit scoring, targeted marketing, and
customer relationship management applications are now the
most common applications of data mining. But, as the NBA
example showed, they are not the only ones. Data-mining
techniques are also being used to improve manufacturing
processes, develop new drugs, and relate information in
the human genome to particular diseases, among other things.
New and novel applications are constantly appearing. The
ability to mine files of text such as e-mails or news reports
is one of the promising fields, according to Mark Brown,
SAS global data mining program manager. For example, Nestor
has already been approached by a client to develop a program
that would assess the content of customer complaint calls
and help predict which are likely to result in litigation.
THE INTERNET: DATA MINING GOES ON STEROIDS
But it is on the Internet where data-mining applications
are creating the greatest stir. As Jeff Averick, data quality
specialist with DiscJockey.com in Salem, Massachusetts,
puts it, when it comes to the Internet, "this thing
goes on steroids." With cookies (tiny data files
created on a user's hard drive in response to a command
from a website, which allow that website to recognize the
user's computer on every visit) and other
technologies that can track customers' every activity,
the opportunities for customer profiling on the web are
almost limitless. With so much information, the firm can
seek to drive segmentation to a "category of one": instead
of dividing clients into, say, three categories based on
lifetime value, companies can aim to personalize and customize
their customer interactions and their marketing pitch to
each individual. And all this can happen as quickly as the
time it takes to click on to the next screen.
Since the Internet also makes it easier for customers
to hop from business to business and shop for the lowest
price, specialists argue that e-businesses have greater
need for data mining. By giving personalized service, firms
aim to gain their clients' loyalty. So, a person who
likes receiving tailored book recommendations from Amazon.com
might be less likely to try other sites.
Much of the personalization on the web today uses relatively
simple techniques, according to Stern Business School professor
Foster Provost. One of the ways in which Amazon.com makes
book recommendations is simply by identifying the most frequently
purchased books by customers who also bought the book you
are browsing. But more elaborate programs are evolving and
spreading. Nestor has just launched a product that scores
each click of the mouse on the probability that the behavior
is going to result in a purchase. When different people
enter a web site that uses this technology, they are shown
different offers based on their clicking behavior. A browser-based
tool will make recommendations of what to show the potential
customer, based on the score, while he or she
is on the site. "When you buy milk, there is a probability
that you are going to buy cookies, so we are going to present
you with the option to buy them," says Chartier.
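
The simple recommendation approach Provost describes, listing the titles most often bought by customers who also bought the book being browsed, can be sketched as follows. This is an illustration of the idea only, not Amazon.com's actual system; the customer IDs, titles, and tie-breaking rule are invented.

```python
from collections import Counter

# Hypothetical order histories; customer IDs and titles are invented.
ORDERS = {
    "c1": {"Book A", "Book B", "Book C"},
    "c2": {"Book A", "Book B"},
    "c3": {"Book A", "Book D"},
    "c4": {"Book B", "Book C"},
    "c5": {"Book A", "Book B", "Book D"},
}


def also_bought(orders, browsing, top_n=2):
    """Most frequently purchased other titles among buyers of `browsing`."""
    counts = Counter()
    for titles in orders.values():
        if browsing in titles:
            counts.update(titles - {browsing})
    # Sort by count (descending), then title, so ties resolve predictably.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


print(also_bought(ORDERS, "Book A"))
```

Scoring each click in real time, as the Nestor product is said to do, would require a predictive model on top of counts like these.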
Or, just take a virtual stroll to www.sas.com,
the web site of SAS Institute (one of the market leaders
in data-mining software), as this author did in the
process of her research for this article. A search for information
on data mining eventually takes you to a page where you
can register to download "white papers" on different
topics. As you fill in the form, you notice that the category
for data mining (which you just spent the last 10 minutes
browsing) is already selected for you. (Of course, you do
have the option to mark other categories offered on the
page.) Not only that, but the next day when you turn on
your computer, you have received an e-mail from a SAS employee
who says they have noticed your interest and offers you
their contact information in case you have any questions
or wish to place an order.
And this is just the beginning. In the future, systems
may be able to design experiments automatically and get
results on the fly, says Provost. Companies may develop
learning systems that choose a segment of customers on whom
they want to try a new scheme, get the instant results,
learn from their behavior, and improve on the next try,
all on an automated basis. Thus, the program would be running
experiments like those conducted by Harrah's Entertainment
Inc. to determine which offer is going to get the best response
at the lowest cost from specific customers, only most
of the process would be automatic.
MINEFIELDS FOR DATA MINERS
As promising as the field may be, data mining is not without
its pitfalls. "The quality of the data can make or
break the quality of the data mining," says Jeff Averick.
"You can have all the great algorithms and technology
but if you can't rally the data to the cause, the algorithms
can lead you in the wrong direction." Oftentimes, the
data are a proxy for something else that is likely to be
linked to a purchase decision. The address may be associated
with wealth or income, for instance. But, if the data are
not a good proxy or not sufficient (what if you live at
that address as a nanny, butler, or gardener?), data mining
can give false results or you can misinterpret them. And
this means that you would not only be wasting your time,
but you may also end up taking counterproductive measures.
In order to mine their information, companies first have
to integrate, extract, transform, and cleanse data to serve
a purpose for which it was never intended. Handling the
massive amounts of data, ensuring accuracy, and integrating
data gathered from all different entry points is a time-intensive
and costly endeavor, particularly for old-line companies
with legacy systems from different parts of the business
that have to be made to talk to one another.
Moreover, to get value out of the data mining, companies
must be able to change their mode of operation and maintain
the effort. In the case of supermarket loyalty cards, for
instance, the commitment has to be "one that will endure
because of the enormous amount of mailing and chronicling
you have to do," says Bernard Rogan, spokesman for
Shaw's Supermarkets, which recently acquired Star Supermarkets
and their Star Advantage loyalty card program. If the company
lets the loyalty program languish, customers might start
wondering why all the information about their purchases
is being collected. The company will then be in a bind,
because loyalty programs are also hard to end. Customers
who have been choosing to fly on a particular airline to
accumulate a given number of miles will not appreciate it
if the program is curtailed or changed before they reap
the rewards.
But perhaps the most important challenge to the spread
of data-mining applications is the growing concern over
privacy. Unease about how private firms acquire and handle
data has been on the rise, particularly since the early
1990s when public uproar forced Equifax and Lotus Development
Corporation to cancel the sale of their Lotus Marketplace:
Households, a series of disks containing the names,
addresses, buying habits, and income information of about
120 million Americans.
Companies that use data mining for target marketing are
often walking a tightrope between personalization and respect
for privacy. The actions companies have taken to know their
customers better and use this information have, in some
cases, backfired. In its attempt to start a "friends
and family" program in the United Kingdom, British
Telecom mailed its customers a "five favourite calls"
list with the most frequently dialed numbers in each account.
According to the British magazine The People, this
resulted in a broken marriage when an unsuspecting wife
realized she didn't recognize the most frequently dialed
number from their home. The errant husband told the publication
that he was considering suing BT for having blown the whistle
on his carefully concealed 20-year affair.
Going beyond the potential backlash of the market, privacy
advocates and the Federal Trade Commission (FTC) have been
pushing towards stricter rules such as those applied in
European Union countries. At a minimum, the guidelines proposed
by the FTC state that companies must disclose their information
practices before collecting any personal information and
that consumers should have a choice as to whether and how
personal information may be used. Also, the FTC states that
consumers should be able to view and contest the accuracy
and completeness of the data collected about them.
But the implementation of these guidelines in data mining
is not always straightforward and can be costly to companies.
Firms would have to let customers know that they are using
billing and account information (to name a couple of categories)
for mining purposes. Yet, companies often don't know
specifically what they are going to do with this information
until the data-mining process reveals patterns in the data.
Moreover, providing customers with access to the data in
an intelligible form can be costly and cumbersome. And it
can raise the very privacy concerns it is designed to appease:
How does a company guarantee that the person who requests
to review and correct the information is really the person
whose data was collected? And, if a company guarantees that
it will not share the data with others, what happens to
the data when a company is bought or goes into bankruptcy
and has to sell its assets?
In a sense, technology is outrunning the ability of our
legal system to handle the ethical and property issues that
arise. As privacy expert Jason Catlett sees it, data mining
is pushing the definition of privacy from individuals'
claims over determining what information about them is communicated
to others to include determining what information is created
by others.
Also, the technology that renders data useful as a source
of information makes it more valuable as a commodity that
can be sold. Defining which information is personal and
owned exclusively by the individual and which can be owned
by companies, as well as the guidelines for what can
be done with the different types of information, remains
a challenge for the future.
As the rules are laid out and the technology becomes more
widespread, data mining could have an impact on the efficiency
with which companies cater to the preferences of individual
customers, in the same way that it has been improving the
efficiency with which loans are evaluated, fraud is detected,
and NBA coaches formulate their strategies. Better targeting
can reduce costs for companies and offer customers products
they are more likely to buy, reducing the amount of junk
mail. However, there can still be winners and losers, as
those who turn out to be in the least-profitable segment
for a company will see their options reduced, as those
bank customers whose late fees are not forgiven can attest.
And the models are not fail-safe. For all the talk of
prediction, companies cannot impel you to buy their products;
they can only try to pitch their best offer. And in many
ways, this is not so different from the corner grocer of
yore who could greet you by name and tell you that the apples
you like so much are especially juicy and ripe this week.
And for you, just for you, he will cut you a special deal.
KEEPING SCORE WITH DATA MINING
Credit scoring has benefited both banks and consumers by
reducing the time needed to approve loans and the costs
of evaluating them. Using credit bureau data, credit scoring
tries to isolate the effects of various applicant characteristics
on loan delinquencies and defaults. The end result is a
"score" that allows banks to grade applicants
in terms of risk and determine a threshold below which they
decide it is too risky to lend. Each lending institution
does not have to build its own credit model; companies such
as Fair, Isaac, and Co., Inc. of San Rafael, California,
produce scores that smaller lenders can use.
But, do all customers benefit? Like all criteria used
for evaluations, credit scoring is open to the question
of whether the score gives a fair and accurate reflection
of the creditworthiness of the potential borrower. Some
consider credit scoring a more objective process that helps
ensure that the same underwriting standards are applied
to all borrowers. By law, scores cannot consider a person's
ethnic group, religion, gender, marital status, or national
origin. In determining an applicant's score, Fair, Isaac,
and Co., Inc. evaluates five main parts of people's credit
reports: payment history (i.e., late payments, bankruptcies),
amount owed, length of credit history, new credit, and
types of credit in use.
Nonetheless, minorities have lower credit scores on average
than white applicants. Scoring industry representatives
say that this is because factors that affect a borrower's
ability to meet financial obligations, such as income, property,
education, and employment, are not equally distributed by
race or national origin in the United States.
Still, models can only be as accurate as the underlying
data. Mistakes will affect the results. And how well the
scores assess risk depends on whether history can accurately
predict the future. If a large change occurs that has not
been accounted for (say, in the cultural acceptance
of bankruptcy), accuracy will drop. Moreover, applicants
with no credit histories are excluded from scoring models.
Thus, increasing reliance by lenders on scoring can create
barriers to credit for these people.
Skeptics argue that in the case of scoring for small business
loans, lending to low- and moderate-income areas may be
limited by scoring as these areas tend to be underrepresented
in the samples used to build the models. But one recent
study by the Atlanta Fed found that, after accounting
for community and bank characteristics, banks that
used scoring did not lend significantly less in low- or
moderate-income areas than in high-income areas.
HOW MUCH ARE YOU WILLING TO PAY?
Collecting and tracking large amounts of customer information
can help companies better serve their clients. Aside from
allowing them to offer a more personalized service, it can
make it easier for businesses to ensure that the right products
are available to the right people at the right time. But
it can also help them determine how much each customer is
willing to pay and charge accordingly.
Charging customers differently is not new. Many businesses
have found ways to charge customers different prices for
essentially the same product. At any point in time, for
instance, a large U.S. airline carrier is likely to have
20 or more fares available on a given route, according to
economists Severin Borenstein and Nancy Rose. (And, this
variety of fares refers only to direct coach class travel
in the largest U.S. service domestic markets.)
In order to do this, companies must find ways to sort
their customers according to their willingness to pay for
the product, and they must seek ways to make customers reveal
their preferences. In the case of the airline industry,
a good example is the Saturday night stay requirement. By
offering lower fares on flights including a Saturday night
stay, airline companies can sort the business travelers
(who are more likely to want to be back home by the
weekend and who are willing to pay more in order to suit
their business needs) from the vacationers who, given
a high enough price, might decide to drive or not leave
their hometown at all.
Through collecting customer information and applying data
mining, companies can better figure out such preferences.
They can more accurately target discounts or promotions
to particular segments of their client pool, effectively
changing the price. Or, they can better figure out ways
in which they can tailor a product, say create a number
of versions designed for each customer segment, in order
to charge different rates for the same underlying good.
In a competitive market, one would expect that prices
for the same good would converge. (The higher fares would
not persist because rival companies could gain market share
by charging lower fares.) Still, one can find different
prices for the same good even on the Internet, where you
might expect a smaller dispersion in prices because comparison
shopping is easier and less costly. MIT professors Erik
Brynjolfsson and Michael D. Smith found that Internet retailer
prices differed by an average of 33 percent for books and
25 percent for compact disks.
Companies clearly gain when they manage to charge customers
according to how much they are willing to pay. If you, as
a customer, get passed over for discounts or get offered
the higher-priced packet, an improved ability on the part
of businesses to charge different prices might not seem
such a great innovation. But not all customers necessarily
lose. If the ability to extract different prices means that
consumers who might not have had access to the product under
a single price scheme can now buy it, then those consumers
will gain. Moreover, points out Stern Business School professor
Foster Provost, as consumers increasingly realize the value
of their information, companies may find that they have
to start giving greater incentives in exchange for such
information.