New Trends in Predictive Analytics

Information Builders
March 28, 2011

I just came back from the Predictive Analytics World conference in San Francisco, which was attended by 500 people, a significant increase for a conference that typically draws about 200 guests. I go to conferences to find out about new developments and to feel the pulse of the consumer on both past and future developments. I’m less interested in hearing about emerging trends such as new complex algorithms that improve the accuracy of our statistical inferences, as I find these discussions too complex and esoteric. In my experience, however, this is what data mining conferences are about: complex and esoteric stuff. That’s part of why I found this year’s Predictive Analytics World to be such an interesting show.

This time there was no mention of esoteric new trends in the keynote address by Eric Siegel. The discussion was about 101 use cases, restated in business terms. The complex math was replaced by ROI discussion, lift charts were converted to the simple percent gains charts familiar to business people, and so on. Instead of feeling the loss of the esoteric content, the attendees were thrilled to learn about the actual applications and their impact on business. There was a sense of relief that this year’s Predictive Analytics World offered technical experts a way to communicate the usefulness of their work to other professionals. All in all, the conference was an initial attempt to establish a common business lingo that allows users to understand useful statistics and integrate them into the decision-making process.

For me, the key takeaway was: “Do not torture the audience with esoteric stuff. Instead, translate it into business terms.” This statement is a return to how economists were advised to write at the beginning of the twentieth century, when the discoveries and policy recommendations of the nascent science were sure to fade into obscurity unless politicians, business people and the affected masses understood them intuitively. Hence, in those early days, economist Alfred Marshall advised his students to use math formulas to clarify and verify their ideas, but then to write their conclusions in simple and understandable language. This advice is particularly relevant in our industry if we want the interest in predictive modeling to continue to grow. Unless what we do is well understood, it will not be applied. So I hope that this trend is here to stay.

The keynote speaker emphasized five applied areas in which the benefits of predictive modeling are well known, the methods are well established, and the use cases are clear, so that success is only a matter of execution.

  • Churn and response modeling for sales and marketing operations: For those not versed in marketing speak, churn is about whether a customer will leave and response is about whether a customer will buy. So why talk about logistic regression, a method for modeling churn and response that few business users understand, when they are only interested in the answers to those two fundamental operational questions?
  • Fraud is widespread across industries. The cases include credit card activations, invoices, tax returns, online activities, insurance claims, telecom call activities, checks and clicks on paid ads. So much of the debate in this field is about formulas and techniques that practitioners often forget that the business sponsors of fraud detection are only interested in two simple benefits: (1) not letting thieves get away with it, and (2) reducing the staff needed to monitor fraud.
  • As with fraud, the discussion of predictive modeling in insurance is littered with formulas. As a result, the reaction of business sponsors varies widely, ranging from admiration for the esoteric to complete rejection. But the goal is simple: practitioners expect to reduce the loss ratio by avoiding unexpectedly high claims.
  • Credit card operations also have a simple business problem to solve: getting paid back.
  • And while the systematic applications of predictive modeling in healthcare are relatively new, the fundamental applications are similar to those in the other four areas. It is all about customer risk, and in healthcare this is the risk of readmission, which can be reduced by identifying high-risk patients and monitoring them.
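
To make the idea of restating a model in business terms concrete, here is a minimal sketch of how scores from a churn model might be turned into the kind of percent gains statement mentioned earlier. It is not taken from the keynote. The customer scores and outcomes are invented for illustration, and the function names cumulative_gains and lift are my own.

```python
# Hypothetical scored customers: (model score, did the customer actually churn?).
# These ten records are made up for illustration only.
scored = [
    (0.91, True), (0.85, True), (0.80, False), (0.72, True), (0.65, False),
    (0.51, False), (0.40, True), (0.33, False), (0.20, False), (0.10, False),
]


def cumulative_gains(scored, fraction):
    """Share of all churners found in the top `fraction` of customers by score."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, round(len(ranked) * fraction))
    found = sum(1 for _, churned in ranked[:cutoff] if churned)
    total = sum(1 for _, churned in ranked if churned)
    return found / total


def lift(scored, fraction):
    """Gain divided by the share contacted: the improvement over picking at random."""
    return cumulative_gains(scored, fraction) / fraction


if __name__ == "__main__":
    for fraction in (0.2, 0.4, 1.0):
        gain = cumulative_gains(scored, fraction)
        print(
            f"Contact the top {fraction:.0%} of customers -> "
            f"reach {gain:.0%} of churners (lift {lift(scored, fraction):.1f}x)"
        )
```

On this toy data, contacting the top 20% of customers reaches 50% of the churners. That sentence is the percent gains view; the lift figure is the same information expressed as a ratio, which is why business people tend to find the former easier to read.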

It’s evident that the problems across all areas are fundamentally the same: customer and operational risk. And as in any mature science, rather than solving new problems, practitioners have focused on refining the solutions to existing ones. In predictive modeling that has led to the development of more and more esoteric formulas. Not that this is bad. What is bad is that the discussion of the formulas has taken over the discussion of the business problem. By doing this, we have inadvertently alienated business users and missed an opportunity to attract more people, build an analytic culture, focus on process improvements and embed advanced analytics into applications for operational users. I am glad that we are beginning to re-focus our efforts in this direction in order to make predictive analytics more pervasive.