Tuesday 31 December 2013

A Small Presentation on Big Data

In eight minutes I make the connection between marketing, information management and Big Data to position the real value of it and separate it from the hype.
Click here for the presentation.

Wishing you a great 2014, in which you make decisions based on facts and data and are successful in all your endeavours.

Kind regards,


Thursday 19 December 2013

Download our Webinar slides from the ITMPI

We have posted the slides of two webinars on our Lingua Franca site for you to download:

How Business Analysis for Business Intelligence Creates Strategic Value
How to Keep Business Intelligence in Sync with Your Strategic Priorities

Send us your feedback; we look forward to reading your comments.

Thursday 14 November 2013

Business Intelligence has become too big to allow failure.

Four speakers between Ralph Kimball’s sessions, four topics and one unifying thought: BI is getting too big to allow failure. The first Business Analytics for All Insight session, which took place in Brussels on 12 November, gathered over 250 attendees to hear Ralph Kimball’s insights on data warehouse design principles and how the Big Data phenomenon fits into this architecture. I gladly refer to the Kimball Group’s website with articles like these for his vision on Big Data.

But between Ralph’s talks in the morning and in the afternoon, four other topics were discussed which all lead to the same conclusion: BI has become too big, too much of a strategic commitment to allow for sloppy business analysis and project management.
Annelies Aquarius, European BI Project Manager at the Coca-Cola Company, illustrated the anytime-anything-anywhere aspects of mobile BI. Jelle Defraye from Laco made a case for self-service BI. Jos Van Dongen from SAS taught us the basics of data visualisation, and Guy Van der Zande from USG ICT Professionals explained why a well-organised BI Competence Center (BICC) is essential to manage technology trends and changing business requirements.

For a full description of their talks we refer to the website.

It is time for proper BI business analysis and project management

Let me explain my point. With the growth in users, user types and data, a lot of side effects have come into play since the early days of DSS, when you offloaded a few tables to make reports for the CFO.
Exabytes of data flowing in at incredibly high speeds from a myriad of data sources in structured, semi-structured and unstructured formats need to be exploited by more people in a faster decision making cycle, which is no longer limited to the strategic apex. Thus the feedback loops become more complicated, as the one-to-many relationship between top management and the workforce becomes a many-to-many relationship between more and more decision making actors in the organisation. Self-service BI, mobile BI and visualisation are all part of the solution, and part of the problem if your organisation has no duopolistic governance from IT and the business, because both business processes and data management processes need to be mutually adjusted to allow for maximum return on investment. The alternative is chaos. So there you have the true value of a well-working BICC.
But to get there and to stay on that level, only a thorough business analysis process and the proper BI project management method will increase the success rate of business analytics. This success rate worries me, because after twenty years in this business I am still seeing failure rates of 80% in BI. If we’d had the same rate of improvement in medicine as in BI, we would still be using leeches and bleeding our patients regardless of the disease.

Tuesday 22 October 2013

Interview with a Business Intelligence User

Let’s call him Eric. Because after the interview Eric decided he’d better remain anonymous. Some of his answers could cause too much controversy in the organization, a major European logistics company.
Eric is BI manager in this company and when listening to his vision, his worries and his concerns, it is like taking stock of the most common disconnects between IT and the business.

Question: What struck you the most when reading the book “Business Analysis for Business Intelligence”?
Eric: I think you have documented your book well and chose a useful starting point. Most literature on Business Intelligence (BI) falls into two categories. On one side you have a myriad of theoretical works on strategy and management, performance management and the inevitable scorecards and dashboards. On the other side there are plenty of technical publications discussing IT performance and optimum data structures. What many of these books lack is a vision of how business and IT should join hands to produce optimum BI results. From my 20 years’ experience with BI, this is a serious problem.

Question: What are the major impediments for your performance as a BI manager?
Eric: I see three roadblocks. First, IT is either unaware or unwilling to admit that BI cannot be standardized. Second, the business itself is not always capable of producing the crisp and consistent definitions needed for a coherent analytical framework. And last but not least, the complexity of some analytics also causes a lot of problems and is, of course, compounded by the two previous roadblocks.

Question: Why would IT not be aware of the need for flexibility? Some IT guys we know say stuff like “The business guys always change their mind”
Eric:  No, it’s not about business changing its mind because that can be prevented through thorough analysis as described in your book. It is more about the prejudice that BI solutions are templates you can use anywhere. IT people underestimate the uniqueness of each business process and its context, culture and informal issues that make every business unique. Management can shift its attention and rearrange its priority list in days and weeks. If IT can’t follow, the users look for ad hoc (and often badly architected) solutions.

Thank you for sharing this with  us, Eric. 

To our readers: don’t hesitate to share your experience with the gap between business and IT in BI. We can all learn from this!

Friday 30 August 2013

Book Review Retail Analytics, The Secret Weapon by Emmett Cox

by Emmett Cox, Wiley and SAS Business Series, 2012

The book cover opens with promising references, including one from Tom Davenport, and all this on just 142 pages of 10-point Times New Roman with double line spacing. Wow! This needs further reading, I guess. Chapter One gives a very high-level introduction to the main process, sales and replenishment, and Chapter Two introduces the ins and outs of retail data management. A diagram of the transaction log files could improve the didactic impact of this chapter, and the comparison between denormalised and normalised data is at best superficial. But I guess the readers who, according to the editor, are helped with their critical management decisions don’t bother much about technical mumbo jumbo, so let’s move on. It does inspire me, though, to publish an in-depth blog on the pros and cons of these data models that might interest senior-level management, because it is about money, time and quality, and there is no one-size-fits-all choice to be made.

Inspiring introduction

Cox sums up a few interesting case studies ranging from trade area modelling, site selection modelling, competitive threat analysis, merchandise mix modelling, marketing effectiveness tracking, brand analysis, clicks and mortar analysis and cross-sell analysis to market basket analysis. And he does so in plain English, which at least will trigger the interest of senior management. The schema on page 25 gives a quick and clear view of how the analytical domains are part of a chain driving important retail and merchandising decisions. While reading these case studies you are confronted with a very mild and unobtrusive form of product placement, with sentences like “The statistical software company SAS has a strong set of utilities within its SAS Enterprise Miner that we used heavily” (p. 33). Yet the case studies are very credible and well presented, even if they don’t go in depth into the statistical details. I guess the target group of this book will tell guys like me: “Find out how we can apply this method in our stores”.
Chapter three pays attention to the specifics of the apparel industry and does so in a didactic manner, clearly illustrating the constraints apparel buyers have to work with: long order lead times, optimum price elasticity management, optimum sales promotion management,… Having worked in sales promotion for a mail order company doing 80% of its sales in fashion, I recognised the stressful situations even the best retail analytics won’t attenuate.
The fourth chapter spends quite a few pages on explaining what a GIS tool can do and how it works, which is a bit redundant in the days of Google and Wikipedia, leaving only a few pages for the application of GIS in retail. For example, the combination of category management and GIS analytics can produce powerful results, which I am sure the author could illustrate to his readers. It is as if Mr. Cox apologises for this in the fifth chapter, which opens with “Chapter 4 was an extremely technical illustration of the GIS tool.” Er… well… I wasn’t bothered by the points, vectors and surfaces of the GIS world, I just wanted to know more about the applications…

A 101 on in-store promotion

In-store marketing and presentation get a lot more attention, with 31 of the 142 pages. Pricing strategies are explained very well, as they are the meat of retail analytics. But Cox’s emphasis on convenience shopping hides its counterpart: fun in shopping, which also needs tailor-made analytical insights to enhance the profitability potential. His reference to DB2’s datablades brought me back to the heyday of Informix, once the leading RDBMS before Oracle displayed a superior sales and marketing strategy. All in all, this chapter is a little too narrowly focused on the US situation. Phrases like “Most retail chains are now open 24 hours a day…” (p. 98) will raise eyebrows in Europe, and sometimes the author gets lost in details like the description of how floor graphics can be removed “without leaving any residue behind”. I wonder who in the analytical community will bother about this. And when I read stuff like “another form of subliminal messaging” (p. 101) related to clearly detectable things like baking smells, I get confused. “Subliminal” means a stimulus the audience is not aware of. And literature and research are not in unison about the effects of subliminal messages in stores.
Chapter 6 handles store operations, an underestimated source of information opportunities. Think of workforce management, where you optimise the match between labour cost and service level based on POS data and longitudinal time series. Think of differentiated messages to consumers, HVAC control, intrastore communication, replenishment: Cox explains these operations and the link with retail analytics in a clear and concise way.

Introduction to loyalty marketing

Last but not at all least, the chapter on loyalty marketing. Since Reichheld and Sasser’s publications on customer loyalty we are aware that the link between excellence, customer satisfaction and loyalty is not always obvious, and in many cases it is even non-existent! Measuring and managing customer loyalty is a real pain in retail. To me, attracting customers to loyalty programs is not equal to creating customer loyalty. It is just a form of prolonged sales promotion that, if stopped, would cause an immediate drop in sales. True loyalty is quite different. It took me a page and a half to read about it: “(…) as long as the consumers do not just cherry-pick the discounts and rewards offers.” Spot on, Emmett! Weeding out the cherry pickers is the true challenge for retail analytics. The author offers a checklist to go through before you begin a loyalty program.

To conclude

This book will certainly deliver value for novices in retail marketing and analytics. If you're a CEO or CMO of a retail chain and wonder whether you should invest heavily in an analytical environment, you are probably too late, as the competition will have obliterated your obsolete business. That doesn't mean other C-level executives couldn't benefit from this book. But to be really relevant for other profiles like a CIO or COO, it should have related data and process management to analytics with diagrams and models.

Tuesday 20 August 2013

A Short Memo for Big Data Sceptics

In an article in the NY Times of 17 August by James Glanz, a few Big Data sceptics are quoted. Here is a literal quote: Robert J. Gordon, a professor of economics at Northwestern University, said comparing Big Data to oil was promotional nonsense. “Gasoline made from oil made possible a transportation revolution as cars replaced horses and as commercial air transportation replaced railroads,” he said. “If anybody thinks that personal data are comparable to real oil and real vehicles, they don’t appreciate the realities of the last century.” I respectfully disagree with the learned scholar: the new oil is a metaphor for how our lives have changed through the use of oil in transportation. Cars and planes have influenced our social lives immensely, but why shouldn't Big Data do so to an equal or even greater degree? Let me name just a few examples:
  • Big Data reducing traffic jams (to stick close to the real oil world) 
  • Big Data improving the product-market match to the level of one to one, tailoring product specifications and promotions to individual preferences, 
  • Big Data improving diagnostics and treatments in health care, combining the wisdom of millions of health care workers with logged events in diagnostics, epidemiological data, death certificates, etc.
  • Big Data and reduction of energy consumption via the smart grid and Internet of things to automate the match between production and consumption, 
  • Big Data in text mining to catch qualitative information on a quantitative scale, improving the positioning of qualitative discriminants in fashion, music, interior decorating, etc. and of course... politics. Ask the campaign team of the 44th President of the United States and they will tell you how Big Data oiled their campaign.
As soon as better tools for structuring and analysing Big Data become available and as soon as visionary analysts are capable of integrating Big Data in regular BI architectures, the revolution will grow in breadth and depth. Some authors state that entirely new skills will be needed for this emerging market. If I were to promote training and education I'd say the same. But from where I stand today, I think the existing technological skills in database and file management may need a little tweaking, say a three or five day course, but no way is there a need for an MBD (Master in Big Data) education. On the business side of things there may be some need for explaining the workings of semi-structured and unstructured data and their V's, which already add up to seven. I believe it is going in the same direction as the marketing P's, where Kotler's initial four P's were upgraded to over thirty as one professor of marketing churned out this intellectual athletic performance. Let's sum them up and see if someone can top them:
  • Volume: a relative notion as processing and storage capabilities increase over time.
  • Velocity: ibid.
  • Variety: also a relative notion, as EBCDIC, ASCII, UTF-8, etc. are now in the company of video and speech thanks to companies like Lernout & Hauspie, whatever the courts may have decided on their Language Development Companies.
  • Volatility: I have added this one in an article you can find on the booksite on "Business Analysis for Business Intelligence", because what is true today may not be true tomorrow. So it is not about the time horizon you need to store the data, as some authors claim, because that will be defined by the seasonality. The problem with these data is there might not be any seasonality in them!
  • Veracity: how meaningful are the data for the problem or opportunity at hand?
  • Validity: Big Data can only be useful if validated by a domain expert who can identify its usefulness.
  • Value: what can we invest in recording, storing and analysing Big Data in return for what business value? This is one of the toughest questions today, as many innovative organisations follow the Nike principle: "Just do it".
And that, professor Gordon, is just what all the pioneers did when they introduced the car and the aeroplane to their society, ignoring the anxious remarks from horse breeders and railroad companies. Remember how the first cars were slower than trains and horses? I rest my case.

Friday 26 July 2013

Time to muse over “holiday”

Greetings from a 30°C office, in the north of Belgium, Flanders. It’s late in the evening, a good time for some lighter philosophising before packing the suitcase.
The DATE dimension is a very simple flat table (although some designers and DBAs prefer third normal form). It contains surrogate keys, the date, the name of the weekday, month, day number, week number, month number, quarter, trimester and semester number, IsLastDayOfMonthFlag and what have you. There will also be an IsAHolidayFlag somewhere, and in case you only work in a few countries, you simply add a column per country.
In case you operate on a global scale  you may want to snowflake a bit to support truly global analytics and compare apples with apples.
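
To make the flat variant concrete, here is a minimal Python sketch that generates DATE dimension rows with one holiday flag column per country. The column names follow the description above where possible; the `HOLIDAYS` calendar, the `IsAHolidayFlag_<country>` naming and the sequential surrogate key are assumptions made for illustration, not a prescribed design.

```python
from datetime import date, timedelta

# Fixed English weekday names, so the output does not depend on the locale.
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")

# Hypothetical holiday calendar: country code -> set of holiday dates.
HOLIDAYS = {
    "BE": {date(2013, 7, 21)},  # Belgian National Day
    "US": {date(2013, 7, 4)},   # Independence Day
}


def build_date_dimension(start, end, holidays=HOLIDAYS):
    """Return one dictionary per calendar day from start to end (inclusive)."""
    rows = []
    day = start
    key = 1  # simple sequential surrogate key (an assumption)
    while day <= end:
        next_day = day + timedelta(days=1)
        row = {
            "DateKey": key,
            "Date": day,
            "WeekdayName": WEEKDAYS[day.weekday()],
            "DayNumber": day.day,
            "WeekNumber": day.isocalendar()[1],  # ISO week number
            "MonthNumber": day.month,
            "Quarter": (day.month - 1) // 3 + 1,
            "Semester": (day.month - 1) // 6 + 1,
            "IsLastDayOfMonthFlag": next_day.month != day.month,
        }
        # One flag column per country, as in the "few countries" case.
        for country, dates in holidays.items():
            row["IsAHolidayFlag_" + country] = day in dates
        rows.append(row)
        day = next_day
        key += 1
    return rows


july = build_date_dimension(date(2013, 7, 1), date(2013, 7, 31))
```

With many countries the per-country columns become unwieldy, which is where the snowflaked holiday table comes in: the flags move to a separate table keyed on date and country.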

Example of a snowflaked Holiday model

Holidays can explain a lot

Imagine you are comparing last year’s June sales with June of this year. Last year, June had no holidays and 21 working days. This year it has 20 weekdays and one holiday, i.e. 19 working days. In case your revenue is directly linked to the number of working days, this explains a revenue drop of almost ten percent in June this year! And in case you sell holiday-related products like ice cream or beer, it may well run in the other direction, explaining a sales increase of twenty percent or more…
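
The arithmetic can be checked with a small Python sketch. June 2012 and June 2013 happen to have 21 and 20 weekdays respectively; the holiday on 17 June 2013 is a hypothetical date chosen purely for illustration, as is the assumption that revenue is strictly proportional to working days.

```python
import calendar
from datetime import date


def working_days(year, month, holidays=()):
    """Count Monday-to-Friday days in a month, excluding the given holidays."""
    days_in_month = calendar.monthrange(year, month)[1]
    count = 0
    for day_number in range(1, days_in_month + 1):
        current = date(year, month, day_number)
        if current.weekday() < 5 and current not in holidays:
            count += 1
    return count


last_year = working_days(2012, 6)
# Hypothetical holiday on a Monday, for illustration only.
this_year = working_days(2013, 6, holidays={date(2013, 6, 17)})

# Relative change in working days, and hence in revenue under the assumption.
change = (this_year - last_year) / last_year
print(last_year, this_year, f"{change:.1%}")  # 21 19 -9.5%
```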
So, here’s my advice: enjoy your holiday but don’t forget to integrate it in your analysis.

Friday 19 July 2013

When will transaction systems and analytical appliances converge?

 There is no harm in being a bit visionary sometimes…

For the last three or four decades, the gap between online transaction processing (OLTP) and business intelligence (BI) was impossible to cross. Neither the technology nor the architecture of the two systems allowed integration at the data level. So a cumbersome and expensive technical infrastructure was needed to extract, transform and load (ETL) data into data warehouses and data marts to exploit the information assets hidden in the records of the OLTP systems.
One of the main problems was that OLTP applications were never conceived with a view on BI. For example, the first Customer Relationship Management (CRM) Applications stored the customer address  as an attribute of the CUSTOMER entity instead of treating it as an entity in itself. When the first geographical information systems and geographical analytical systems came along, CRM developers remodelled the customer database and made ADDRESS a separate entity.


… a greenfield situation where you could develop any OLTP application from scratch and with an added BI perspective. What would it look like? What would be the guiding design principles to develop an OLTP application that plugs seamlessly into a BI infrastructure?
Because, don’t get me wrong, there will still be a need for a separate and dedicated BI infrastructure as well as a specific BI architecture. But ETL, master data management (MDM) and data quality (DQ) management should no longer be a pain.

What would it take to relieve us from ETL, MDM and DQ chores?

Or how the simplest things are the hardest to realise. It would take:
  • A canonical target data model and subsequently,
  • A function in any OLTP application to drop off its data in that target format for the BI infrastructure to pick it up, be it via an enterprise service bus or via a bulk load procedure,
  • A hub and spoke system where the master data objects are managed and replicated in both the OLTP and the BI applications,
  • A uniform data quality policy and procedure, with checks on accuracy, consistency, conformity and compliance with business rules of a lower order (e.g. “a PARTY > PHYSICAL PERSON must have a birth date earlier than today”), but also of a higher order (e.g. “a CUSTOMER who is under age should have a parent in the customer database whose co-signature must be on the ORDER form”).
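
The two example business rules in the last bullet could be expressed as shared checks that both the OLTP and the BI side run. The following Python sketch is only an illustration: the `Customer` and `Order` classes, their field names and the age of majority of 18 are all assumptions, not part of any existing system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

ADULT_AGE = 18  # assumed age of majority


@dataclass
class Customer:
    customer_id: int
    birth_date: date
    parent_id: Optional[int] = None  # link to a parent who is also a customer


@dataclass
class Order:
    order_id: int
    customer_id: int
    co_signed_by: Optional[int] = None  # customer_id of the co-signer


def age_on(birth_date, today):
    """Age in full years on the given day."""
    years = today.year - birth_date.year
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def birth_date_is_valid(customer, today):
    """Lower-order rule: the birth date must be earlier than today."""
    return customer.birth_date < today


def minor_order_is_valid(order, customers, today):
    """Higher-order rule: an under-age customer needs a parent in the
    customer database, and that parent must have co-signed the order."""
    customer = customers[order.customer_id]
    if age_on(customer.birth_date, today) >= ADULT_AGE:
        return True
    parent_id = customer.parent_id
    return (parent_id is not None
            and parent_id in customers
            and order.co_signed_by == parent_id)


today = date(2013, 7, 19)
customers = {
    1: Customer(1, date(1970, 5, 1)),
    2: Customer(2, date(2000, 3, 15), parent_id=1),
}
print(minor_order_is_valid(Order(10, 2, co_signed_by=1), customers, today))  # True
print(minor_order_is_valid(Order(11, 2), customers, today))                  # False
```

Keeping such rules in one place, rather than re-implementing them in each application and again in the ETL layer, is the point of the uniform policy.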

So why is this not happening?

One explanation could be that application vendors claim they have full BI functionality built into their solution. I remember a client with an ERP solution that contained over 300 standard analytics and reports, of which the client used less than 3%, eight to be exact. Not a great BI achievement I guess, but certainly an attempt to have 100% account control...

Another explanation could be that we are still far away from a canonical model integrating transaction processing with BI, although data warehouse appliance vendors will tell you otherwise. Nevertheless, the dream has been kept alive for decades. I remember a data warehouse guru from the nineties who developed what he called verticals for BI to be used in telco, retail and finance, among others. They were just target models in the third normal form to set up a corporate data warehouse, so no OLTP included. He managed to sell them to a database vendor, bought a nice sailing boat (or rather a yacht) and disappeared from the BI stage. The database vendor peddled these models for a few years, but in the years after 1999 I never came across these templates. And the vendor has never mentioned their existence since. Conclusion: we're still far away from a one-size-fits-all transaction and analytics model.

What is the second best choice?

I can come up with a few ideas that will still be hard to realise. Think of an industry-wide data type standard for transactions and BI purposes, with the transformation rules documented. For example: a timestamp in a transaction database should always be in the form of 'YYYY-MM-DD HH:MM:SS', and two numbers are added for BI purposes: the standard day number, counting from an internationally accepted date like 1900-01-01, and a second number, a figure between 0 and 86,400 (the number of seconds in a day), to represent the lowest grain of the time dimension.
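
As an illustration of such a transformation rule, the Python sketch below derives the two BI numbers from a transaction timestamp. The function name `bi_time_keys` is hypothetical, and the choice to number 1900-01-01 as day 0 (rather than day 1) is an assumption that a real standard would have to settle.

```python
from datetime import date, datetime

EPOCH = date(1900, 1, 1)  # the "internationally accepted date" from the example


def bi_time_keys(timestamp):
    """Convert 'YYYY-MM-DD HH:MM:SS' into (day number, second of day)."""
    ts = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    day_number = (ts.date() - EPOCH).days  # 1900-01-01 is day 0 (assumption)
    second_of_day = ts.hour * 3600 + ts.minute * 60 + ts.second
    return day_number, second_of_day


print(bi_time_keys("1900-01-02 00:00:01"))  # (1, 1)
```

The day number can serve as a key into the date dimension, and the second of day as a key into a time-of-day dimension at the lowest grain.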
The next step is to design industry dependent star schemas for basic analytics every organisation needs in that industry. There is after all, a growing body of knowledge in retail analytics, telco, finance, production, supply chain etc… If we can already achieve that, I will probably see my retirement date coming which is 2033-01-04 00:00:00.

Wednesday 26 June 2013

Managing Choice, a BI and CRM Challenge


When I was visiting countries behind the Iron Curtain in the period before November 1989, it struck me how simple life was there. You had the choice between two cheese types in the supermarket (if they were available), one pickled gherkin flavour,… The car market showed a little bit more choice: two Trabant types, a few Wartburgs, Skodas, Ladas, Tatras, FSO, Dacia and other Yugos… They all shared the same features: low quality and no evolution in safety, design or luxury…
Back to 2013: hundreds of cheese types in the better supermarkets and car maker Volvo alone has the capability of building over 5,000 product configurations of its mid-market model. Today, managing choice has become a shared skill, shared between producer and  retailer on one hand and consumer on the other hand.

About Choice Stress

Choice stress is a common phenomenon in developed markets because the differentiation becomes so low in granularity that we get stressed because we want both: a bio low fat yoghurt with strawberry flavour in a reusable cup with the chance to win a trip to Disneyworld but also another bio low fat yoghurt with strawberry flavour in a recyclable plastic cup with a cash back promotion. And then you ask yourself: “But where’s the pineapple variety?”
More and more, Customer Relationship Management becomes the art of dialogue with your customers to help them make the right choice in a stressless environment.
Retailers know that consumers’ main sources of stress are store related (like staff, queues, parking, products sold out, messy presentation, regular changes in the aisles,…) and choice related (mainly brand clutter and information clutter). But what are they doing about it and how does this relate to shopper marketing in the store and online?

One side of the coin: Business Intelligence in the virtual and the real world converges

From needs, occasions and solutions, how do you make the transition to the most profitable brand on your shelves?
And how do you make sure both showroomers and webroomers end up on the right web page or in the right aisle?
Business Intelligence solutions for retailers need to converge both web clicks and store visits per customer to come up with answers to these questions.
Let’s examine the enablers for these advanced analytics.

First there is an organisational aspect: make sure there are no splits in your hierarchy between online and store marketing management. Phew, that’s going to be a hard one for some organisations. You may be enthusiastic about the internal turf wars but your customer doesn’t make the distinction between your click and mortar presence, so why should you?

Second: the balance of power is shifting, so how do you adapt? In the pre-Internet era, when information was in the hands of producers and retailers, the consumer was subjected to their agenda. Now it is the other way around. Consumers create their own information about products and brands, and managing this flow of dispersed blips on the radar is quite different from the traditional broadcast, one-way marketing communication. The consumer’s knowledge of product ranges of his choice is sometimes better than the shop assistant’s. Social media may not be a good vehicle to promote any brand, but they sure are effective vehicles to break down reputations… fast and irreversibly. That is not to say that there aren’t brands and retailers effectively using social media to manage sentiments and content about their products and brands. But they are still a minority, which is the only positive message I have for the laggards: there is still time to catch up. But don’t wait too long. Initiatives like Amazon Birthday Gifts, using Facebook to have friends chip in for a birthday gift card, are just the beginning of a set of ploys coming along to digitise classical real-world interactions and channel them to the retailer who has the creativity and excellence of execution to take the first mover advantage. Big behavioural data[i] will become more and more a topic on the retailer’s agenda, but this is only one side of the coin. The other side is a new form of customer relationship management (CRM) where the social aspect is altering the classical CRM processes.
The other side of the coin: social CRM
Paul Greenberg, a recognised CRM expert for decades, captured the term in an elegant and useful definition:
"CRM is a philosophy and a business strategy, supported by a technology platform, business rules, workflow, processes and social characteristics, designed to engage the customer in a collaborative conversation in order to provide mutually beneficial value in a trusted & transparent business environment. It's the company's response to the customer's ownership of the conversation."
I’ll add my two cents to that definition: it is an extension of the existing CRM process support in the sense that it interacts sooner with the customer in the sales funnel, trying to convert information seekers and information producers into consumers.
Technology vendors like Salesforce.com and Sugar CRM have been working hard to produce support for social CRM and others are following their lead. Social CRM is about a meaningful dialogue as researchers of Penn State, Duke and Tilburg University found out.
Establish a meaningful dialogue with your customers
In their article “SOURCES OF CONSUMERS’ STRESS AND THEIR COPING STRATEGIES” (European Advances in Consumer Research Volume 4, 1999, Pages 182-187), Mita Sujan, Harish Sujan, James R. Bettman and Theo M.M. Verhallen talk about facilitating choice both online and in store, only five years after the Internet became available for commercial purposes:

Marketer Interventions for Consumers Stress and Coping.

As suggested earlier, marketers can help consumers cope with their stresses by enabling them to use more effective strategies for coping. For example, retail stores can provide more in-store personnel that stressed consumers can approach for help. Additionally, marketers can facilitate the development of consumer self-efficacy through the environments they create. One way to achieve this may be through consumer educational programs (at the point of purchase, over the web) that teach consumers skills by which to make better buying choices, use products more appropriately and to dispose them more responsibly
Conclusion: retailers become information brokers, in collaboration with producers.
Managing information online and in store and presenting this information in a timely and accurate manner will help the shopper cope with choice stress. Convenience shoppers will greatly appreciate this approach and are ready to pay a premium price for this service. Our self-service economy has become so time consuming that consumers with spending power have become more than ever aware of the time = money equation.
This means combining data from producers about product perception and experience with information shared between producers and retailers about product preferences and what I call “the shopping logistics of product choice”.
If you want to know how this is done, don’t hesitate to contact us at contact@linguafrancaconsulting.eu    

[i] I refer to my definition of the term in the article “What is Really “Big” about Big Data” which you can find here.  

Tuesday 18 June 2013

Gurus of BI in Oslo

Sound and vision in the country of Grieg and Munch

On 10 June, Oslo Spektrum was packed with 400 attendees for the second Gurus of BI conference. Yours truly made a small contribution. My presentation on BI and workforce management can be found here.
I guess both Edvards would not have been impressed by the quality of the sound and vision but at least the PowerPoint slides are self-explanatory.

Thursday 6 June 2013

Don't you love KDnuggets?

I know, it's just a pop poll, but if the big commercial and self-proclaimed market leaders in statistical analysis can't incite their users to vote for them, then I consider this poll as interesting information.
Click on the KDnuggets link for the full story.

Wednesday 5 June 2013


Or in a clearer expression: Business Intelligence Software as a Service, What is The Future? For the last five years, Business Intelligence as a Service (BIaaS) has been promoted, but until today it hasn’t acquired a large chunk of the market. Yet, if you believe the promises made by the vendors and service providers, it should be a no-brainer: “Better! Faster! Cheaper!” all over again.

This inspired me to launch a quick poll extending the question to “outsourcing of BI” which is a more general perspective: from infrastructure, via analysis and solution design, over maintenance to the entire BI system. And this, not necessarily in the cloud, because the cloud is just another version of distancing your applications from your core operations.  The written responses of the respondents provide some extra clues why outsourced BI remains mainly restricted to infrastructure and not the data and the analytics.

On the other hand, it is not easy to describe and measure the outsourced BI market in its entirety. Is the BI functionality in cloud-based software like Openbravo, Salesforce.com and others part of this category? Some will include them; I don’t, because the essence of BI is that it delivers cross-application data and insights.

Here are the results of the poll in a tabular form:

How far can outsourcing of Business Intelligence go?
Relative (x/194 * 100%)
I outsource the entire system
Only the IT infrastructure
Only Business Analysis and Solution Design
Only the maintenance

Table 1: results of the poll in absolute and relative figures.

Let us open the debate…

…with the staunchest adversaries of outsourcing

Donna Hutcheson: A business should never outsource anything that is core to the business: its security, its data, or the strategic direction and controls.

Donna is supported by Marleen De Frenne, BI manager at bpost, the number one Belgian logistics and postal services organisation.

Marleen De Frenne: I totally agree with Donna. Only IT infrastructure. I need to have my developers very close to the business, otherwise the development would never reflect the (ever changing) business reality.

Jamie Castille: In speaking with other PMs, developers and IT managers about this very subject, I've found that the common feedback is that the cost is not always beneficial. It's an issue of what gets Lost in Translation. The communication barrier causes the project to get extended, which nullifies the so-called cost savings.

Jack Whittaker: Outsourcing BI poses issues of security, which some may find difficult to overcome. You may feel that this is paranoia, but there is no point in spending millions on data security and then shipping the data off to "the cloud" wherever that may be.

Ivan Van den Bosch: I believe specialists have a huge added value in setting up an adequate BI for a business by working closely together with the business management. Of course the infrastructure can be outsourced as long as confidentiality is secured.

Mark Notschaele: BI (outsourcing) in the e-payment industry is rather sensitive and subject to an abundance of regulatory issues.

J.C., software developer: Outsourcing is something that at first you hope "they" will do ok, then shortly after fear that "they" won't, and usually regret that "they" didn't. Thus begins the recovery.

Fig 1. The poll results in a graph


What I conclude from these remarks is also based on my own experience: BI is about creating and using context with metrics. It is also about making sense of analyses, reports and other BI products. Therefore, analysts and developers should be kept close to the business.
The infrastructure can be outsourced and you can choose from two options: put the storage and processing power in the cloud or keep the data on your servers and outsource the processing to grid computing providers.
The reluctance to outsource the entire system is understandable, but if the provider can guarantee flexibility and security and keeps your migration options open, then I can imagine BIaaS as a viable option, especially for smaller companies or for larger ones who would like to play around in a sandbox and discover new techniques, modelling patterns and analytical insights. But do make sure a seamless migration to another provider or to on-premise infrastructure remains possible. Have regular audits from independent BI consultants like Lingua Franca to assess this.

… but there are also defenders of outsourcing

Sreenivas Jayaraja, PMP: Outsourcing is a concept, and concepts can be applied to every part of your day-to-day activities. Outsourcing is just beginning and it will blossom with time. Today I do not think any model can sustain itself for the longer term without outsourcing. In-house development is becoming costlier by the day and organisations are looking for a sustainable model to beat 1) market shift, 2) competition and, most importantly, 3) technology advancement. To achieve this, outsourcing is necessary. Outsourcing does not merely mean pushing the job to the outside world but a more meaningful way of getting things done, efficiently and effectively.

Joeri Van Bogaert: In a world where powerful and knowledgeable specialists and specialized services are available I choose to outsource these activities in order to focus on the company’s core activities.


If the promising vision of Mr. Jayaraja were true, I would be an enthusiastic outsourcer. Some experience with foreign outsourcing partners on various BI projects in Europe has taught me this is only the case if you have your people managing their people. This means mixing expensive expats with low-cost, out-of-context technicians to whom you need to explain everything down to the slightest detail, or things go south and outsourcing costs you a multiple. I remember a customer needing one local guy to create a report in five days. “Could be better, faster and cheaper”, he thought. He quickly outsourced it to a large Indian IT services provider who needed six engineers for three weeks to come up with the reports. Needless to add that during these three weeks, the client’s people were on the phone, chat or mail all the time to provide basic information and local context to come up with meaningful results.

Outsourcing analysis and development in BI is the last thing I would recommend. Hardware, processing, storage, data manipulation, … anything that can run within a clearly defined out-of-context setting can be outsourced. But keep the context close to you because this is one of your last competitive advantages that can’t be, shouldn’t be outsourced.

… let’s look for some nuanced opinions

Bill Genovese: As per what is being discussed here, I agree, and it largely depends on customer local and regulatory requirements plus their business requirements. The most prevalent SO model I've seen and where I have been engaged is typically only the infrastructure. However, if BI tools (i.e. standard reporting applications from an end-user perspective) are included in this definition, then customers are also starting to outsource these applications more from a design, installation, development and support perspective. The line of delineation I've seen starts with the level of analytics required, in terms of the application analytical engines, DWH, data models and calculations required that are close to the heart of the customer's business, and the complexity of the application and information architecture design and development that is known most intimately by the business. These typically stay within the customer organization. Customers of course bring in consultants and architects on an engagement basis, but rarely have I seen E2E outsourcing of the entire BI, Analytics, Data Integration, Metadata Management, and DWH stack (including security and governance).

Dan Linstedt: Outsourcing the data lends itself to privacy issues. Outsourcing the people / team has been going on for years. Outsourcing the maintenance has been done by hiring vendor based consultants (those that know the chosen vendors' solution). In my opinion, the outsourcing components will be utilized because of cost per terabyte, but only in cases where the data set will not disclose personal information, or a breach of security won't yield any personal information. Things like Amazon RedShift are a game changer in this regard, but only if they can show that it's a protected environment. Cheers, Dan L. http://www.LearnDataVault.com

Saurabh Dwivedy: I firmly believe that only those activities should be outsourced which can be performed more competitively by a specialist. One should retain focus only on core business areas that are fundamental to the value proposition of the firm. The rest can be outsourced. In this regard, your BI system's overall design, implementation and maintenance being critical to the value proposition being offered should ideally be retained by the firm.

Dr. Henk Dijkmans: You can outsource the complete BI activities, but the outsource partner has to be integrated and respect the brand and business strategy of your company.

Michelle Mueller: Information Technology is not my area of direct expertise, however as a Financial Analyst I have worked on several financial systems that have required a level of competence. My choice was "Other" because I think that outsourcing of a Business Intelligence System can go as far as the needs and expectations of its user/stakeholder. The risk in outsourcing the management and/or analysis of a company's 'Internal Intelligence' is that by doing so it places the third party at a vantage point while simultaneously running the risk of alienating internal stakeholders; the third party manager therefore remains gatekeeper. Alternatively, a mixed approach that on the one hand leverages external expertise while integrating that level within your company promotes an interdependent strategy in addition to a value-add that may be more in line with the marketplace. Bottom line, data security should be a priority and carefully thought through if a company wants to remain competitive from both a short and long term perspective. (Toronto, Ontario, Canada)
Nothing more to add to these comments.


There is a market for outsourced BI as long as it remains a peripheral option: only the IT infrastructure, and the entire solution only if we have no other viable options, because we are a small company or because we are business users who can’t get past IT governance boards blocking any new initiative. BI people like to explore new possibilities so in the long run, when all known issues with flexibility, security and vendor lock-in are solved, BIaaS has a future. But for now, it requires a restrictive vision on outsourcing of our principal strategic asset: information for decision support and improvement of efficiency and effectiveness.

Saturday 1 June 2013

Book Review: Taming the Big Data Tidal Wave

By Bill Franks
The Wiley and SAS Business Series 2012

The Big Data Definition misses a “V”

Whenever I see a sponsored book, the little bird on my shoulder called Paranoia whispers “Don’t waste your time on airport literature”. But this time, I was rewarded for my stamina. As soon as I got through the first pages stuffed with hype and “do or die” messages, the author started to bring nuanced information about Big Data.

I appreciate expressions of caution and reserve towards Big Data: “most Big Data doesn’t matter” (p. 17) and “The complexity of the rules and the magnitude of the data being removed or kept at each stage will vary by data source and by business problem. The load processes and filters that are put on top of big data are absolutely critical” (p. 21).

They prove Franks knows his onions. Peeling away further in the first chapter, his ideas on the need for some form of standardisation are spot on.

But I still miss a clear and concise definition of what really distinguishes Big Data, as the Gartner definition Franks applies (Velocity, Volume and Variety) misses the “V” of “Volatility”. A statistician like Franks should have made some reflections on this aspect, because “Variety” and “Volatility” are the true defining aspects of Big Data.

Moving on to chapter two, where Franks positions Web data as the original Big Data.

It’s about qualitative contexts, not just lots of text strings

It is true that web analytics can provide leading indicators for transactions further down the sales pipeline, but relying on just web logs without the context may deliver a lot of noise in your analysis. Here again, Franks is getting too excited to be credible, for two reasons: you are analysing the behaviour of a PC in the case of non-registered customers, and even when you know the PC user, you are missing loads of qualitative information to interpret the clicks. Studies with eye cameras analysing promotions and advertising have shown that you can optimise the layout and the graphics using the eye movements combined with qualitative interviews, but there is no direct link between “eyeballs and sales”. Companies like Metrix Lab who work with carefully selected customer panels also provide clickstream and qualitative analytics, but to my knowledge using these results as a leading indicator for sales still remains very tricky. Captions like “Read your customers’ minds” (p. 37) are nice for Hello magazine but are a bit over the top.

I get Big Data analytical suggestions from a well-known online bookstore suggesting that I buy a Bert doll from Sesame Street because my first name… is… you guessed it. Imagine the effort and money spent to come up with this nonsense.

The airline example (p. 38-39) Franks uses is a little more complicated than shown in the book: ex post analysis may be able to explain the trade-offs between price and value the customer has made, but this ignores the auction mechanisms airlines use whenever somebody is looking and booking. Only a comparison between control groups visiting the booking site with fixed prices and the dynamic pricing group may provide reliable information.
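
A minimal sketch of how such a control-group comparison could be evaluated, assuming we simply count visitors and bookings in a fixed-price group and in a dynamic-pricing group. The function name, the counts and the choice of a two-proportion z-test are my own illustrative assumptions; none of them come from the book.

```python
import math


def two_proportion_z_test(bookings_a, visitors_a, bookings_b, visitors_b):
    """Compare booking rates of group A (fixed prices) and group B (dynamic prices).

    Returns (rate_a, rate_b, z, two_sided_p_value).
    """
    rate_a = bookings_a / visitors_a
    rate_b = bookings_b / visitors_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (bookings_a + bookings_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return rate_a, rate_b, z, p_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    rate_a, rate_b, z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"fixed: {rate_a:.1%}  dynamic: {rate_b:.1%}  z={z:.2f}  p={p:.4f}")
```

A test like this only tells you whether the two groups booked at different rates. It still says nothing about why, which is where the qualitative context comes in.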

Simple tests of price, product, promotion etc. are ideal with this Big Data approach. But don’t expect explanations from web logs. The chapter finishes with some realistic promises in attrition and response management as well as segmentation and assessing advertising results. But it is the note at the end that explains a lot: “The content of this chapter is based on a conference talk…” (p. 51)

Chapter three suggests the added value of various Big Data sources. Telematics, text, time and location, smart grid, RFID, sensor, telemetry and social network data are all known examples but they are discussed in a moderate tone this time. The only surprise I got was the application of RFID data in casino chips. But then it has been a while since I visited COMDEX in Vegas.

Moving on to the second part about technologies, processes and methods. It starts with a high level didactic “for Dummies” kind of overview of data warehouses, massively parallel processing systems, SQL and UDF, PMML, cloud computing, grid computing and MapReduce.

In chapter 5, the analytic sandbox is positioned as a major driver of analytic value and rightly so. Franks addresses some architectural issues with the question of external or internal sandboxes but he is a bit unclear about when to use one or the other as he simply states the advantages and disadvantages of both choices, adding the hybrid version as simply the sum of the external and internal sandbox (p. 125 – 130).

Why and when we choose one of the options isn’t mentioned. Think of fast exploration of small data sets in an external system versus testing, modifying a model with larger data sets in an internal system for example.

When discussing the use of enterprise reusable datasets, the author does tackle the “When?” question. It seems this section has somewhat of a SAS flavour. I have witnessed enough “puppy dog” approaches of the SAS sales teams to recognise a phrase like: “There is no reason why business intelligence and reporting environments, as well as their users, can’t leverage the EADS (Enterprise Analytic Data Set (author’s note)) structures as well” (p. 145). This is where the EADS becomes a substitute for the existing (or to-be) data warehouse environment and SAS takes over the entire BI landscape. Thanks but no thanks, I prefer a best of breed approach to ETL, database technology and publication of analytical results instead of the camel’s nose. A sandbox should be a project based environment, not a persistent BI infrastructure. You can’t have your cake and eat it.

The sixth chapter discusses the evolution of analytic tools and methods and here Franks is way out of line as far as I am concerned. “Many of the commonly used analytical tools and modelling approaches have been in use for many years. Some, such as linear regression or decision trees, are effective and relevant, but relatively simplistic to implement”? (p. 154) I am afraid I am lost here. Does Franks mean that only complex implementations produce value in big data? Or does he mean that the old formulas are no longer appropriate? Newsflash for all statisticians, nerds and number crunching geeks: better a simple model that is understood and used by the people who execute the strategy than a complex model (running the risk of overfitting and modelling white noise) that is not understood by the people who produce and consume strategic inputs and outputs… Double blind tests between classical regression techniques and fancy new algorithms have often shown only slight or even negative added predictive value. Because models can only survive if the business user adds context, deep knowledge and wisdom to the model.
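
To make the overfitting risk concrete, here is a small sketch under my own assumptions: hypothetical data with a linear signal plus noise, a plain least-squares line as the “simple” model, and a one-nearest-neighbour memoriser standing in for an overly flexible “complex” model. It is an illustration only, not a benchmark of any real tool and not the shootout described in this review.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def nearest_neighbour_predict(train_xs, train_ys, x):
    """Return the y of the closest training point: a model that memorises noise."""
    idx = min(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x))
    return train_ys[idx]


def mse(predictions, actuals):
    """Mean squared error between predictions and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)


def compare_models(seed=42, n=500):
    rng = random.Random(seed)

    def sample():
        # Hypothetical data: a linear signal plus Gaussian noise.
        xs = [rng.uniform(0, 10) for _ in range(n)]
        ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]
        return xs, ys

    train_x, train_y = sample()
    test_x, test_y = sample()
    intercept, slope = fit_line(train_x, train_y)

    def line(x):
        return intercept + slope * x

    def memoriser(x):
        return nearest_neighbour_predict(train_x, train_y, x)

    return {
        "line_train": mse([line(x) for x in train_x], train_y),
        "line_test": mse([line(x) for x in test_x], test_y),
        "memoriser_train": mse([memoriser(x) for x in train_x], train_y),
        "memoriser_test": mse([memoriser(x) for x in test_x], test_y),
    }


if __name__ == "__main__":
    for name, value in compare_models().items():
        print(f"{name}: {value:.3f}")
```

On data like this, the memoriser reproduces its training set perfectly and still does worse than the straight line on data it has not seen. That is the white-noise modelling I am warning about.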

I remember a shootout in a proof of concept between the two major data mining tools (guess who was on that shortlist!) and the existing Excel 2007 forecasting implementation. I regret to say to the data mining tool vendors that Excel won. Luckily a few pages further the author himself admits: Sometimes “good enough” really is! (p. 157)

The third part, about the people and approaches, starts off on the wrong foot: “A reporting environment, as we will define it here, is also often called a business intelligence (BI) environment.”

Maybe Franks keeps some reserve using “is also often called” but nevertheless is reversing a few things which I am glad to restore in their glory. Business Intelligence is a comprehensive discipline. It entails the architecture of the information delivery system, the data management, the delivery processes and its products like reporting, OLAP cubes, monitoring, statistics and analytics…

But he does make a point when he states that massive amounts of reports … amount to frustrated IT providers and frustrated report users. Franks’ plea for relevant reports (p. 182) is not addressing the root cause.

That root cause is, in my humble opinion, that many organisations still use an end-to-end approach in reporting: building point solutions from data source to target BI report. That way, duplicates and missed information opportunities are combined because these organisations lack an architectural vision.

On page 183, Bill Franks makes a somewhat academic comparison between reporting and analysis which raises questions (and eyebrows).

Here’s the table with just one of the many comments I can make per comparison:

Reporting | Analysis | Just one remark (as you are pressed for time)
Provides data | Provides answers | So there are no data in analyses?
Provides what is asked for | Provides what is needed | A report can answer both open and closed questions: deviations from the norm as answers to these questions and trend comparisons of various KPIs leading to new analysis.
Is typically standardised | Is typically customised | OK, but don’t underestimate the number of reports with ten or more prompts: reports or analytics? I don’t care.
Does not involve a person | Involves a person | True for automated scoring in OLTP applications but I prefer some added human intelligence as the ultimate goal of BI is: improved decision making.
Is fairly inflexible | Is extremely flexible | OK Bill, this one’s on me. You’re absolutely right!

The book presents a reflection on what makes a great analytic professional and how to enable analytic innovation. What makes a great analytic professional and a team? In a nutshell it is very simple: the person who has the competence, commitment and creativity to produce new insights. He accepts imperfect base data, is business savvy and connects the analysis to the granularity of the decision. He also knows how to communicate analytic results. So far so good. As for the analytic team discussion, I see a few discussion points, especially the suggested dichotomy between IT and analytics (pp. 245 – 247). It appears that the IT teams want to control and domesticate the creativity of the analytics team, but that is a bit biased. In my experience, analysts who can explain not only what they are doing and how they work but also what the value is for the organisation can create buy-in from IT.

Finally, Franks discusses the analytics culture. And this is again a call to action for innovation and introduction of Big Data Analytics. The author sums up the barriers to innovation, which I assume should be known to his audience.


Although not completely detached from commercial interests (the book is sold for a song, which says something about the intentions of the writer and the sponsors) Bill Franks gives a good C-level explanation of what Big Data is all about. It provides food for thought for executives who want to position the various aspects of Big Data in their organisation. Sure, it follows the AIDA structure of a sales call, but Bill Franks does it with a clear pen, style and elegance.

This book has a reason to exist. Make sure you get a free copy from your SAS Institute or Teradata account manager.