It has been a while since I published
anything on this blog. But after having been confronted with organisations that,
from an analytics point of view, live in the pre-industrial era, I need to get
a few things off my chest.
In these organisations (and they aren’t the
smallest ones) ends and means are mixed
up, and ends are positioned as the beginning of Business Intelligence. Let me
explain the situation.
Ends are the beginning
A metaphor for a critical look at reporting
requirements: watching heavy drift ice
and wondering whether it comes
from a land-based glacier or from an iceberg...
Business users formulate their requirements
in terms of reports. That’s OK, as long as someone, an analyst, an architect or
even a data modeller, understands that this is not the end of the matter. On the
contrary.
Yet too many information silos have been
created because this rule was ignored. An organisation that treats report
requirements as the start of a BI project skips the questions and the steps
needed to produce a meaningful analytics landscape that can stand the test of
time. The consequences:
- New information silos emerge with an
end-to-end infrastructure to answer a few specific business questions leaving
opportunities for a richer information centre unexplored.
- The cost per report becomes prohibitive.
Unless you think €60,000 to create one (1) report is a trifle…
- Since the same data elements run the risk
of being used in various database schemas, the extract and load processes pay
a daily price in terms of performance and processing cost.
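To make that last point tangible, here is a minimal sketch that counts how often the same source element is extracted when every report gets its own schema. The silo names and source elements are hypothetical, invented purely for illustration.

```python
from collections import Counter

# Hypothetical report-driven silos and the source elements each one extracts.
silos = {
    "sales_report_mart": {"customer.id", "customer.region", "order.amount"},
    "margin_report_mart": {"customer.id", "order.amount", "product.cost"},
    "churn_report_mart": {"customer.id", "customer.region"},
}

def redundant_extractions(silos):
    """Return each source element that is loaded more than once, with its load count."""
    counts = Counter(element for elements in silos.values() for element in elements)
    return {element: n for element, n in counts.items() if n > 1}

duplicates = redundant_extractions(silos)
# Every load beyond the first is work a shared model would not need to repeat.
extra_loads = sum(n - 1 for n in duplicates.values())
print(duplicates, extra_loads)
```

In this toy setup three small silos already cause four redundant loads per run; the real price depends on volumes and load frequency, which the sketch does not model.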
Ends and means are mixed up
A report is the result of an analytical
process, combining data for activities like variance analysis, trend analysis,
optimisation exercises, etc. As such it is a means to support decision making.
So rather than accepting the report requirements at face value, some reverse
engineering is advised:
What are the decisions to be made at the various managerial levels, based on
these report requirements?
You may wonder why this obvious question needs to be asked, but be advised:
some reports are the equivalent of a news report. The requestor might just want
to know what happens without ever drawing any conclusions, let alone linking
any consequences to the data presented.
What are the control points needed by the controller to verify aspects of the
operations and their link to financial results?
Asking this question almost always leads to
extending the scope of the requirements. Controllers like to match data from
various sources to make sure the financial reports reflect the actual
situation.
What are the future options, potential requirements and/or possibilities when
the required reports are enhanced with the data available in the sources?
This exercise is needed to discover
analytical opportunities which may not be taken at the moment for a number of
reasons, like insufficient historical data or a lack of the analytical skills
needed to come up with meaningful results… But that must not stop the design
from taking the data in scope from the start. Adding the data at a later stage
will come at a far greater cost than the cost of the scope extension.
What is the basic information infrastructure to facilitate the above? I.e. what
is the target model?
A star schema is the ideal communication platform between business and tech people.
Whatever modelling language you use, whatever
technology you use (virtualisation, in-memory analytics, appliances, etc…), in
the end the front end tool will build a star schema. So take the time to build
a logical star schema data model that
can be understood by both technical people and business managers.
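For readers who have never seen one, a minimal sketch of a star schema follows: one fact table surrounded by two dimensions, built in an in-memory SQLite database. All table names, column names and rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table (measures) referencing two dimension tables (context).
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_date TEXT, month TEXT);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    amount REAL
);
""")

conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [
    (20240115, "2024-01-15", "2024-01"),
    (20240220, "2024-02-20", "2024-02"),
])
conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", [
    (1, "Customer A", "North"),
    (2, "Customer B", "South"),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", [
    (20240115, 1, 100.0),
    (20240115, 2, 50.0),
    (20240220, 1, 25.0),
])

# A typical business question: sales per region per month.
sales_by_region_month = conn.execute("""
    SELECT c.region, d.month, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    JOIN dim_customer c ON c.customer_key = f.customer_key
    GROUP BY c.region, d.month
    ORDER BY c.region, d.month
""").fetchall()
print(sales_by_region_month)
```

The point of the sketch is readability: a business manager can follow "sales by customer by date" without knowing SQL, while the technical team sees exactly which keys and measures are implied.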
What is the latency and the history needed per decision making horizon?
The latency question deals with a multitude
of aspects and can take you to places you weren’t expecting when you were briefed
about report requirements: (near) real-time analytics, in-database analytics,
triple store extensions to the data warehouse, complex event processing mixing
textual information with numerical measures… As a project manager I’d advise
you to handle this question with care, as the scope may become unmanageable.
But as an analyst I’d advise you to be aware of the potentially new horizons to
explore.
The history question is more
straightforward and deals with the scope of the initial load. The slower the
business cycle, the more history you need to load to come up with useful data
sets for time series analysis.
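The arithmetic behind this rule of thumb can be sketched in a few lines. The function name and the choice of three full cycles as a minimum are assumptions made for illustration, not a standard.

```python
def months_of_history(cycle_length_months, min_cycles):
    """Months of history to load so that at least `min_cycles` full business cycles are covered."""
    return cycle_length_months * min_cycles

# Assumption: we want at least three complete cycles in the initial load.
MIN_CYCLES = 3

monthly_cycle = months_of_history(cycle_length_months=1, min_cycles=MIN_CYCLES)
annual_cycle = months_of_history(cycle_length_months=12, min_cycles=MIN_CYCLES)
print(monthly_cycle, annual_cycle)
```

Under this assumption a business with a monthly cycle needs a quarter of history, whereas one with an annual cycle needs three years before a time series shows even three repetitions of the pattern.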
What data do we present via which interface to support these various decision types?
This question deserves a separate article but
for now, a few examples should make things clear.
- Static reports for external stakeholders
who require information for legal purposes,
- Reports using prompts and filters for team
leaders who need to explore the data within predetermined boundaries,
- OLAP cubes for managers who want to explore
the data in detail and get new insights,
- A dashboard for C-level executives who
want the right cockpit information to run the business,
- Data exploration results from data mining
efforts to produce valid, new and potentially useful insights in running the
business.
If all these questions are answered
adequately, we can start the data requirements collection as well as the
source-to-target mappings.
Three causes, hard to eradicate
If your organisation shows one or more of
these three causes, you have a massive change management challenge ahead that
will take more than a few project initiation documents to remedy. If you don’t
get full support from top management, you’d better choose between accepting the
situation and becoming an Analytics Sisyphus, or looking for another job.
Project based funding
Government agencies may use the excuse that
there is no other way but moving from tender to tender, but the French proverb “les
excuses sont faites pour s’en servir” [1] applies. A solid data and information
architecture, linked to the required capabilities and serving the strategic
objectives of a government agency, can provide direction to these various
projects.
A top performing European retailer had a
data warehouse with 1,500 tables, including eight (8!) different time
dimensions. The reason? Simple: every BU manager had sovereign rule over his
information budget and “did it his way” to quote Frank Sinatra.
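The alternative to eight time dimensions is one shared, conformed one that every business unit reuses. A minimal sketch, assuming a daily grain and a handful of hypothetical attribute names, could look like this:

```python
from datetime import date, timedelta

def build_date_dimension(start, end):
    """Build one row per calendar day between start and end (inclusive)."""
    rows = []
    day = start
    while day <= end:
        iso_year, iso_week, _ = day.isocalendar()
        rows.append({
            "date_key": int(day.strftime("%Y%m%d")),  # e.g. 20240101
            "calendar_date": day.isoformat(),
            "year": day.year,
            "quarter": (day.month - 1) // 3 + 1,
            "month": day.month,
            "iso_year": iso_year,  # ISO week-numbering year, may differ from calendar year
            "iso_week": iso_week,
        })
        day += timedelta(days=1)
    return rows

# One dimension, generated once, shared by every data mart.
shared_dim = build_date_dimension(date(2024, 1, 1), date(2024, 12, 31))
print(len(shared_dim), shared_dim[0])
```

Business units that need a fiscal calendar or a retail week would add attributes to this single dimension rather than build their own, so that "week" and "quarter" mean the same thing in every report.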
Hierarchical organisations
I already mentioned the study of Prof. Karin
Moser introducing three preconditions for knowledge co-operation: reciprocity,
a long-term perspective for the employees and the organisation, and breaking the
hierarchical barriers. [2]
On the same pages I quote the authors Leliveld
& Vink and Davos & Newstrom who support the idea that knowledge
exchange based on reciprocity can only take place in organisational forms that
present the whole picture to their employees and that keep the distance between
co-workers and the company’s vision, objectives, customers etc. as small as
possible.
Hierarchical organisations are more about
power plays and job protection than knowledge sharing, so to them the idea of
having one shared data platform from which everyone in the organisation can
extract their own analyses and insights is an absolute horror scenario.
Process based support
Less visible but just as impactful: if IT
systems are designed primarily for process support without also attending
to the other side of the coin, i.e. decision support, then you have a serious
structural problem. Unlocking value from the data may be a lengthy and costly
process. Maybe you will find some inspiration in a previous article on this
blog: Design from the Data.
In short: processes are variable and need
to be flexible; what lasts is the data. Information objects like a customer, an
invoice, an order, a shipment, a region etc… are far more persistent than the
processes that create or consume instances of these objects.
[1] Excuses are made to be used
[2] Business Analysis for Business Intelligence, pp. 35-38, CRC Books, a Taylor & Francis Company, October 2012