Shubho Ghosal's Blog on BI Strategy

Big Data Analytics shakes up EDW

Having been on a break from consulting since the year started, I decided to catch up on my reading and started with TDWI’s very well summarized article on Big Data Analytics, written by Philip Russom (see link elsewhere on this blog). As the article explains, Big Data Analytics is the coming together of Big Data and Advanced Analytics. Big Data is not just about large volumes, but also about diversity in data types (structured, semi-structured and unstructured) and variety in data refresh speeds – from real-time to delayed, and from traditional transactional data, which presumably occurs at discrete, varied time intervals, to sensor-type data, which is a continuous stream. Advanced Analytics is a revival of everything that was considered too esoteric for most EDW groups to aspire to, from recognizable techniques such as Predictive Analysis, Data Mining, Complex SQL and Statistical Analysis to some less recognizable ones such as Natural Language Processing (used to understand written text) and Artificial Intelligence (not sure how this is used yet). There is also content analysis, or the analysis of video and audio, that seems even more advanced.

All this is to be supported on a diverse set of fairly new platform choices, ranging from Hadoop-based implementations to DW appliances to columnar, analytic, or in-memory databases to in-database analytic functions.

It feels as if under the single banner of Big Data Analytics, all that was exciting albeit challenging to consider and certainly not your everyday topic in DW circles is being revived and made mainstream. Now, a distinction can be made between “old-school” EDW-BI solutions which offer a very structured and fairly predictable model of storage and consumption through well-defined data warehouses, dashboards and cubes, and this new “exploratory” or “discovery-oriented” way of looking at data. So, does the thinking around EDW need to change? And how?

Do we need to throw away our EDWs and move all our data to a brand new platform such as those above? One point that was made in the article is that the data that is fed into such technologies needs to be as raw as possible, and that traditional ETL processing should not be applied. So, the data can clearly bypass the EDW entirely and be fed directly into one of the above-mentioned technologies to achieve results. Is the role of the EDW diminished by this and will it become simply a historical source to this new, all-encompassing world?

Not so soon. First of all, a little data cleansing and staging doesn’t hurt. Picture raw data with mis-spelled customer names. A little massaging can only add to the quality of the analysis. Thus, the EDW can be used to stage an optimally-cleansed data set that is used as input towards the analytics. If your EDW architecture includes an Operational (ODS) layer to it that already houses cleansed data from the transactional systems, that can be used as a source for the analytics as well.
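
To make the mis-spelled customer name example concrete, here is a minimal sketch of the kind of light cleansing I have in mind, using fuzzy matching from Python's standard library. The customer names, the `cleanse_customer_name` helper and the 0.8 similarity cutoff are all made up for illustration; a real staging layer would match against its own master list and tune the threshold.

```python
from difflib import get_close_matches

# Hypothetical master list of customer names (illustrative data only).
CANONICAL_CUSTOMERS = ["Acme Corporation", "Globex Industries", "Initech"]


def cleanse_customer_name(raw_name, canonical=CANONICAL_CUSTOMERS, cutoff=0.8):
    """Return the closest canonical name, or the tidied input if nothing is close enough."""
    # Collapse stray whitespace before comparing.
    tidied = " ".join(raw_name.split())
    # Compare case-insensitively, but return the canonical spelling.
    lookup = {name.lower(): name for name in canonical}
    matches = get_close_matches(tidied.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else tidied


raw_feed = ["Acme Corporaton", "  globex  industries", "Umbrella Ltd"]
cleansed = [cleanse_customer_name(name) for name in raw_feed]
print(cleansed)
```

Names that are not close to anything in the master list pass through untouched, so the raw data is massaged rather than rewritten.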

Beyond that, the traditional database platform that houses the EDW seems a poor choice as the main platform for supporting Big Data Analytics. For one, it does not handle the storage of the diverse data types that seem to be desired very well. Nor will it, with its fixed indexing and partitioning schemes, respond well to the exploratory nature of the analysis that seems to be the heart and soul of Big Data.

It looks like EDW implementations would have to coexist with Big Data Analytics systems, acting as a source of structured data for them. Either Big Data Analytics would have to be supported on a separate platform, or both it and the EDW would have to be moved to the new platform (this works if the implementation is a hybrid platform supporting the range of structured to unstructured data). This makes sense as traditional EDW platforms have been notoriously ill-suited to the exploratory analysis that some users desire. This coexistence of technologies that support what is well-defined and what is fuzzy seems apt. Also, the traditional EDW approach with its proven way of handling the clearly understood analytics through dashboards and cubes does not have to be thrown away.

Another aspect to think about is that Big Data Analytics is not just for Big Organizations, or even just for implementations literally with immense volumes of data. The Advanced Analytics aspect of it can be applied to any situation. For example, text analytics can be used to extract nuggets of information from the wealth of textual data collected in any organization. I was at a recent engagement with a telecom infrastructure services provider where it occurred to me that the descriptions being collected for tower development projects could be mined for reasons why projects were being killed. Unfortunately, such topics are still considered too far-fetched. One good thing that might happen with the recent upsurge of Big Data is to bring Advanced Analytics more to the forefront.

In conclusion, Big Data Analytics seems disruptive to traditional EDW approaches at first, but in the end it appears a symbiotic union.

Written by Shubho Ghosal

January 5, 2012 at 3:17 am

TDWI Big Data Analytics User Perspectives

A TDWI report on Big Data Analytics

Written by Shubho Ghosal

January 5, 2012 at 2:05 am

The Interplay of BI and Social Media

In order to address the role that Social Media plays in BI and vice versa, first let’s look at the identity that social media will take within the organization. I don’t believe it will look anything like Twitter or Facebook, but it will take ideas from it.

Social media within the corporate firewall will be secured, not only from the public, but between groups within the organization. The key focus will be sharing of information across the enterprise, between groups. Currently, information is shared electronically in organizations via email, chat, and collaborative portals such as SharePoint. However, few would argue that any of these avenues currently has the appeal of a Facebook, Twitter, or LinkedIn. So, these avenues have to grow a little and borrow some of the ideas from popular social media channels, or alternatively, we should ditch those altogether and start afresh.

Social Media within an enterprise has to have a focus. For example, I raised the notion of a Decision Tracking System in an earlier post (distinct from the old notion of a Decision Support System, by the way, and simpler). This can be an application of social media within the enterprise. Decisions will be exposed within select groups and secured from others, but visible to all within that group and open to feedback from members within the group. Analyses put forward in support of the decision could also be added to discussion threads (see my earlier post on telling stories about the data). This could be one of the critical applications of social media within the enterprise.

Sharing knowledge across analysts could be another important application. I got this idea from talking to the head of BI of a well known movie rental and streaming service. This ties in with the story-telling notion as well. The idea is to allow analysts to share their analyses with each other and present their own thoughts along with it and get feedback. This would let folks learn from each other and not repeat each other’s mistakes. 

The more global an organization is, the more social media within the enterprise could have an impact. Think about how Facebook or LinkedIn have brought together people from across the globe. I have always felt a thrill at being able to connect with a like-minded thinker in Spain, Poland, Russia or elsewhere. It demonstrated to me how ideas can come from anywhere and that some people essentially think similarly regardless of culture or geography. Now apply this notion to sharing ideas within the global organization’s many locations. My experience is that more people are likely to open up on a social media outlet than in conference calls or face-to-face meetings. And ideas often come in the middle of the night to creative thinkers.

If the social media application had a data or information focus, that is where BI can play a role, as a provider of the information and a basis for decisions made or conclusions derived. Snippets of BI would go into the analysis that supports the discussion thread. Or, the discussion thread itself may drive the need for new BI assets as well, driving the direction that BI should take.

If we had to take BI to the next level and really connect it to decision making, social media within the enterprise could play a big role in that. The need of the hour is to develop these thoughts into product ideas that would merge these concepts and free up the BI assets within the enterprise.

Written by Shubho Ghosal

July 16, 2010 at 8:07 pm

Connecting BI to Decision Making

with one comment

Being able to connect BI to the critical, strategic decisions being made within the organization is BI’s holy grail. How often does this really happen?

BI is well established in the world of course monitoring and correction – the operational dashboard and scorecard take good care of that. While the need and ability to put together a BI system, along with a DW, an MDM system, and other information assets is well understood, and the world is full of people who can do this and are excited and passionate about this, there are precious few people who can testify to how often these large and expensive systems were actually used in strategic and/or critical decision making in organizations.

Let us briefly speculate on the nature of decision making in organizations. While this may differ from place to place, one can safely assume that a germ of an idea (or thought or opinion) occurs in the decision maker’s mind. This may be triggered by a market event, a desire to get to the next level, or to survive in the current economy. This germ of an idea grows in the decision maker’s mind, until it is well formulated, at which point it can be shared with the group of influencers. The influencers bring their own opinions to the table, the idea gets threshed out and bandied about, and with time a decision is made.

Information can play a significant role in various points in this process. Graphs, charts, and numbers can get thrown about. One must have confidence in the numbers (data quality). Everyone must see the same results (integration). A widget should be a widget (MDM). The right visualization should be used (Visualization tools). Powerful stories must be told (see my earlier post about telling stories with your data).

The unfortunate truth is, even if this happens, there is no way to track it. Imagine if we had systems where we were able to track our decisions. Wherever we used a BI artifact, that link is preserved. If a story influenced a decision, that analysis is included. The ability to tell a story is provided – this itself is lacking in BI technology right now.

If we were able to do this, imagine the sense of importance that BI would gain within the organization. The ROI would be clearly recognizable. At the very least, people would recognize it as being a strategic asset.

There are some gaps we would have to fill technology-wise, and process-wise as well. This snippet which I wrote as part of a TDWI discussion I started captures the thought -

1. The ability to track decisions and capture the basis of them – call this a decision tracking system, if you will. By the way, what happened to decision support systems at least in the world of business? By “basis”, I mean the link to the piece of BI that supported it, if it did.

2. The ability to tell stories about your data. These are the powerful analyses that sway decisions. BI is not very good at doing this – it seems to be missing from vendors’ focus.

3. The ability to present scenarios (again, this correlates with one of Ron’s thoughts). We had seen some examples of tools that enable this, but they went away. Scenario building is critical as it can simulate the potential outcome of making a decision.
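
To give the first capability some shape, here is a rough sketch of what the core of a decision tracking system might record. Every name in it (`BIArtifact`, `Decision`, `DecisionLog` and their fields) is hypothetical and invented for this illustration; no vendor product is being described.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BIArtifact:
    """A piece of BI that can be cited: a dashboard, report, scenario, and so on."""
    name: str
    kind: str


@dataclass
class Decision:
    """A decision plus its basis: the story told and the BI artifacts cited."""
    title: str
    owner: str
    story: str = ""
    supporting_artifacts: List[BIArtifact] = field(default_factory=list)

    def cite(self, artifact: BIArtifact) -> None:
        # Preserve the link between the decision and the BI that supported it.
        self.supporting_artifacts.append(artifact)


class DecisionLog:
    """Keeps decisions so the use of BI in decision making can be traced later."""

    def __init__(self) -> None:
        self._decisions: List[Decision] = []

    def record(self, decision: Decision) -> None:
        self._decisions.append(decision)

    def decisions_using(self, artifact_name: str) -> List[Decision]:
        return [
            d for d in self._decisions
            if any(a.name == artifact_name for a in d.supporting_artifacts)
        ]

    def share_backed_by_bi(self) -> float:
        # Fraction of recorded decisions that cite at least one BI artifact.
        if not self._decisions:
            return 0.0
        backed = sum(1 for d in self._decisions if d.supporting_artifacts)
        return backed / len(self._decisions)


log = DecisionLog()
churn_report = BIArtifact(name="Churn by Region", kind="report")

exit_market = Decision(title="Exit the northern market", owner="COO",
                       story="Churn doubled after the price change.")
exit_market.cite(churn_report)
log.record(exit_market)
log.record(Decision(title="Rename the product line", owner="CMO"))

print(log.share_backed_by_bi())
print([d.title for d in log.decisions_using("Churn by Region")])
```

Even something this simple would let an organization answer how many of its decisions actually leaned on BI, which is the ROI argument made above.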

Having these capabilities would bring BI closer to decision making and make it more strategic. Vendors need to recognize this though, and provide the capability for it. Organizations would also have to adopt this within their processes.

More on this later.

Written by Shubho Ghosal

July 9, 2010 at 2:55 am

Does your data tell a story?

with 8 comments

I attended a very thought provoking presentation recently by Jonathan Koomey on turning numbers into knowledge. One aspect of the many things he was proposing really caught my attention – that having a good story is as important as having good data. This opened up a floodgate of ideas in my mind. It reminded me of two examples of very compelling analyses I have seen – both told good stories about their data.

One was by a researcher analyzing the progress of nations across time and making the point that, over the years, some nations have progressed faster in terms of learning. This was attributed to certain factors, and a simple bubble chart that was animated to show movement of the bubbles across the chart with time showed how these nations had moved towards a better position. All along, the researcher made a compelling case around the data, bringing in the factors that were tied to the movement of data across the chart with time. He achieved this by having a narrative around the presentation of the data.

Another was a New York Times analysis on census data presented on the web. The presentation was unique in that the reader could interact with the graph and see how the trends were for a specific gender, age group or ethnicity. However, the reason why it stood out was that a story was drawn alongside the data and the graphical presentation of it – a story that clicked with the data and helped make sense of it, making us draw certain conclusions that the facts supported.

(When discussing this with my colleague at work, I was pointed to yet another example – this one was sketched in 1869! The link is here: http://upload.wikimedia.org/wikipedia/commons/2/29/Minard.png. This describes Napoleon’s army’s progress to Moscow and back, tracking the way the army dwindled due to casualties – and relating this to climatic conditions. A visionary analysis indeed and far ahead of its time).

In all of the above cases, what made the data meaningful was the story. This wasn’t a case of putting a clickable analysis interface in front of us and leaving us wondering what the heck we are supposed to do with it. The same is true for most BI solutions. We are so caught up with the technology that is being offered to us – in-memory analytics, columnar databases, cloud – that we are missing out on this extremely critical aspect of better BI – the need for technologies or simply the discipline to create a meaningful analysis around the data that we are presenting such that it leads to effective decisionmaking. This is completely opposed to the traditional notion in BI that you need to present clickable interfaces that let you analyze (the vendor classic) “which of my regions are responsible for my drop in sales in August”. It is not that simple.

How about presenting the data as a storyline such that the decisionmaker does not have to click anything? How about presenting each RELEVANT click as another piece of the analysis in a storyline, and providing enough context around the graphics such that the point of each piece of data presented is clear? Providing a headline and then leading us to some conclusions about the data?

I’m sure there are great analysts in organizations that are already doing this – but how? Is the current spate of BI technologies really helping them or hindering the process? Is this why Excel is predominant in organizations – because it helps great analysts tell a story? How effective is it in doing that? I once saw a weak attempt at telling such a story in Excel and it involved putting hints along with the data – small comments here and there that help you draw certain conclusions from the mess of numbers thrown at you – a first attempt at least but nonetheless a weak one.

Is “Office Integration” all we could come up with? What about putting serious thought into putting the ability to tell the story into the technology? Giving the great analyst the ability to tell their story right alongside the fancy charts and graphs? And NOT asking the users to click?

My guess is that this would transform BI – and allow decisionmakers to actually benefit from having those expensive DWBI solutions around.

I have decided to dedicate my free time to pursuing research on how the information is best presented to make decisionmaking easier – and how technology can aid it as opposed to hindering it with sheer complexity. If you are reading this and are aware of any such research being conducted now or having been in the past, please contact me at shubhoghosal@yahoo.com.

And the next time you present data – don’t forget to tell a story about it.

Written by Shubho Ghosal

June 26, 2010 at 2:55 am

Sell me the whole car! Not the pieces of it!

with one comment

I wrote an earlier post on requirements for the next generation of DWBI technology directed towards product vendors. This is a follow up on that and an analogy that popped into my mind later about the direction that should be taken by DWBI vendors in general.

Currently, vendors have provided us the components needed to build a car (DWBI solution), but the expectation is that we build it ourselves.

Some have provided us an engine (database platform) – and some have given us a vastly improved and powerful one (columnar dbs and dw appliances). Others have given us a dashboard for the vehicle (BI), and an entertainment system (in-memory analytics). Others wheels (ETL).

Now, every organization just needs to assemble the team of automotive engineers and designers to put it all together. Simple, right? Not.

Why not sell the whole car? There are at least six major vendors that are currently producing all the components required. There are others (open source vendors) that can easily collaborate. How difficult is it to take the next step to start producing cars that we can actually start driving in, that are actually good and can be customized easily?

Easier said than done. But critically important. This will change the ballgame, in my opinion.

Written by Shubho Ghosal

December 12, 2009 at 5:43 pm

Requirements for the next generation of DWBI technology directed towards product vendors

with 4 comments

This post was prompted by a TDWI LinkedIn discussion on whether 20th century architectures are applicable to the 21st century. I have always felt that new technologies tend to be technology driven and not entirely business problem driven. Such new technologies come in waves. The current waves are around columnar databases and in-memory analytics. However, they address niche problems and don’t look at the entire problem holistically. For example, columnar database vendors are entirely focused on the performance aspect of ad-hoc queries. Vendors are so focused on that performance aspect that they have provided only scripting interfaces for actually designing and implementing the solution. Thus, there are no interfaces or easy means to design performance management systems. In fact, such vendors are promoting an ELT approach, which sounds great but doesn’t make much sense in practice. So, while providing us a service, they are only doing so in a narrow way, and there is still a lot of work to be done in terms of putting a solution together. Likewise, in-memory analytics are focused on one piece of the problem, which is sub-second analytics for power users. There are many other consumer roles in the enterprise that need to be addressed, and such solutions don’t put much effort into the design aspect either, or focus on how the in-memory solution would integrate with other required technologies. While both technologies are great advances with respect to what we had in the 20th century, they don’t look at the entire problem we DWBI practitioners face – making DWBI pervasive in the organization.

Piecemeal technologies need a lot of effort in putting them together into an entire performance management solution. This continues to be a barrier to more widespread adoption of DWBI in organizations and to an improvement of its reputation. This thought prompted me to write down a set of requirements for the next generation of DWBI technology. My hope is that a vendor will step up and tackle this holistically, and not address it in a piecemeal manner. Along with each requirement, a list of features for a better product has been requested of product vendors.

Here’s to a better future for DWBI!

Ease of Implementation End to End (DWBI In the Box)

Ease of implementation can make a huge difference to data warehousing in the next generation. DWBI efforts tend to be massive and complicated. One of the reasons is the difficulty and complexity of such engagements. You need a DW Architect, a BI Architect, a team of ETL developers and BI developers. Partly, this is because of the inherent complexity in ETL and BI coding and DW architecture on relational databases. While recent advances in database technologies have simplified the problem somewhat by eliminating the need for summary tables, the lack of focus on the usability and design aspects of performance management solutions as a whole will mean that no relief is offered towards easing the complexity of DWBI solutions. They will continue to be complex, costly, and change-unfriendly.

Functionality Requested from Product Vendors -

  • End to End DWBI in the box – Put everything together in one box such that we don’t have to keep putting together solutions with many different components. A DWBI appliance, if you will. This includes data structures (logical data models), ETL, and BI in all its forms – dashboards, static reports, interactive applications - and OLAP, but with a business focus rather than a strict dimensional focus. See my point on supporting business analytic constructs below.
  • Easier ETL/BI development – Higher level ETL/BI interfaces that are more business rules driven. The ability to take some low level building blocks that capture common transformations and put them together into more complex transformations that can be shared across applications. The standardization of some complex transformations.
  • Focus on Usability in Design – High level design interfaces that allow us to build performance management systems with all their layers – metrics and hierarchies – bottom up or top down. The ability to connect to source system data structures easily.
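
The "low level building blocks" idea in the ETL bullet above can be sketched in a few lines. This is only an illustration under my own assumptions: the block names, the `compose` helper and the sample row are hypothetical, and a real product would expose this through a design interface rather than code.

```python
from functools import reduce


def compose(*steps):
    """Chain transformation blocks, applied left to right, into one reusable block."""
    def pipeline(record):
        return reduce(lambda acc, step: step(acc), steps, record)
    return pipeline


# Low-level building blocks: each takes a record (dict) and returns a new one.
def trim_strings(record):
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def uppercase_country(record):
    return {**record, "country": record["country"].upper()}


def add_net_revenue(record):
    return {**record, "net_revenue": record["gross_revenue"] - record["discount"]}


# A standardized transformation that several applications could share...
standardize_customer = compose(trim_strings, uppercase_country)
# ...and a more complex one built on top of it.
load_sales_fact = compose(standardize_customer, add_net_revenue)

row = {"customer": "  Acme ", "country": " us", "gross_revenue": 100, "discount": 15}
result = load_sales_fact(row)
print(result)
```

The point is that `standardize_customer` is defined once and reused, instead of being re-coded in every ETL job that touches customer data.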

Support of Objective Oriented (Requirements Driven) Design

We need to be able to demonstrate that a performance management system’s design is closely linked to requirements – objectives, areas of analysis, business questions, or other types of requirements.

Functionality Requested from Product Vendors -

  • Provide the ability to capture requirements as part of the box, and the ability to link the requirements to the design of the performance management system. This will help demonstrate to the business users the clear links between requirements and design.

Adaptability to Evolving Business Needs

Even after an organization has put in the huge initial investment into its DWBI solution, it needs to continue investing in adapting it to changing business needs. Not only is this a huge cost issue, but a huge Time to Deployment issue as well. One major reason is the disintegrated nature of a piecemeal solution. The same metric, dimensional attribute or hierarchy exists in a number of different systems in your entire DWBI environment. Most of the time, the number of points of existence aren’t even tracked or clearly understood. The item exists in the data model, in the ETL layer and in the BI layer. When such a change is proposed, usually it takes the implementation team upfront analysis time to even track the multiple points of change. The effort then needs to be scoped and implemented. A simple change can not only have significant cost, but can be very time consuming as well, because of the ripple effect of the change across the multiple systems it is implemented in. This can be hugely frustrating to the organization, which expects a quick turnaround to evolving analytic needs. Another big reason is the complexity of ETL and BI development. This has already been addressed in the Ease of Implementation End to End point above.

Functionality Requested from Product Vendors -

  • One point of definition of a metric or attribute – By putting End to End DWBI in the box, you are already a big step closer to addressing this. Now, ensure that a metric or attribute is defined in one place only. This means that any change will automatically ripple across the entire system. Of course, you need to ensure that development and production systems are handled as separate systems, and that adequate notification can be provided for a change. Those are also required.
  • Changes trickle across the End to End solution
  • Versions of the system are tracked such that an older version can be reverted to.
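
A toy sketch of what the three bullets above amount to: one registry holds each metric definition, every layer evaluates through it, and earlier versions can be restored. `MetricRegistry` and the "margin" formulas are hypothetical, made up for illustration.

```python
class MetricRegistry:
    """Single point of definition for metrics, with version history."""

    def __init__(self):
        # Metric name -> list of definitions, oldest first.
        self._history = {}

    def define(self, name, formula):
        # A new definition becomes current; earlier ones are kept for revert.
        self._history.setdefault(name, []).append(formula)

    def revert(self, name):
        versions = self._history[name]
        if len(versions) < 2:
            raise ValueError(f"no earlier version of {name!r} to revert to")
        versions.pop()

    def evaluate(self, name, row):
        # ETL and BI layers both call this, so they cannot drift apart.
        return self._history[name][-1](row)


registry = MetricRegistry()
registry.define("margin", lambda row: row["revenue"] - row["cost"])

order = {"revenue": 100, "cost": 60, "freight": 5}
etl_value = registry.evaluate("margin", order)      # as loaded by ETL
report_value = registry.evaluate("margin", order)   # as shown in a report

# The business changes the definition once; every consumer sees it.
registry.define("margin", lambda row: row["revenue"] - row["cost"] - row["freight"])
changed_value = registry.evaluate("margin", order)

# The change turns out to be wrong, so the older version is restored.
registry.revert("margin")
reverted_value = registry.evaluate("margin", order)
print(etl_value, report_value, changed_value, reverted_value)
```

Contrast this with the situation described earlier, where the same formula lives separately in the data model, the ETL layer and the BI layer and has to be hunted down in each.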

Support of Business Analytic Constructs rather than Dimensional Constructs

Strictly Dimensional constructs do a poor job of representing business analytic entities – metrics, and the context around them. The item “Products” can represent not only something that is sold and earns revenue (an item in the level of a hierarchy of a Products & Services dimension, linked to the Revenue metric) but also an Organizational Unit (an item in the Organization hierarchy – the Products division of the Organization, linked to the Expense metric). Dimensions “merge” in business analytics to become such “things” that can go across multiple hierarchies – the Products & Services hierarchy and the Organizational hierarchy, in the above example. Unfortunately, BI products still retain a strictly dimensional flavor that makes it harder to fit them to such “things”. Also, they separate the “metadata” from the “data”. This makes it much harder to create these “things” (e.g., “Products” that go across dimensions named “Products & Services” and “Organization”) in a design – if that makes any sense.

Furthermore, not many BI products allow you to create a system of metrics that typically occur in a performance management system. Metrics are not only Revenue and Cost. They can be “Product Revenue” and “Services Revenue”. They can even be “Q1 Revenue” or “Last Quarter Revenue”. Some of this is supported in some niche environments, but not holistically.

Functionality Requested from Product Vendors -

  • Support for business analytic constructs rather than strictly dimensional constructs. This means supporting the design of systems of metrics that “merge” with the context, support for the context itself (the “data” as opposed to the “metadata”) in design, and support for contexts (e.g., “Products”) that can “belong to” multiple hierarchies and are seamlessly shared across them.
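
One way to picture a "system of metrics" is to treat “Product Revenue” or “Q1 Revenue” as the base Revenue metric merged with a piece of context, rather than as separate measures. The sketch below does only that; the sample facts, the `metric` helper and the `hierarchy_membership` mapping are assumptions invented for this example.

```python
# Hypothetical fact rows (illustrative numbers only).
facts = [
    {"offering": "Products", "quarter": "Q1", "revenue": 120},
    {"offering": "Services", "quarter": "Q1", "revenue": 80},
    {"offering": "Products", "quarter": "Q2", "revenue": 150},
]


def metric(measure, **context):
    """Build a metric: a base measure merged with the context that qualifies it."""
    def evaluate(rows):
        return sum(
            row[measure]
            for row in rows
            if all(row.get(key) == value for key, value in context.items())
        )
    return evaluate


revenue = metric("revenue")
product_revenue = metric("revenue", offering="Products")
q1_revenue = metric("revenue", quarter="Q1")

# One context item shared across more than one hierarchy.
hierarchy_membership = {"Products": ["Products & Services", "Organization"]}

print(revenue(facts), product_revenue(facts), q1_revenue(facts))
```

Here the context value “Products” is ordinary data that the design can refer to directly, which is the part strictly dimensional tools make awkward.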

Performance Performance Performance

Yes, performance, scalability, sub-second response, reliability, and all those good things are still important, but pay attention to the whole problem, not just a niche area.

Bottomline

DWBI continues to be a costly, time-consuming, ungainly beast that is resistant to change. Hearing out these ideas will make a big difference to the community and the profession in the long run. Hope someone is listening.

Written by Shubho Ghosal

December 9, 2009 at 10:38 pm

Mo’ Better Faster Cheaper Trustier Information

with one comment

Or, The Importance of an Information Strategy

The importance of having a clearly formulated, enterprise-wide and long-term Information Strategy, as part of the company’s overall strategy, has grown like never before. Information is critical to making the right decisions. This critical information includes what is going on right now (Tactical or Operational View), what happened previously (Historical View), and what is likely to happen in the near and distant future (Strategic View). Not having the right information when making critical decisions must be gut-wrenching. You don’t know if it is the right thing to do. Having poor information is as good as not having any information at all.

This is true for organizations at any stage of the growth cycle. Even an Early Stage organization can benefit from an Information Strategy. Typically, at that stage, one isn’t worried about having to consume large amounts of data. The list of customers may be small and the product list manageable and one always has a clear picture of what is going on with the business currently. All reporting and analytics can be achieved in a small database or spreadsheet. However, thinking in terms of a clearly defined Information Strategy will still help, because the exercise itself will help the business understand the right performance metrics to use to measure itself, and the need to have a clearly measurable strategy.

An Information Strategy becomes more critical at a stage when the business has grown from being a mom-and-pop shop with a limited set of customers and products it intimately understands, to a medium-sized organization having a multitude of customers and products that one can no longer hold in one’s head or on a single spreadsheet. Not looking at those customers or products holistically can result in critical business opportunities being lost (cross-selling, up-selling, the right new product introduction to the right market, getting the best rates from the right vendors – decisions based on a better understanding of customers, competitors, products, and vendors); lost opportunities that a competitor can take advantage of to gain market share. The flow of data can be so quick that not being able to translate it into effective information results in a fragmented view of customers, products, and vendors. (Customers, products and vendors are being used as examples only. The same concept applies equally to other types of businesses or organizations.)

Not having an Information Strategy will probably result in misinformation being spread across the enterprise. This may mean the right metrics not being used. It may mean poor quality information being used. Different organizational groups may be seeing different versions of the same numbers. They may be pursuing narrow departmental agendas, perhaps using metrics that only show a narrow aspect of their performance in a better light. A balanced, enterprise-wide view is probably not available to any portion of the organization. This is primarily because no one knows what the right information is that each department and individual should be measuring against. No one knows what the strategy is.

Just like a business without a strategy doesn’t know where it is going, a business without an Information Strategy doesn’t know if it is going in the direction that it chose, or whether it should change its direction because it is clearly the wrong direction to take. Only the right information will allow a business with a well-defined strategy to understand how well it is executing on its strategy, and whether the strategy itself is delivering the right results, before it is too late. Too late means lost customers, lost revenue or lost profits. It may even mean the collapse of the business itself.

To be successful, an Information Strategy has to be part of the overall business strategy – one of its components. It should be owned and driven by the business. Too often, the delivery of information becomes part of an organization’s DWBI initiative, with IT driving a mainly technology oriented requirements gathering program. The result is often a low-level (detailed) data warehouse that the business does not understand, with a BI tool layered on top that the business is forced to adopt, even though it may not even remotely meet their needs. Consequently, the business falls back upon the old, reliable, but ultimately crippling system of spreadmarts – silos of information sitting on individual users’ spreadsheets, based on detailed data from unreliable, inconsistent sources, with hours being spent on transforming the data into information, with the same data cleansing processes being applied over and over again by highly paid analysts who get precious little time to analyze the information before decisions need to be made upon it. The content, quality, timeliness, cost and, ultimately, the most critical aspect of it all, Trustworthiness, of information being used in the organization, suffer.

On the other hand, the information in an organization that has a clearly formulated Information Strategy is a lot Mo’ Better (Higher Quality), Cheaper, Faster (Timely), and Trustier (Reliable, Decisive, Conclusive). The result is an organization that clearly knows where it is going, knows that it is the right direction to take, and can make the right decisions based on the most reliable information available, without fear. An Information Strategy will result in a Decisive Organization.

Future posts will explore the elements of an Information Strategy, the process of gathering requirements to define one, and the technology platforms available to implement it. Feedback welcome.

** This is one of the selected posts that will also appear on dwbistrategy.blogspot.com. Please add yourself as a follower to the dwbistrategy site if you wish to follow my posts. Thank you for reading.**

Written by Shubho Ghosal

October 24, 2009 at 9:31 pm

House on Stilts

with 2 comments

A BI solution without an underlying DW is like a house on stilts. Whether it will survive the high waves of analytical queries is anybody’s guess. Some vendors have convinced their clients to layer their BI technology directly on operational source systems. This puts consultants like me in a difficult position – telling clients that have already bought into that vision that they are heading for disaster. The BI tool can do it all – so why do we need to over-engineer this again?

Not putting a well-designed DW in the DWBI architecture is a huge and costly mistake that will result in failure to meet the performance criteria expected of BI. Operational source systems are not designed to handle the load of analytical queries. They have a spider web of tiny little tables that are good at handling single operations related to processes, and at storing an attribute, such as Customer Name or Address, in one place only, so that any updates are made to that one point of truth.

To support analysis, you need special data structures called Facts, with Dimensions around them. Take the Opportunity Lifecycle of a Sales process, for example. The process starts with a Lead, which goes through many phases, eventually ending in an Order if successful. An easy way to track that is to have a Fact with each Phase as a Date attribute. You can calculate Days to Close by simply subtracting one Date from another. You can also track the probability of achieving every phase, alongside the Date. You can now easily create averages across Opportunity Type, Product Line, etc. in a BI tool. Imagine doing this against a database that has each Phase stored in a separate table. You will have to stitch across each table, opportunity by opportunity, and then average across. You will have to virtually construct the “Fact”, using standard SQL. The performance will obviously be sub-par.
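To make the Fact idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name opportunity_fact, its columns, and the sample rows are all made up for illustration; a real Fact would carry more Phases and keys to its Dimensions. The point is only that once every Phase is a Date column on one row, Days to Close is a single subtraction and the average by Opportunity Type is a single GROUP BY.

```python
import sqlite3

# Hypothetical Fact table: one row per opportunity, one Date column per Phase.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE opportunity_fact (
        opportunity_id   INTEGER PRIMARY KEY,
        opportunity_type TEXT,
        lead_date        TEXT,  -- ISO-8601 dates stored as text
        proposal_date    TEXT,
        order_date       TEXT   -- NULL while the opportunity is still open
    )
    """
)

# Made-up sample data, purely for illustration.
rows = [
    (1, "New Business", "2009-01-01", "2009-01-11", "2009-01-31"),
    (2, "New Business", "2009-02-01", "2009-02-06", "2009-02-11"),
    (3, "Renewal", "2009-03-01", "2009-03-03", "2009-03-06"),
    (4, "Renewal", "2009-03-10", "2009-03-12", None),
]
conn.executemany("INSERT INTO opportunity_fact VALUES (?, ?, ?, ?, ?)", rows)


def avg_days_to_close(connection):
    """Average Days to Close per Opportunity Type, closed opportunities only."""
    query = """
        SELECT opportunity_type,
               AVG(julianday(order_date) - julianday(lead_date)) AS avg_days
        FROM opportunity_fact
        WHERE order_date IS NOT NULL
        GROUP BY opportunity_type
        ORDER BY opportunity_type
    """
    return {otype: days for otype, days in connection.execute(query)}


result = avg_days_to_close(conn)
for otype, days in result.items():
    print(f"{otype}: {days:.1f} days to close on average")
```

Against an operational schema with each Phase in its own table, the same question would need one join per Phase before any averaging could start, which is the stitching described above.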

Just because a BI tool can interact with any database does not mean you can put it on an operational data source and expect it to perform well with highly summarized queries that have to stitch together bits and pieces of information from across the database. For a well-performing BI solution, you need a strong foundation. That is the role the DW plays. Don’t get swayed by the idea of doing away with DW. Get real. Technology isn’t there yet.

Written by Shubho Ghosal

September 26, 2009 at 3:08 am

Process vs. Technology

with 4 comments

No amount of technology will address the results of a poorly established or loosely followed process. Accurate analysis relies on clean data. Well-defined processes result in clean data. A loosely followed process can result in critical missing data elements, which then require manual intervention to clean up for analysis, wasting valuable time and resources.

Take, for example, the order entry process for a water-cooler manufacturing company that sells water coolers directly to businesses, and installs the product at the sites where the business is located. An invoice is given to the business at the time of sale. The invoice is generated from an Order Entry System with the minimal information required (Customer Name, Product Name, Sale Price, Discount, etc.) but information on the Sales Person(s) associated with the sale is not always entered, as it is not “required” data.

Later, when commissions are calculated for the Sales Organization, the gaps in the data surface. The person responsible for figuring out the Sales Commissions ends up running from pillar to post trying to figure out whom to assign the missing commissions to, perhaps even manually manipulating incoming files for the Commissions report after making the right phone calls, to get it there on time.

Such situations do happen, and much as one would like to throw technology at them, technology is no substitute for tightening up the process that resulted in the situation in the first place, starting by assigning ownership for the process. I ran into a similar situation recently, and the importance of establishing strong processes and strong ownership resonated with me. I am normally not a process-centric thinker, but for once, I appreciated the need for a strong process, as the lack of one resulted in data quality issues with a DWBI solution which technology alone just wouldn’t resolve.
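Technology can still help the process owner see the gap early, even if it cannot close it. The sketch below is a hypothetical check, not taken from any real Order Entry System: the Invoice fields and the function name missing_sales_person are assumptions made for illustration. It simply lists invoices with no Sales Person recorded, so that someone who owns the process can chase them at the time of sale rather than at commission time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Invoice:
    """Hypothetical invoice record; field names are illustrative only."""
    invoice_id: str
    customer_name: str
    product_name: str
    sale_price: float
    sales_person: Optional[str] = None  # the "not required" field from the example


def missing_sales_person(invoices: List[Invoice]) -> List[str]:
    """Return the IDs of invoices whose Sales Person is absent or blank."""
    return [
        inv.invoice_id
        for inv in invoices
        if not (inv.sales_person or "").strip()
    ]


# Made-up sample data.
invoices = [
    Invoice("INV-001", "Acme Corp", "Cooler 5G", 450.0, "J. Smith"),
    Invoice("INV-002", "Globex", "Cooler 5G", 450.0),
    Invoice("INV-003", "Initech", "Cooler 3G", 300.0, "   "),
]

flagged = missing_sales_person(invoices)
print("Invoices needing a Sales Person:", flagged)
```

A report like this only has value if someone is accountable for acting on it, which is the ownership point made above.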

Written by Shubho Ghosal

September 25, 2009 at 3:38 am
