Monday, November 17

Worst practices, who's got them


Learning from the mistakes of others is the way to go. No need to blaze a new path. It's all been done before. And BI has years of this "experience"; DW has "failures" from decades prior too.

This is backed up by Madan Sheina's article:
  • 87% of BI projects in the UK don't live up to expectations.
  • 25% of those projects are going over budget.
  • Only 50% of end users were satisfied with the BI system.
Maybe you're planning to implement BI in your organization. Perhaps you have a BI system but you're not getting the benefits you expect. It's an unruly world in the BI space. Data is a major problem. Tools are another. It's costly and takes lots of your time and effort.

So what are you going to do about it?

You could read Peter Graham's posts about Kevin Quinn's "Worst Practices" whitepaper (saving you from signing up your email address).

Or read the following, which incorporates Kevin's ideas of worst practices but stays away from the marketing of a toolset vendor.

Let's start with the customers, or more specifically the business user. After all this is who BI is meant for. I figure adoption of BI tools is really low. I don't have a specific statistic but I find them difficult to use. People don't have the time to learn a new tool AND don't have the time to understand how to interpret the data.

Apply Tom's Spandex Rule: Just because you can wear spandex shorts, doesn't mean you should! (you know who you are)

Match what the people are asking for and rein in your project team. They can do it all given enough time. A zillion reports. Lots of dimensions, filters, and ranges. However, your BI system should provide what is being asked for. Stay away from the "it would be cool if..." until several years later when people want more.

Ah Excel.

Should you force people away from Excel to use this new BI system? Maybe. Maybe not. Some say Excel is preferred because of its familiarity and simplicity. Well, I think the majority of people who have used Excel don't know how to create a pivot table or why they should use one.

So in my opinion, Excel is used because it is quick! No involvement from IT. They can do it themselves until it works for them. Typically they are looking to solve a business problem; they just need the evidence.

BI just made their lives a whole lot more complicated. Now they need to write a request for what they want. Then when they get it, it's in this strange BI tool. At which point they copy it into Excel and continue on. Granted they probably have more data than before and cleaner data for sure.

Which brings us to the data warehouse. Yes, a storage area for internal data. What about data external to the organization? Many would caution against the use of external sources because of inconsistencies, not having control of the quality, difficulties aligning with the data warehouse, etc.

Let me ask you, "who is the biggest data provider in the world?" Google. What do they do differently? They allow the consumer to determine whether the data is relevant and give them access to everything. They've empowered people with information access.

We need to let it go. The days of the data warehouse or BI team or IT department controlling all the data should come to an end. The business should get access to information they want and need, internal or external, in a BI tool or Excel and when they want/need it.

For instance, Chevron has 200,000 employees worldwide and uses MSFT BI to deliver specific, targeted information to many of them. Starbucks uses MicroStrategy to publish in-store metrics to every store manager. Boeing uses MSFT BI to provide manufacturing performance metrics for managers.

These are all extremely large organizations and yet they aren't pushing out tons of cubes, reports, and dashboards. BI is not a central focus for their employees but it does give them targeted information specific to their needs.

Remember Tom's Spandex Rule. Just because they could, doesn't mean they did.

Friday, November 7

10 Questions with Mark Windrim

The "data warehouse for anyone who needs it" has gone open source.  I previously talked with Miriam Tuerk, Infobright’s CEO, when the company launched.  Recently I spoke with Mark Windrim, VP of Community for Infobright, who is at the helm promoting their open source data warehouse, Infobright Community Edition (ICE).  Mark, as I discovered, has a history of spearheading initiatives from their infancy.

In the 90s, Mark started a community called MAGIC while working at Apple Computer. MAGIC grew quickly into one of the largest ISPs serving the Toronto, Canada region. At that time MAGIC was the largest Macintosh user group in the world. He left to join Newstar Technologies in 1996 (where he met Miriam). Newstar evolved and became BCE Emergis.

Now his focus is set squarely on building an open source community for Infobright. It won't be easy. Starting a community in an established company with an existing product (a paid product at that) can be even more challenging.

Question: So Mark, let's get right to it. Why did Infobright decide to go open source?

Answer: As you may know, Infobright built its technology underneath MySQL so we have always had open source roots. Open source has always been part of our plan. In terms of timing, we wanted to prove out our technology with early customers before releasing an open source version. We wanted to make sure it was highly scalable, easy to download and implement, and robust - not an immature project. We see a large untapped opportunity in the market to bring an ultra low-cost, easy-to-use data warehouse to many companies who don’t have the big budgets and resources needed to implement traditional solutions. Open source is the best way to enable them to try it.

Question: What does your community ecosystem look like today?

Answer: We’ve got our main infobright.org site that contains information about ICE. It supports a Wiki and forums where users can mingle and get support directly from Infobright and other community members. We have a great relationship with MySQL/Sun and continue to work closely with them in supporting the open source data warehousing community with ICE. Additionally, we are building partnerships with many organizations. Today, those partners include Pentaho, Jaspersoft and Talend, and we work with their communities in addition to our own.

Infobright also started to build out our integration communities through relationships with OpenBI in Chicago and Lincube in Stockholm, Sweden. Many more are to come. Both OpenBI and Lincube have extensive experience in implementing BI Solutions with open source offerings.

Question: What is the planned direction for your new open source community over the next few years?

Answer: We will be adding a great deal of educational content about data warehousing. The Infobright software itself will be enhanced and updated on a monthly basis. Since the ICE launch, we have already added a 32-bit version of the product and a new release has been posted to the community.

The goal is to create an environment where people can go to be introduced to data warehousing, as well as receive help both from Infobright and the community itself. We’ve begun offering free webinars as an introduction to data warehousing, and the response (and attendance, I might add) has been very good. Many of our community members are turning to us because their current platform is no longer able to meet their requirements, and they’re starting to build data warehouses as a result. Normally that is a very expensive proposition but open source, and ICE, has made it available to the masses. We want to be that go-to location when individuals and businesses need to know how to build these systems.

Infobright will also be adding content about implementing an open source data warehouse through support of our partners. We’re already moving down that path and recently announced an open source bundle with Jaspersoft. Shortly you will see more of this as we try to make the on-ramp to open source data warehousing as easy as possible.

Question: What challenges have you encountered starting an open source model within a company selling a paid product?

Answer: We had to engage the entire Infobright team in the vision of a different kind of company. Everyone’s job was going to change in some way and the venture would only be successful with 100 percent employee participation. To be honest, that education process was fairly complex. We implemented a comprehensive plan for the entire company over six months to ensure that everyone was on the same page, and that we had everything absolutely ready. We also researched other open source projects, both successful and unsuccessful ones, to understand the best practices of open source. Those learnings have impacted everyone in some manner and through them we have developed guidelines for all to follow.

Question: Infobright's open source database was released in September 2008. So far, what has been the response?

Answer: Response to Infobright’s open source announcement has been incredible! Immediately after launch it became apparent that the community wanted a 32-bit version of ICE, even though we had previously thought that the software needed more memory in the form of a 64-bit version. We responded immediately by offering a 32-bit version, and it quickly became the hottest download on the site. Infobright is also receiving reports from community members that they are using ICE in ways that we never would have thought were a great fit for us, and the sheer volume of people using the platform is giving us some fantastic new insights into where we need to focus our future efforts. I don’t think there is a day that goes by where there isn’t an “oh wow” comment coming from a community member that loves ICE. Now the community is asking us for Windows support, so you’ll be seeing that coming down the pipe shortly.

Question: How has the open source initiative affected Infobright as a company?

Answer: As I mentioned earlier, the way we do virtually everything has changed. Infobright.org community members commenting on code, or recommending new features, have a direct impact on our product roadmap. With many more people using Infobright’s software, we are discovering new ways people can benefit from its use not previously considered. For example, we are now looking into use cases that didn’t make sense to us six months ago - such exploration is a direct result of community members having an impact on the organization. From a business perspective, the last six weeks have been a hockey stick in terms of number of new customers and revenue. By the end of Q1, we expect to have surpassed most, if not all, of our competitors in terms of number of customers using our product. And more users result in a better product – we aspire to deliver the best product in the market!

Question: Has the direction of Infobright changed as a result of being an active part of the open source community?

Answer: Yes, absolutely. The expectations of the open source user are different from those of the commercial one. The open source user's demands from a technical perspective are very high – you can’t hide issues and weaknesses; your bug list is entirely public. They’re doing a phenomenal amount of testing and performance stress testing. All these insights benefit every part of Infobright as we add new features to ICE. Such feedback is even changing the types of webinars we present, which user groups we focus on, and what features we add to the product. We also closely monitor a variety of metrics in order to determine the best ways to support the Infobright.org community.

Question: There is probably the value proposition of lower cost, less implementation time, and less maintenance effort. However, can an organization really run an enterprise-wide data warehouse on open source?

Answer: Every aspect of a data warehouse infrastructure can utilize open source technologies. A company can utilize Linux as the operating system and integrate with open source ETL tools like Talend or Kettle (Pentaho). Many open source platforms will use their own home grown ETL processes built upon Python, Perl, or Ruby (or C/C++, for that matter). The data warehouse can be built upon ICE, and users can use BI tools like Jaspersoft or Pentaho. Support is readily available for purchase or for free through community forums. Every one of these platforms is extremely robust and stable. There is no reason that an organization could not run their data warehouse using open source technologies.

And businesses are recognizing this. Open source BI is a fact of life in enterprises across industries. A recent Gartner survey highlighted that by the end of this year 69 percent of enterprises surveyed plan to have implemented and be using open source databases. And in the area of BI tools, the number is 34 percent (a 100 percent growth over the previous year). These statistics were gathered before the recent economic downturn. Now, companies are even more pressed to find the fastest and most cost-effective way to deliver BI and data warehousing solutions to their businesses – and that way is open source.
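
As an aside on the "home grown ETL processes built upon Python" that Mark mentions, here is a minimal sketch of what such a step might look like. It is only an illustration, not anything Infobright ships: the column names (region, amount), the pipe delimiter, and the sample data are all assumptions made for the example, and it uses nothing beyond Python's standard library.

```python
import csv
import io

def extract(csv_text):
    """Parse raw CSV text into a list of dict rows keyed by the header line."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Skip rows with no amount, normalize region names, convert amounts to float."""
    cleaned = []
    for row in rows:
        amount = (row.get("amount") or "").strip()
        if not amount:
            continue  # incomplete record: leave it out of the load file
        cleaned.append({
            "region": row["region"].strip().upper(),
            "amount": round(float(amount), 2),
        })
    return cleaned

def load(rows):
    """Write rows as pipe-delimited text, the kind of flat file a bulk loader reads."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    for row in rows:
        writer.writerow([row["region"], f"{row['amount']:.2f}"])
    return out.getvalue()

# Hypothetical sample input: one messy row, one incomplete row, one clean row.
SAMPLE = "region,amount\n east ,10.5\nwest,\nNorth,3\n"
result = load(transform(extract(SAMPLE)))
print(result)
```

In a real pipeline the output would be written to disk and handed to whatever bulk-load mechanism the warehouse provides. The point is simply that each stage is a few lines of ordinary scripting.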

Question: What is the benefit to your partners and developers of being a part of Infobright's open source community?

Answer: In addition to the obvious and huge benefit to end users, the data warehouse is a key enabler of many enterprise applications. As such, our partners and developers will be able to achieve great benefits to their businesses and products through integration with Infobright. Now organizations can accelerate their business by reducing the cost and time required for the data warehouse component and focusing their value on the application. Additionally, our partners and developers not only support our community, but they cross-pollinate each other’s communities as well. With open source, there is much more willingness among partners to support each other.

Question: It’s been a pleasure talking with you, Mark. Do you have any additional links or information about Infobright's open source community you want to share?

Answer: We really encourage people to get involved in our forums (www.infobright.org/Forums). There is a vast amount of information about ICE available there and we really appreciate reports on how people are using ICE. We’d also encourage people to look at Pentaho, Jaspersoft and Talend.