Why Building a DIY Data Warehouse Is a Bad Idea

Table of Contents

    Be honest with yourself, would you want to build your own CRM or accounting application?

    I’m guessing you’ll say “No”.

    And why not? Because it would be too complex, too time-consuming, too costly, and require management of developers and development of infrastructure?

    Yes, yes, and yes.

    So tell me this: Why are you building or maintaining your own data warehouse?

    The Doomed Birth of a Data Warehouse

    A few years ago, when several in the company I was working for wanted to upgrade our Microsoft Office software to its latest version, we were told we couldn’t.

    Why not? Because we were advised, it would compromise the company’s “business intelligence” system. It seemed to us like a needless conflict of interest.

    When I mentioned the incident to a colleague, he told me his story of a doomed effort to birth a data warehouse. He told me that at a previous place of employment, an intern with some bright technical skills had been asked to put together some metrics around a sales effort.

    Having an affinity for programming, but given only modest means to proceed, the intern used MS Access to build it. He installed a prototype on virtually every computer in the company. Soon, a jumble of cross-application Access and Excel charts and tables were flying around.

    Eventually, a few managers—many of them lacking an actual understanding of the metrics in the reports—liked the prototype so much that they elevated it to become the reporting standard for the company.

    Not much later, the intern left, having gone back to school or gotten a job elsewhere, leaving the IT Manager to fend for herself to find someone that could turn the application into a reliable reporting system for the company.

    But who could help? A database analyst? A business analyst? A BI specialist? A reporting developer? The IT Manager was plagued with questions about cubes, OLAPs, drill-downs, rollups, scorecards, dashboards, data marts, data lakes, big data, and self-service. The list was endless, and she was overwhelmed.

    A budget was eventually approved for the project, but it was barely enough to give the IT Manager the means to build a team, much less construct a company-wide BI platform. The logic behind the decision? Well, after all, “an intern did this before. So how hard can it be?”

    Three years later, the company finalized an Oracle database with beautiful star schemas and a few other sophisticated features. Everybody was ready and waiting to plunge into the next phase of the project, as they were still operating with a minimal budget. After all, the business had yet to see any results from the efforts so far.

    But before they could add some of the more practical features like data visualization, they were told that, due to the new Office 365 apps being installed, they would have to move everything they were doing to the Cloud. But they couldn’t!

    Turns out that the young intern’s “data warehouse” was still lingering around the system, and it wasn’t compatible with the new version of Office. Most of it hadn’t even been refreshed in ages (it had been three years, after all); many of the metrics were no longer accurate; and product codes in the system were outdated and no longer being used.

    But some people somewhere were using the antiquated code; they had built additional reporting and other macros that were too embedded in everything to be able to pull it out cleanly and simply.

    3 Reasons Why People Mistakenly Choose DIY Data Warehouse

    It’s a bit disconcerting that, while most companies easily move their transactional systems such as CRM, ERP, and Finance to the Cloud, when it comes to reporting, they still do their own “cooking.”

    I talked with numerous colleagues and clients who attempted to build their own data warehouse to better understand their motivations. I discovered there are three basic reasons people attempting to set up a data warehouse have for not considering using a cloud platform. They were:

    • We don’t know what we don’t know
    • We like to tinker
    • We like to be in control

    Here’s my response to each of those answers:

    1. We don’t know what we don’t know

    Unfortunately, many companies decide to implement a transaction system and then think about what metrics they want to measure. Truth is, you need a “playground” to investigate what data you have and what you can do with that data.

    The BI implementation person needs to get their “hands” in the data by putting the data right beside them or even in their own laptop. In other words, don’t be satisfied with not knowing. Find out.

    2. We like to tinker

    We data people love to tinker, it’s true.

    We love databases because they do our will. We group and partition and select. We aggregate or divide. We filter and roll-up. And the database does it.

    But more than that, we love figuring out how to do things ourselves. We tell ourselves that by doing so, we build more robust processes and better-stored procedures, and anyway, we will migrate to a more powerful database running on super machines, so the performance will come later.

    At what cost do we tinker?

    3. We like to be in control

    Now, much like in the early 2010s, there are still people preaching against the Cloud.

    They mistakenly think that moving their CRM and financial systems to the Cloud is equivalent to giving the core of their business away. But even if your business is the data, you won’t be giving your business away in any way, shape, or form.

    The myth of lack of control abounds: control of how the data would be kept safe and secure; control over who can access it; and concern about knowing it can be accessed at any time.

    The non-believers have diminished in numbers, but data warehouses are still often seen as the “heart” of the data and many companies still defend against it being moved for fear of lack of control. They don’t realize that cloud-based data warehouses offer stunning levels of control.

    What’s Wrong with Doing it My Way?

    Simply put, do it yourself, and your data warehouse will become a problem as soon as it is built.

    Whoever set up your data warehouse set up some “interesting” flow of data from a variety of systems and databases using SQL code, custom code, and connectors to connect to APIs, and a few macros to connect to local files.

    Plus, they likely set up a local database to hold some supporting data. The data warehouse becomes a supported system— and it must be backed up and maintained like all other IT systems that are critical to the company. Once launched, a few things will inevitably happen.

    People will want data visualization

    A data warehouse really provides no value for a business unless the data can be accessed and the decision-makers can get the dashboards, reports, KPIs, alerts, and insights they need to do their jobs.

    data visualization and dashboards concept

    But those who built the data warehouse are not UI experts or developers. They are data folk, not programmers. That means that IT will need to look for charting software to plug into their data warehouse.

    Yet they won’t want to pay licenses per user, build additional systems, or buy new servers. But they will have to—there is no other way of going about it.

    The data is where it’s at. The data viz and user access will work better if they are close to each other.

    Systems will change and software will need upgrades

    Cloud apps might change their APIs, system upgrades might be coming soon, and you just might discover that the person who built your data warehouse has been promoted or left the company for a better job.

    The fixes might be harder than expected to accomplish. Or maybe the database storing your data needs a new version to make it security-compliant. Or maybe the volume of your data has become so large that backups are taking too long or even failing.

    That’s when you start looking for ETL tools, a DBA, and a few other costly resources and people to just keep the engine alive.

    Users will want more

    It could become a truism; if it was built well, users will want more.

    Let’s face it: nothing is static in business for very long. Your unwieldy data warehouse will need changes, and IT—if you even have IT—will be responsible for making those changes. Otherwise, people will lose interest, and your project will join Windows 95 in the annals of history.

    Start implementing now a Cloud-based Data Warehouse

    The brutal truth is, there’s really no good reason not to move your data warehouse to the Cloud anymore. If you choose a DIY warehouse, you’ll be facing an upgrade soon that will undo everything you’ve built, and you’ll waste a lot of time and money.

    If you choose the Cloud, worry not. You can still tinker, you can still be very much in control, and you will still be able to explore and leverage all of your data in the Cloud. The only requirement is that you have internet access. How easy is that?

    When you’re ready to step up to a cloud-based data warehouse, be sure that your BI tool includes intuitive data visualization, ETL/data management, quality tools, user publications, alerts, and security all on the same platform.

    Try ClicData’s data warehouse for free and have all the control you want.

    You’ll also enjoy: