Data Modeling: Definition & Optimization Examples

Table of Contents

    Have you ever tried to make this Sales dashboard your CEO is raving about a bit faster to load? Did you ever wonder if you make the best usage of the multiple set-up options of your ClicData account?

    Maybe it’s time to start an optimization project, from the Data Model to Dashboard creation’s best practices.

    Who wouldn’t want to save some GB and provide the best user experience to his favorite colleague, when he’s consulting your caringly crafted dashboard?

    What is Data Modeling?

    Data modeling is like creating a blueprint for organizing and understanding data. It’s a process where you structure and represent data to make it easier to analyze, manage, and retrieve.

    Imagine you’re building a house: before construction begins, you need a detailed plan that shows where each room will be, how they connect, and what materials will be used. Similarly, data modeling lays out the framework for how data will be stored, related, and accessed within a database or system.

    There are different types of data models: conceptual, logical, and physical models, each serving a specific purpose.

    1. Conceptual models focus on high-level relationships and concepts,
    2. logical models dive deeper into the structure and organization of data without considering specific technical details,
    3. Physical models translate the logical design into a database schema, considering implementation details like data types and storage mechanisms.

    In essence, data modeling is the foundation of effective data management, ensuring that data is organized in a way that facilitates analysis and decision-making.

    A good data model will enable you to get the best performances for your dashboards, but also for the consumption of your storage space, refresh power and maintenance time.

    Usually, we mention data models in relation to databases. Let’s see how to apply some of the best practices to your ClicData account, step by step.

    Data Modeling: Optimization Best Practices

    On the data side

    What defines a good Data model when it comes to datasets? As often, it depends… on your own very specific situation.

    Think of your data sources, the overall project and sharing objectives. Let’s consider some best practices that may apply to your case.

    1. Pull only the data you need

    Wherever you can, limit the data pulled to the only columns and rows you really need for reporting and ETL purposes. There is no need to overload your account with unused data, as it will slow down data processing and all dependent calculations.

    Also, it will save some storage.

    2. Create custom datasets from databases

    Use the query editor to create the perfect dataset, tailored to your needs. Think of who will use these datasets and whether they will need to perform transformations on the data.

    If yes, go for chunked datasets, for example, 1 dataset per table from your database. This will allow ETL actions to be most efficient because performed on smaller datasets. Tables can then be joined via a Merge.

    Blog Custom Datasets

    If no transformation is needed, go for a dataset directly combining different tables from your database, creating one dataset per type of usage, e.g. Finance, Sales, CRM, Marketing, etc.

    Blog Sql Query

    This will result in the same table as with the above-mentioned Merge:

    Blog Data Table

    Always try to go for the most granular level of data that you will need for your visualizations. Aggregations can then be built upon datasets.

    For example, use daily metrics if you need Daily metrics, Day over Day evolution metrics or Weekly metrics. Use Monthly grain if daily monitoring is of no use.

    3. Use appropriate Data types

    With the June 2019 product release, you can finetune data types for each column of your data. This will be particularly useful when working with non-database sources, as these datasets might not be optimized for performances yet.

    Use numerical values whenever you can. Using Text will consume more storage space and will be slower to proceed when evaluated.

    Blog Appropriate Data Types

    In this case scenario, the column [Convertible] can either have ‘yes’ or ‘no’ as a value. Consider lowering the maximum size of this column to 3 characters rather than 250.

    4. Take advantage of Cache feature

    The cache will take up storage but will also increase performances theatrically. Datasets such as Views, Fusions, and Merges created via the ETL can be cached.

    Whenever a dataset times out, consider caching the datasets that feed into, especially the larger ones.

    Using the Dependency Viewer, check which data feeds directly into the final dataset and apply to cache accordingly.

    Blog Cache Feature Clicdata

    In this case, caching ‘Sales Continent & Brand’ and ‘Sales – Japan’ which feed directly into ‘Sales Continent & Brand View’ will ensure top performances for the latter.

    If ‘Sales Continent & Brand View’ is heavily transformed as well and depending widgets in the dashboard are slow to render, cache this View too. Learn more about why and when to cache your data.

    Blog Cache Data Clicdata

    Also, don’t forget about low-hanging fruit optimizations for your storage. You know, those that we forget most often?

    5. Clean the trash


    6. Check out stored versions

    This feature is very useful for non-persistent datasets, such as for example data available through Facebook’s API and the Facebook connector.

    Do you need to keep track of the 10 last versions of a dataset? Maybe not! If the data is still available in the source, don’t worry about keeping history in ClicData.

    Blog Data Storage Clicdata
    This account stores more History and Trash than Active data

    On the Refresh schedules

    Refreshing data sources is vital to maintain an efficient dashboard set, displaying accurate and up to date data to users.

    Here are some best practices to keep your Schedules workspace tidy and performant, while saving on your refresh quota. Try to implement them right away when setting up your automated refresh schedules, even if you feel that you can always come back to this later and optimize. Do it right from the start!

    1. Optimize for time frames and activity days

    Are your dashboards consulted 24 / 7? Probably not. Your data only needs to be refreshed when the final visualizations are consumed by users. Think of the cadence that is the most appropriate to your business and consumption context of dashboards.

    A typical set up would be a refresh scheduled only during working hours and days. In the case scenario of hourly refreshes run every hour, this simple optimization will save 77% of refresh quota if you switch to working hours only.

    Refresh your data up to every minute during this period if you need to monitor business in real time, for example via a screen displayed in the office. Counting sales in real time will become addictive and stimulating for your team!

    Refresh your data once a day if dashboards are consulted once a day.

    2. Enable your users to refresh data whenever it’s really needed

    To give users – even Viewers – the ability to refresh data in real time from the dashboard directly, set up an interaction button that will refresh a schedule when hit.

    This way you can even consider reducing the automated refreshes schedule and offer a better-personalized experience to end users.

    Blog Data Refresh Clicdata

    3. Consider heavy datasets

    Working with millions of rows and dozens of columns can become challenging because it can take up to minutes to proceed with the refresh. Long refresh times need to be considered in the Schedules set up as well.

    Always evaluate how long a dataset will take to refresh before setting up a 1-minute cadence schedule! Keep track of the load time using the task logs.

    Leave time for the refresh to be finished before running the next schedule.

    Blog Manage Data Refreshes

    Take advantage of the Busy Days / Time graphic in the scheduler to aim for lower activity times during the day to ensure quickest possible refresh time. Click on a day to drill down to minutes.

    At 2:02 AM UTC very few refresh jobs are launched.

    Blog Data Refresh Management

    4. Keep your workspace clean

    Group multiple data refresh tasks that feed into 1 dashboard or topic into one Schedule. This way, all the data will get refreshed at the same time, and consequent data combinations will be correct.

    In an e-commerce website case scenario, update Orders and Customers data at the same time, say every 10 minutes. Products can be updated separately, for example, once a day, unless the catalog changes at a quicker rate (on a marketplace website for example, where merchants feed the catalog continuously).

    On the dashboard side

    Finally, let’s not forget to look at the Dashboards after all the optimization work on Data and Schedules!

    We’ve collected some bests practice to help you save time while building and maintaining them, but also to build quick loading visualizations.

    1. Calculated metrics

    When creating calculated metrics always consider if you could create them on the Data side, using ClicData’s ETL, e.g. by adding a calculated column to a View. If yes, choose this option for performance’s sake.

    This will reduce calculation time when loading the dashboard, even more, if you cache the View.

    2. Use SQL in calculated columns on a dashboard

    Did you know that you can use SQL in your dashboard? Now you do!

    In case scenarios of calculations that do not need conditional filtering, simply write your calculations as you would do in a calculated column on the Data side. Add single quotes around the SQL formula.

    For example, replace:




    Bonus when using SQL, the formula is evaluated in the context of the widget, with its categories, series, and filters’ setup. Using the DataAggregate function you would need to include contextual filters to the formula.

    Learn more about SQL, DataAggregate and other contextual formulas.

    4. Prefer filtered widgets to All data displayed

    Set up widgets to be filtered by default to the smallest Dataset necessary.

    Loading all data in each widget will take longer to display than loading already filtered data. Only when selecting different values in the filter, will the corresponding data be loaded and processed.

    Blog Filtered Widgets Clicdata

    An elegant way to pre-filter data while providing a good user experience is to default filter to a User parameter, for example via a dropdown list that other widgets depend on.

    4. Smaller dashboards with interactions

    Avoid supersonic dashboards displaying all your KPIs at once.

    Consider building a set of dashboards linked to each other using buttons, providing a website-like experience. You can create a drill down report series using this method.

    Experience this with the live Sales dashboard template navigation menu built out of designed Button widgets.

    Blog Drill Down Dashboards Clicdata

    5. Dashboard formula

    Dashboard level formulas are useful when handling values or metrics used repeatedly across widgets, for example for filtering purposes.

    Let’s consider the case scenario of multiple widgets that are all filtered on a dynamic date, the first day of the current Month. Rather than adding this filter formula to each widget, create a dashboard formula and refer to it in widgets.

    The formula will be evaluated only once.

    Blog Dashboard Level Formulas

    Optimize Your Data Modeling With ClicData For Better Dashboard Performance

    There are a LOT of ways to optimize performances in ClicData. Creating a robust data model that will help you balance between best use of storage, efficient refresh schedules management and dashboard display time will make all the difference to your daily work, your end users and probably your subscription price.

    Start wherever it makes the most sense to you! Do the easiest optimization work first and gradually move forwards step by step to the toughest part.

    If you feel you need advice to make the best choices in your business scenario, don’t hesitate to reach out to our Support team or via the tickets system and Support chat. We will be happy to help you optimize your data model!