Time-series analysis using DAX

For quite some time, I'm facing questions that have a common theme, time. Some of these questions ask for a DAX statement that computes a Year-To-Date value or how to create a calendar date table. And some of these questions are similar to the ones below

  • What is the previous value?
  • What has been the most extended sequence of succeeding days during a given period?
  • What is the number of gaps during a given period?
  • Is the length of gaps growing towards the end of a period?
  • How can I use DAX to implement Auto-Correlation to discover seasonality?

This will start a series of articles focussing on time-series analysis using DAX.

This series of articles demonstrates how DAX can implement algorithms used for time series analysis (like auto-correlation) and methods that are "just" time-related but are more complex using DAX. If you are not familiar with the phrase time-series analysis; here you will find an introduction to this topic

Please be aware that this series of articles is using DAX, not to say Power BI. For this reason, some DAX is complex in comparison to a single line of code of Python. Where Python or R can use libraries providing a function wrapped around complex math, DAX is mainly used to create measures extracting insights from large amounts of data stored in a tabular format. For this reason, sometimes it's challenging to transform time series algorithms into DAX. The first articles in this series are not implementing time series algorithms; instead are introducing techniques that will be used in later writings but provide solutions on its own.

 

From a personal experience, I can tell that the effort to implement at least some of the described measures is worth the effort as they add powerful methods to the Business Analyst's arsenal for retrieving insights from a dataset.

 

Before I delve into the DAX, I want to introduce some definitions. These definitions might be similar to definitions on Wikipedia or not. Still, these are the definitions that help me tackle the challenges that are coming with the questions from above.

some definitions

I will use a little chart to provide context, as I think this helps to understand the definitions better.

The above picture visualizes a time series that spans from the 1st of January 2020 to the 6th of February 2020 with a single measurement (quantity). The time series comprises six days. The granularity of the time series is the day. Using DAX, it's crucial to distinguish between the units of the calendar table and the units inside the fact table. For this reason, I call the units of the fact table event dates. When I'm referring to the dates of the calendar table, I'm talking about calendar dates.

 

If a time series contains more than one measurement, I call this time series a multi-variate time series.

 

I call the time series depicted in the above screenshot - a naïve time series. Naïve because of the gaps that exist in the time series. The picture below visualizes these gaps.

It's essential to be aware of the fact that more advanced algorithms expect a complete time series.

 

A time-series records events in chronological order; not each time series uses dates or timestamps for the ordering of events. It's only important that a given process orders the recorded events.

The Articles

This will start a series of articles, this first article it tackling a common problem using DAX. Even if this problem is not a problem solved by time-series analysis, the approach used will introduce some DAX techniques that will become handy implementing more advanced algoriths.