· 

The previous value

This article is the 1st article in a series of articles that will describe how more advanced algorithms and time-series analysis methods help solving complex problems inside Power BI. Using these algorithms with DAX can lead to some lengthy DAX statements. Still, from my personal experience, smaller snippets will help in a lot of areas.

This article will provide a different approach to tackle a common requirement, comparing the current value with its predecessor. Next to this, this article makes use of an

  • ordered structure - the path and
  • spans tables by using GENERATESERIES and finally
  • filters tables by using NATURALINNERJOIN instead of FILTER

All the above approaches help make DAX code faster, but do not promise to be the golden version, as always – it depends!

Two pbix files are coming with this article, one with an explanatory character and one is testing measures on a larger dataset

A common theme in almost every Power BI application is the analysis of change, like comparing October sales with September sales. When using a dedicated calendar table, a solution becomes simple most of the time, as the calendar table allows easy time traveling. Time-traveling becomes even more straightforward when using DAX functions like PREVIOUSMONTH and other DAX time intelligence functions.

 

Time-traveling inside DAX statements means changing the viewpoint and the current filter context regarding the calendar table. Sometimes the viewpoint allows us to see just a single day, the most detailed level in our calendar table, and then the viewpoint allows seeing periods of days, like all the days of the current month or even all the days that span the current year.

The two types of time-traveling

Time-traveling in DAX is about traveling between viewpoints and changing the point of view. There are two types of time traveling. The first type is using the calendar table alone for the navigation between various viewpoints. The DAX statement below is traveling from the current day to the day that is seven days back.

 

7-day change =

var __currentDate = CALCULATE( MAX(‘calendar’[date] ) )

return

CALCULATE( <numeric expression> , ‘calendar’[date] = __currentdate – 7 )

 

In the above example, 7 is a fixed offset. This offset allows us to determine the end of the journey (the time travel) based on the current viewpoint. I consider this journey as calendar-based, as it provides traveling along with the calendar table without considering event dates.

The second type is event-based. This type of journey is more adventurous and, for this reason, needs more preparation. I consider this type of time-traveling to be more adventurous because more advanced DAX techniques are necessary to determine all the viewpoints visited during the journey.

Consider the simple question, "How did the current value change in comparison to the previous value?" Calculating the change will be simple, but finding the viewpoint that is

  • immediately preceding the current one, or
  • the most recent event date of all the events happened before the current one (occurred in the past)

is simply difficult.

The following picture will make this clearer.

 

 

Time-traveling is a journey, and it's important to notice that the distance between the viewpoints that make the stations is not fixed.

Event-based time-traveling - the naive approach

To overcome the uncertainty in regards of the distance between the current viewpoint and the previous viewpoint a naïve approach can be used. This approach looks like this:

 

prev A 3 - naive =

IF( NOT( ISBLANK( [quantity (ms)] ) )

    , CALCULATE(

        LASTNONBLANKVALUE( 'DimDate'[DateKey] , [quantity (ms)])

        , FILTER( ALL( 'DimDate'[DateKey] ) , 'DimDate'[DateKey] < MAX( 'DimDate'[DateKey] ) )

    )

    , BLANK()

)

 

The picture below shows the result of this measure.

 

 

To understand the above measure, it's necessary to focus on the function LASTNONBLANKVALUE(… , …). The outer IF just prevents the calculation for dates without events.

LASTNONBLANKVALUE returns the value for the specified numeric expression (2nd parameter of the function) from the column specified as 1st parameter. LASTNONBLANKVALUE will order the column values in the current filter context. For this reason, CALCULATE is wrapped around LASTNONBLANKVALUE, removing the current viewpoint from the current filter context.

The measure "prev A 3 - (LASTNONBLANK / FILTER)" takes care of the technically correct, but the total line's maybe unexpected result.

 

I call this approach naïve because of its lack of planning the journey more appropriately. This lack of planning is costly. The picture below shows the cost of a lack of planning. 

 

 

The above metrics result from dragging the measure prev 1 LASTNONBLANK to the matrix visual inside the TSA - ContosoRetailDW - previous value.pbix

The above approach or similar ones using

  • TOPN(1 , , , DESC) or
  • var __prevDate = MAXX( FILTER( … , ‘fact’[event date] < __currenteventdate ) )

are obvious but costly.

 

The mentioned approaches are apparent (more or less) as there is no concept of ordered sets available in DAX. Talking SQL, DAX tables are unsorted heaps. At least, this is how it looks like to users.

Talking SQL, a self-join can calculate the previous values much more efficiently. Imagine there is a table that looks similar to the one below

 

rownumber | rownumber – 1 | dates | values

 

A table like the one from above, a join-condition like the following one

rownumber = rownumber – 1

allow a simple calculation of the previous value

The next picture shows this conceptionally:

 

 

The idea is simple, replace the naive approach using LASTNONBLANK with a self-join using the function NATURALINNERJOIN.

The next steps explain how to build a virtual base table derived from the filtered fact table "FactOnlineSales."

Performing time traveling, I consider integer math more simple than date operations. For this reason, I always add one or more integer columns to my calendar table. The INT function computes the integer from a given date, as shown in the next code sample that creates a simplified calendar table:

 

Calendar =

ADDCOLUMNS(

    CALENDAR( DATE( 2010 , 1 , 1) , DATE( 2020 , 12 , 31) )

    , "day as integer" , INT( ''[Date] )

)

 

The column Rownumber is essential for creating a join instead of using LASTNONBLANKVALUE. The column filters the virtual table down to the calendar dates present in the current filter context.

 

The Base table plays a significant role in improving the performance in comparison to the naïve approach. One crucial aspect, not just for the computation of the previous value but also for all time series analysis and beyond, is a sequence. My preferred solution to create and persist an ordered structure, a sequence, is using a path. If you are not familiar with this structure, you might want to get familiar with all the PATH… functions. A path is a string where the |-character (the pipe character) separates the path items.

The next picture shows all the available event dates in the context of all used calendar dates (measure: prev B 1.1 …, report: TSA - Simple time-series - previous value.pbix, sheet: Previous Value (Prev B) - NATURALINNERJOIN).

 

 

The table iterator CONCATENATEX creates a path using the |-character as the delimiter. It's also important to note that OrderBy_Expression[n] and order [n] are used.

The path is essential for the composition of the base table.

The next code snippet shows how the Base table (a virtual table) is created.

 

var abt =

        ADDCOLUMNS(

            ADDCOLUMNS(

                GENERATESERIES( 1 , allDatesCalendar_ValuesPathLength , 1 )

                , "DateIndex" , PATHITEM( allDatesCalendar_ValuesPath , ''[Value] , INTEGER )

                , "DateIndexPrev" , PATHITEM( allDatesCalendar_ValuesPath , ''[Value] - 1 , INTEGER )

            )

            ,"previousvalue"

                , var prevDateKey = [dateIndexPrev]

                return

                CALCULATE( [Total SalesAmount] , 'DimDate'[DateKey] = prevDateKey )

        )

 

What happens inside the code snippet from above is this:

  • GENERATESERIES spans a table where the number of rows equals the length of the ordered path, the values of the auto-generated column Value are automatically representing the row number (https://dax.guide/generateseries/)
  • ADDCOLUMNS (the inner one) adds two additional columns to the table created by GENERATESERIES
    • An index column "DateIndex" that represents the current day
    • An index column "DateIndexPrev" that allows referencing the previous date,  using index columns from inside the ordered path, identifying/referencing the previous date becomes simple
  • ADDCOLUMNS (the outer one) adds the previous value using the column value of "DateIndexPrev" as a filter on the calendar date.

The almost-final step uses NATURALINNERJOIN to perform the self-join or be more precise, filtering the virtual table in the current filter context.

The next snippet (taken from the measure prev B 1  - NATURALINNERJOIN ) is showing this:

 

var calendarDates_abt =

        //here the "left" table (the current datekey(s)) filters the virtual table (abt)

        NATURALINNERJOIN(

            SELECTCOLUMNS(

                VALUES( 'DimDate'[DateIndex] )

                , "DateIndex" , 'DimDate'[DateIndex] + 0 //breaking the existing lineage

            )

            , abt

        )

 

The function NATURALINNERJOIN has some intricate inner workings, hence the renaming initiated by SELECTCOLUMNS and breaking the data lineage to the physical column DateIndex, adding zero.

 

Finally, the table iterator function SUMX is computing the sum of the previous values.

 

The pbix "TSA - ContosoRetailDW - previous value.pbix" contains 2 measures that can be used with the larger ContosoRetailDW dataset:

  • prev 2.1 SUMX - (PATH/NATURALINNERJOIN)
  • prev 2.2 GROUPBY - (PATH/NATURALINNERJOIN)

Enjoy DAX!

Kommentar schreiben

Kommentare: 1
  • #1

    DAX JS (Dienstag, 03 November 2020 17:56)

    Wow nice work!