Mincing Data - Gain insight from data

This site will be a container for all my musings about the analytical pipeline.

For this reason, it is necessary to define the analytical pipeline (at least from my point of view). From a general perspective, five activities are necessary to build an analytical pipeline. These activities are:

  • source
  • ingest
  • process
  • store
  • deliver


The overall goal of an analytical pipeline is to answer an analytical question. To achieve this goal, different data sources have to be targeted, and their data has to be ingested into the pipeline and properly processed. During these first steps, the ingested data often has to be stored in one or more data stores, each used for its special type of usage. Finally, the result of the data processing has to be delivered to its users, its audience. Depending on the nature of the question, different types of processing methods and also different data stores may be used along the flow of the data throughout the pipeline.

 

I gave these activities a direction, but I did so also to spur your critical mind, because:

  • source data is, of course, also stored somewhere
  • targeting the source can become a challenge in the pipeline
  • delivering processed data to its audience can become another challenge if you try to deliver to a mobile workforce
Successfully ingested data often gets processed more than once to answer different questions, and during its existence (inside an analytical data platform) this data will be transformed into various shapes, wrangled by different algorithms, and loaded into different "data stores".

For this reason, I believe that these activities are tightly related, and the sequence mentioned above serves only as guidance.

 

I will use blog posts to describe how different activities are combined to answer analytical questions. In most of my upcoming blog posts, I will link to different topics from the activities used in the pipeline. Each activity has its own menu and by itself represents an essential part of the analytical pipeline.

 

Hopefully this site will help its readers as much as it helps me to focus on each activity, always knowing that, most of the time, more than one activity has to be mastered to find an answer to an analytical question.


Field parameters - the dawning of a new era

One of the most requested features (at least from my perception) has arrived with the May 2022 release of Power BI Desktop. This feature allows the report designer to switch the content of a visual with ease.

As one picture can communicate a meaning better than a thousand words, the following picture shows what is meant by changing the content of a visual:

The above gif shows how the content of the axis changes depending on the selection in the slicer.

 

This is exactly what the new feature allows us to create: a list of fields (basically, this list is a table created in the dataset). The list can contain columns, calculated columns, or measures. At the time of this writing, only explicit measures are supported.
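To make this more concrete, here is a sketch of the kind of DAX definition Power BI Desktop generates for such a field parameters table; the referenced columns 'Product'[Color] and 'Geography'[Continent] are just placeholders for illustration:

// a field parameters table, sketched with assumed column names
Parameter = {
    ("Color", NAMEOF('Product'[Color]), 0),
    ("Continent", NAMEOF('Geography'[Continent]), 1)
}

Each row combines a display name, a field reference created by NAMEOF, and a sort order.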

In the past, the community created different solutions to fulfill the requirement that report users be able to change the content of a data visualization easily. Changing the content of a data visualization means (the list is not complete ;-) ):

  • Changing the content of the x-axis of a column chart, e.g., from product color to continent, as demonstrated in the animated gif above
  • Changing the content of a legend, meaning splitting the value into an additional category as in a stacked or clustered column/bar chart
  • Changing the numeric value, the measure, the KPI (however you call this field type) used to compare items, show a trend across a timeline, visualize the contribution of a part to a whole, and all the other things we do inside Power BI

The solutions created in the past had downsides. All of them relied either on exhaustive use of the bookmark feature or on the creation of unrelated tables, or related tables (a little more tricky), plus some DAX, and sometimes a hell of a lot more DAX. Using bookmarks makes the report design more complex and, for this reason, harder to maintain. Using tables (related or not) always requires writing DAX. On top of that, adding tables to the analytical model impacts query performance.
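To illustrate the old pattern, a disconnected selector table combined with a switching measure was a typical workaround; the table, column, and measure names below are hypothetical:

// 'Measure Selector' is an assumed unrelated (disconnected) table
// with one column, [Selection], bound to a slicer
Selected Value =
SWITCH(
    SELECTEDVALUE('Measure Selector'[Selection]),
    "Sales Amount", [Sales Amount],
    "Order Quantity", [Order Quantity],
    [Sales Amount] // fallback when nothing (or more than one item) is selected
)

Every new switchable field meant another branch in the SWITCH, which is exactly the kind of extra DAX the new feature makes unnecessary.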

Sometimes I created solutions with complex report designs and a performance penalty 😎

 

These days are gone, as field parameters do not impact the data model's design and allow a much more straightforward report design. There are no longer any performance penalties enforced by implementing the requirement of "dynamic content of a data visualization." All the solutions I created in the past will benefit if this requirement is rebuilt using the new feature.

The subsequent chapters of this blog post will explain the following topics:

  • how to start
  • the generated DAX
  • a closer look at the "field parameters table" and some detailed observations
  • ideas that are beyond the obvious
  • how this feature is different from the "Personalize visuals" feature
  • some limitations (at least from my point of view)