Data Processing

Personally, I remember a time when the only tool for data processing was SQL. I used, and still use, SQL a lot to create columns in tables, e.g. based on more or less complex subqueries. Defining stored procedures and user-defined functions was, and still is, one of the ways to shape the data, meaning to give it form and content so that it becomes information. The same can be said for using MDX (Multidimensional Expressions) in conjunction with MS SQL Server Analysis Services Multidimensional. But it's not SQL or MDX alone any longer. Over the last couple of years, my toolset to wrangle, to munge, or simply put, to transform data has grown. Nowadays there is also DAX (Data Analysis Expressions), R, and Python to process data, and also some other tools that I will sooner or later write about.

 

In the context of this site, processing does not only mean transforming source data, for example translating a value of zero into a more readable text like "male", or using SQL window functions to rank values inside a partition of rows according to certain business logic. And it's not just using MDX and DAX queries to calculate new values to populate a bar chart. It's also about applying advanced algorithms to a dataset to predict future values or to cluster similar objects.
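To make the first two kinds of transformation a bit more tangible, here is a minimal sketch that decodes a numeric code into readable text and ranks values inside a partition with a SQL window function. It runs the SQL through Python's built-in sqlite3 module and assumes the bundled SQLite is version 3.25 or newer (the first version with window functions). The table `sales`, its columns, the sample rows, and the mapping of 1 to "female" are all made up for illustration; only the zero-to-"male" translation comes from the text.

```python
import sqlite3

# Hypothetical sample data: (name, gender_code, region, amount).
# gender_code is stored as a number, not as readable text.
rows = [
    ("Anna", 1, "North", 120.0),
    ("Ben", 0, "North", 200.0),
    ("Carl", 0, "South", 90.0),
    ("Dora", 1, "South", 150.0),
    ("Emil", 0, "South", 150.0),
]


def rank_sales(rows):
    """Decode gender_code and rank amounts within each region."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales "
        "(name TEXT, gender_code INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)

    query = """
        SELECT name,
               -- translate the code into text (1 -> 'female' is an assumption)
               CASE gender_code
                    WHEN 0 THEN 'male'
                    WHEN 1 THEN 'female'
                    ELSE 'unknown'
               END AS gender,
               region,
               -- rank rows inside each region, highest amount first
               RANK() OVER (PARTITION BY region ORDER BY amount DESC)
                   AS rank_in_region
        FROM sales
        ORDER BY region, rank_in_region, name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rank_sales(rows):
        print(row)
```

Note that RANK() gives tied amounts the same rank and then skips the next one, so with two rows sharing first place in a region the following row gets rank 3. If the business logic calls for no gaps, DENSE_RANK() would be the function to swap in.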

 

Processing, in the context of this site, means applying everything that is necessary to gain insight from the data ingested into the analytical pipeline, using the toolset mentioned above but not that toolset alone.