· 

Using Power Query to convert an ip4 address into a numeric value

Recently I was challenged by the following question: How can I determine the region of a certain IP address.

Down below you will see a little screenshot of the underlying tables. The next table shows some ranges, a region is determined by an starting ip address and an ending ip address.

As you may have noticed there is an overlap of ip addresses in the region "This" and "Another This".

 

The next table shows the IPs that need a region to unleash more analytical power.

Please be aware that I added the "comment" column, just to get a hint what I would expect after adding some "data quality" measures.

 

So basically my idea behind this, is to convert the well formed IP4 addresses into a numeric value and then use some DAX  statements to lookup the region. As you may already have noticed, there are two ip ranges with overlapping addresses, "This" and "Another This". This will be treated using some data quality measures.

 

Converting an IP4 address to a numeric value

An IP4 address is build by 4 octets (each representing a numeric value between 0 and 255) so to create a unique numeric value the following formula can be used:

o1 * 2563 + o2 * 2562 + o3 * 2561 + o4 * 2560

 

For a simple example let's assume there are these 2 ips:

1.2.3.4

127.0.0.1

 

The above formula will create the following numbers

To create a numeric value from an ip4 address the idea behind an algorithm will be something like this:

Create an additive sum by iterating over each octet from the ip, during each iteration the numeric value representing the octet is multiplied by 256n. n is derived by the number of iterations.

 

Basically this can be achieved by "simply" using a little custom M function in Power Query.

 

If you are not familiar with custom M functions, this will gets you started:

http://radacad.com/writing-custom-functions-in-power-query-m

 

Basically there are these steps within the function body

create a list of octets from the ip

iterate over the list of octets and doing the maths from above

create a cumulative sum from the operation performed inside each iteration

 

The first step can be easily achieved by using the M function Text.Split, the following line of M code creates a list called ListOfOctets

ListOfOctets = Text.Split("127.0.0.1", ".")

 

Unfortunately there is no simple FOR … NEXT or DO … WHILE available in M. But there are some functions that can be used to iterate over "things".

 

The function uses two lists, the lists of octets and a simple list just containing the numbers from 0 to 3.

 

One of the very powerful but somehow mysterious functions inside M is the function called List.Accumulate

If you are not familiar with functional programming the short description provided by the official documetation can be somewhat frightening.

 

What the description says is, the following, this functions iterates over a list (the 1st parameter) and passes an initial value (the 2nd parameter) to an accumulator function, the 3rd parameter.

The 3rd parameter of the sample function looks somewhat awkward

(state , current) => state + current

Even without defining a variable that is called state or current the example works flawless if used inside Power Query as a function for a custom column.

It's not necessary that both variables have to be named state and current, you can name to your liking.

For the first iteration the following is happening:

The initial value (the 2nd parameter of List.Accumulate) and the first element of the list  are passed as parameters "state" and "current" to the function that simply adds 1 (current) to 0 (state).

The result is stored to the parameter  state and will be increased by the value of current in the 2nd iteration and so and so on …

 

Let's have a closer look at this M snippet

IPNumber = List.Accumulate(

        ListOfIndexItems, 0, (state, current) =>

state + Number.FromText(ListOfOctets{current}) * Number.Power(256, 3 - current)

    )

 

The function List.Accumulates iterates over the list "ListOfIndexItems". This list just contains 4 values, namely the numbers 0, 1, 3, 4.

 

Assuming the ip number that has to be converted into a numeric value is 127.0.0.1

Here is what happens during the

1st iteration

The following is passed to accumulator function

state: 0 (the initial value, the 2nd parameter of List.Accumulate) and

current: 0 (the 1st value from list ListOfIndexItems)

are passed to the accumulator function.

 

Inside the function the following happens:

0 (the current value of state) +

127 (retrieved from the zero-based list ListOfOctets using current) *

256(3-0)
= 2130706432

 

Here is what happens during the

2nd iteration

The following is passed to accumulator function

state: 2130706432 (the result of the accumulator function of the previous result) and

current: 0 (the 2nd value from list ListOfIndexItems)

are passed to the accumulator function.

 

Inside the function the following happens:

2130706432 (the current value of state) +

0 (retrieved from the zero-based list ListOfOctets using current) *

256(3-1)
= 2130706432

 

 

Here is what happens during the

4th iteration

The following is passed to accumulator function

state: 2130706432 (the result of the accumulator function of the previous result) and

current: 3 (the 4th value from list ListOfIndexItems)

are passed to the accumulator function.

 

Inside the function the following happens:

2130706432 (the current value of state)

+ 1 (retrieved from the zero-based list ListOfOctets using current) *+

256(3-3)
= 2130706433

 

Here you will find a pbix file that contains some sample data, the function another function that converts a given number to a ip4 address and also some DAX statements to check the quality of the source data.

 

Hopefully you will find this useful

Thanks,

Tom

Kommentar schreiben

Kommentare: 0