Machine Learning with M in PowerBI and Excel

Very often I have thought about trying M instead of R for machine learning problems in PowerBI. Not only because I’m such a big fan of M, but also because we don’t have the R-integration in Excel (yet?).

Leila Etaati’s brilliant series of how to use R in PowerBI for KNN-prediction (nearest neighbourhood) finally kicked this off. In order to trigger some thoughts I have structured the code in a way that resembles the R-structure. So the core M-code looks like this:

KNN in M

 

Where this sits in a function that you feed with the following parameters:

Function Parameters

 

In there, 2 functions are called, like in the R-code. While the functions already exists in R and you just have to load the necessary packages, in M we don’t have these functions (yet), so I had to build them:

Normalizing the table:

Evaluating the nearest neighbour-label:

I also added some comfort-features: The k-value will be calculated automatically and you can enter a %-value for the split between training & test-data.

Key-findings:

M has all it needs to calculate the results, but the performance can be a pain. To my understanding so far, this is mainly due to the fact that it will call the sources multiple times. Unlike in SQL-server for example, the execution plan is hidden and we also don’t have stored procedures which enable us to de-activate the re-evaluation of the execution plans with every data refresh.

While I see the point in not re-inventing the wheel, there is an aspect of how many languages we are expecting the PowerBI-users to learn. Just a thought.

File to download:
zipped pbix: KNNR2-1.zip

Enjoy & stay queryious 🙂

How to create a Load History or Load Log in Power Query or Power BI

If you want to collect your regularly loaded data without overwriting it or create a load log that writes the load activites with a timestamp into a table, you need to create a query that adds new lines to it’s own latest version.

In both cases you add a column that returns the current day and time using DateTime.LocalNow()

Image1

Load Log the PowerBI-way

Then using Power BI it’s fairly easy: Use the R-extension to perform an append query to the external txt or csv-file like described in this post (incremental load):

require(gdata)
write.table(trim(dataset), file=”your filepath & filename.txt”, sep = “\t”, row.names = FALSE, append = TRUE; column.names=FALSE)
plot(dataset);

Load Log the Power Query-way

In Excel’s Power Query it’s a bit trickier, because you cannot export your queries and therefore need to create the consolidated table in your workbook itself. And it must be a self-referencing query or to be more precise: a self-sourcing query: A query that takes its latest output as its new input, then appends the new data and overwrites the old result with the new one.

How do we deal with this? When we write the query we don’t have the result yet to be referenced?

Read more

Use R to export data from Power BI (and Power Query)

Edit 03 Aug 2016: With the July 2016 release you can now run your R-scripts directly in the query-editor (so no row-limit any more!). No need to go via the visuals. But this will limit you to export datasets from the query-editor, so no DAX.

Edit 22 Jan 2016: Currently this method is limited to 150k rows! It will drop all others without warning!

Edit 25 Jan 2016: 150k rows is the limit on all R-scripts at the moment (link). A warning sign will be shown on the image. Will create a new blogpost if this changes – so stay tuned (by following this blog, Twitter or LinkedIn)

With the December release, Microsoft enabled Power BI’s content to be fetched from the R-feature. So instead of plotting this data to the screen, could we just export it to a file?

Yes we can:

write.table(trim(dataset), file=”your filepath & filename.txt”, sep = “\t”, row.names = FALSE);

write.table has a lot of options, which are explained here. I’ve only used a few here: Exactly defining the location where the output-file shall be stored (file=), setting tab as delimiter (sep=”\t”) and skipping row names, which would be an index in our example here.

In addition to that, I had to get rid of trailing whitespaces that somehow sneaked into the table (trim(dataset)). Therefore the gdata-package is needed. So the full R-code looks like this:

require(gdata)
write.table(trim(dataset), file=”your filepath & filename.txt”, sep = “\t”, row.names = FALSE)
plot(dataset);

Here you have to make sure to turn the slashes in the filepath: So if your path looks like this: C:\Users\Imke\Desktop\Rfile.txt you need to write it like this: C:/Users/Imke/Desktop/Rfile.txt

Let’s take this as a start that will probably fulfill many users need to quickly export data from Power BI to simple text files. Further use cases I see are:

Read more

Power Query & Power BI are ideal learning paths from Excel to R

Reading Jen Stirrup’s great article about the learning path for SQL Server 2016 and R I ended up learning my first R pieces from this wonderful post that she has referenced. There Tony Ojeda beautifully describes how basic Excel-concepts are translated into R.

Well – I cannot speak of so much personal experience here – but these R-codes came easy as nothing before to me. Apart from Tony’s first class didactics, could this be due to what I’ve learned in Power Query over the last year?

I’ve added the comparison of Excel and R by the M-code from Power Query – putting it in the middle:

RandPowerQueryV3

Do you find the similarities as striking as I do?

Read more