Very often I have thought about trying M instead of R for machine learning problems in PowerBI. Not only because I’m such a big fan of M, but also because we don’t have the R-integration in Excel (yet?).
Leila Etaati’s brilliant series of how to use R in PowerBI for KNN-prediction (nearest neighbourhood) finally kicked this off. In order to trigger some thoughts I have structured the code in a way that resembles the R-structure. So the core M-code looks like this:
Where this sits in a function that you feed with the following parameters:
In there, 2 functions are called, like in the R-code. While the functions already exists in R and you just have to load the necessary packages, in M we don’t have these functions (yet), so I had to build them:
Normalizing the table:
Evaluating the nearest neighbour-label:
I also added some comfort-features: The k-value will be calculated automatically and you can enter a %-value for the split between training & test-data.
M has all it needs to calculate the results, but the performance can be a pain. To my understanding so far, this is mainly due to the fact that it will call the sources multiple times. Unlike in SQL-server for example, the execution plan is hidden and we also don’t have stored procedures which enable us to de-activate the re-evaluation of the execution plans with every data refresh.
While I see the point in not re-inventing the wheel, there is an aspect of how many languages we are expecting the PowerBI-users to learn. Just a thought.
File to download:
zipped pbix: KNNR2-1.zip
Enjoy & stay queryious 🙂