Machine Learning with M in PowerBI and Excel

Very often I have thought about trying M instead of R for machine learning problems in PowerBI. Not only because I’m such a big fan of M, but also because we don’t have the R-integration in Excel (yet?).

Leila Etaati’s brilliant series of how to use R in PowerBI for KNN-prediction (nearest neighbourhood) finally kicked this off. In order to trigger some thoughts I have structured the code in a way that resembles the R-structure. So the core M-code looks like this:

KNN in M

 

Where this sits in a function that you feed with the following parameters:

Function Parameters

 

In there, 2 functions are called, like in the R-code. While the functions already exists in R and you just have to load the necessary packages, in M we don’t have these functions (yet), so I had to build them:

Normalizing the table:

Evaluating the nearest neighbour-label:

I also added some comfort-features: The k-value will be calculated automatically and you can enter a %-value for the split between training & test-data.

Key-findings:

M has all it needs to calculate the results, but the performance can be a pain. To my understanding so far, this is mainly due to the fact that it will call the sources multiple times. Unlike in SQL-server for example, the execution plan is hidden and we also don’t have stored procedures which enable us to de-activate the re-evaluation of the execution plans with every data refresh.

While I see the point in not re-inventing the wheel, there is an aspect of how many languages we are expecting the PowerBI-users to learn. Just a thought.

File to download:
zipped pbix: KNNR2-1.zip

Enjoy & stay queryious 🙂

Use Slicers for Query Parameters in PowerBI

Reading Rob Collie’s latest cool blogpost on how to retrieve slicer selections in Power BI, I couldn’t stop thinking of how awesome it would be, if we could use this technique to pass slicer selections as query parameters to the M-queries in the query editor. Not only would we have a very convenient user interface, but – what’s actually more important at the moment – we could pass multiple values as parameters to our queries, as this is not possible at all currently:

But how to fetch them? Rob’s post simply uses cross-filtering to show the values in a separate visual. In Excel we have cubefunctions where we can pass the slicer(-selection) as a parameter. Igor Cotruta, who is describing beautiful PBI-hacks on his blog here, kindly helped me out on this: “Via DMVs. Check $system.discover_sessions for the field sessions_last_command”. This worked perfectly into the following function, in which you just have to pass the name of the measure as a parameter:

Code beautified using Lars Schreiber’s Notepad++ Script: http://ssbi-blog.de/technical-topics-english/power-query-editor-using-notepad/

 

Make sure that you have used that measure on one of your visuals, as otherwise the function cannot harvest it. Also you have to first save the file and then push the refresh-button in order to trigger the correct refresh. The above function sort of “reads the current PBI file from outside”, so it will only see the saved version.

When you do the first refresh, a dialogue will pop up, where you just have to accept the default values like this:

The example in the file below fetches temperature data where every selected year will create a unique URL and the results of all those calls is consolidated into one table. But of course, this technique can also be used to pass multiple parameter values to SQL-commands or others.

A final note: The query to extract the slicer parameters from the DAX-statement is not particularly robust and you might have to adjust it, if your slicer-selection-strings contain special characters.

Download for logged-in subscribers SlicerParameter2.zip

Enjoy & stay queryious 🙂

Guide for switching Signs in Power BI and Power Pivot (bypassing Unary Operators in DAX)

In finance & accounting, you very rarely report the figures with the signs of their source systems, but switch (certain) signs according to different needs. Instead of using unary operators for it, I’ll present an easy and dynamic way for it in Power BI and Power Pivot using DAX. It will cover the following 3 main scenarios:

  • 1_SwitchAll: All signs are switched (red)
  • 2_SwitchExpLiab: Expenses and liabilities are switched back to their original values (green)
  • 3_BWT_Indiv: Only the main figure for expenses (or liabilities) carries a minus, all following positions specifying the expenses are (principally) reported as positives (blue)

 

Switching signs in Power BI and Power Pivot without unary operators

I’m using the sample data from this article but changed the source-data to a double-bookkeeping structure. There signs are used and the transaction entries in your ledger table always add up to zero. This is a method that prevents errors when posting and can also be used to prevent errors in reporting. If you keep the signs in your reporting system, all you have to do is add up the relevant figures and the returned (absolute) figures will always be correct. If you have read my previous articles on Easy P&L, you have seen this method in action: No minus-operation there, just a simple stupid adding of all accounts who fall into several (sub-) total categories via the bridge-table.

The Account-table also contains of (sub-) totals and the column “AccountType” shows if the positions are regarded as Turnover (Revenue) or Expenses:

Table “Accounts”

1_SwitchAll

My values on “1_SwitchAll” corresponds to “FinalValue” in the article above. The revenues come from consultancy and coursed provided. But the revenue for courses don’t just consist of attendee rates, but the costs for catering and paid instructors shall be deducted (highlighted in yellow). So the “good” numbers that contribute to cash in your pocket shall be reported without a sign and the “bad” numbers that result in an outflow of cash shall be reported with a minus. Within the expenses category, the costs carry a minus and the travel refunds (highlighted in orange), which are cash positive, are reported as positives.

2_SwitchExpensesLiabilities

Another requirement that is often used for balance-sheet-reporting or reports that only report on cost-situations, require that the costs or liabilities are reported without signs. … Principally, because the reimbursements/cost deductions shall be reported with an opposite sign (to show the adverse effect to the cashflow). This is what “2_SwitchExpLiab” shows (not covered in the article).

3_BWT (“BossWantsThat”)

Last but not least comes a typical “BossWantsThat”-requirement: Basically some strange stuff that you just have to deliver. Here the main categories “Revenues” and “Expenses” shall be shown with the signs that reflect the cash-direction, but all specifications that follow below shall be reported without signs (again: Principally, because positions with opposite cash-effects than the main category shall carry inverted signs).

Reporting techniques covered with this approach

Read more

Blending data in PowerBI like in Tableau

Today I came across a question in the PowerBI-forum if blending data was possible in Power BI like in Tableau. Although I wouldn’t necessarily recommend it, it’s definitely is a nice challenge. So the following function will interlace the rows from 2 tables like the blending-function in Tableau does. Just that we cannot use any aggregators on the attributes and are not able to use measures, as this takes place in the query-editor.

In our example we have a table with actual figures and one with budget figures:

We want to add 2 columns from the budget table to the actual table: “Amt” and “Qty” (red). Where there’s no match of budget – figures with actuals, there need to be added rows which hold only values from the budget figures (yellow):

Blended data like in Tableau

So we could do a join in full-outer-mode, but then we would need to find a way to put the date- and AccountNo-values into the existing columns of the actual figures. Instead we will identify those rows who need to go below the actuals and then do a join in left-outer mode just to add the values of the 2 new columns.

You need to feed this function the following parameters:

  1. Name of the primary table (“Actuals”)
  2. Name of the secondary table (“Budget”)
  3. Key column names of the primary table (“Date”, “Account”)
  4. Key column names of the secondary table (“Date”, “AccountNo”)
  5. Column names for the value columns (“Amt”, “Qty”)

Table.BlendRows

 

File with sample: BlendDataTableau.zip

 

Dynamically flatten Parent-Child Hierarchies in DAX and PowerBI

If you use DAX to flatten Parent-Child hierarchies you will end up with a table that has a static number of columns (like described here). If you need a dynamic solution instead, which creates just as many level-columns as there are needed for the current data, you can use DAX’s helper-tool Power Query (or Get Data in Excel) or the query-editor in PowerBI, which uses the language M.

Another advantage of this solution is that you can script the table creation in one step (only flaw: You still need to manually adjust your hierarchy though): But it saves time in creating the table, especially if you have many levels.

2 simple steps

  1. copy the following function,
  2. add a new step to your current table where you call this function, filling in the following parameters:
    • table name (which is the name of the previous step in your M-query)
    • name of the column with the child-key
    • name of the column with the parent-key
    • name of the column who’s values shall be shown in the levels (can also be child-key)

 

Call fnFlattenPCHierarchyFunction

 

And this is the code, which you can also download below:

Read more

Automatical or Bulk- Rename Columns in Power BI and Power Query

Edit 7th Feb 2017: Friendly reader Roknic pointed out in the comments below that there’s actually an existing function for it in M: Table.TransformColumnNames 🙂

So the first of my example below would actually look like this:

Table.TransformColumnNames(Source, each Text.Replace(_, " ", "_"))

But still keeping my original post here, as the transformations in them might help for other use cases:

If you want to rename all of your table’s columns with a common rule, like “replace all spaces by underscore” or just “delete all spaces”, check out this easy method:

The above formula will replace all spaces (” “) by underscores (“_”).

How does it work:

The 2nd argument in the Table.RenameColumns-formula is a list of lists, just like in Table.TransformColumnType from this article. So we apply the same technique here: List.Transform transforms a single element from a list into a list-item, whose 2nd argument will be calculated with a Text.Replace-function.

Variations:

Rename Columns Variations

Only replace FirstN or LastN elements from the column names:

Read more

Conditions in FirstN, LastN and other xN-functions in M, PowerBI and Power Query

Today I discovered that we can use conditions in many of the N-selecting functions where one/I would normally expect just a number-expression for the N:

Table.RemoveFirstN( table as table, optional countOrCondition as any)

So apart from being able to select a certain number of rows to be removed, we can pass a condition (as function). This condition will iteratively be checked for every row in the table (from top or bottom) and as long as every (next) step returns true, the resulting range will be removed. So as soon as one row breaks the condition, the process will stop.

I find that totally awesome, as we can now remove all top-rows who have an empty field in Column3 like this for example:

Table.RemoveFirstN(<MyTable>, each each (_[Column3] = null or _[Column3] = “”))

Yes, this will remove the first sequence of consecutive nulls in the table. So all other rows with nulls in the table coming later after a non-null value has “broken in”, will remain.

This is the list of function, where you can use this M-agic:

Enjoy & stay queryious 🙂

Comfort Functions for Easy Profit & Loss statements in Power BI and Excel – Part2

Here comes some long awaited comfort functions for part 2 of my easy P&L series. In the first section I’ve presented the general principle on how to work with a structure using an accounts-group-table. Today I will present 2 alternatives to define the reports without specifying single accounts. So if a new accounts are added to the chart of accounts, you don’t have to adjust your report definitions: Just make sure that to fill in all the fields in your account-group-table and you’re ready to go 🙂

No need to specify single accounts

So you only need to adjust your report definitions if you add new group items. If that’s still too much, take the 2nd solution, which will even eliminate that requirement:

  1. Individual Account Layout: Just define each subtotal and determine for which subtotals single accounts shall be shown

No more specification of individual accounts

How to use it:

How to use Individual Report Layout

2. Ultrashort Account Layout: Further simplification of just defining the groups (hierarchy) that shall be shown (with option to filter on one of them)

No need to define individual group items

How to use it:

How to use Ultrashort Layout

So these 2 different layouts will both produce the same reports incl. all accounts – just like in the first example. So you can choose which layout-style suits you best – actually, you can use all 3 in parallel. You just have to make sure to grab your pivot-rows from the correct tables and in Excel to grab the matching measures, as they all have their own bridge-tables (which need to be used in the measures):

 

How it works

Read more

Easy Profit & Loss (and other account) statements in PowerBI and Excel – Part2

Welcome to part 2 of my series of easy Profit & Loss and other account statements in Power BI and Excel. In the first part I introduced the general principle of creating asymmetric shaped reports who use just one measure per column (you should have read this article in order to understand this post here).

How the technique works

This technique capitalises the aggregation power of the Vertipaq engine and creates a bridge-table between your DimAccount-table and the ReportsAccountsLayout-table. In there for every line of your report, all accounts that belong to the (sub-)totals are matched (“AccountsAllocation”). This table can get very long, but the engine can handle this easily:

Different use case: Account-groups-tables

In the first example we’ve worked with a chart of accounts, which had a parent-chield-hierarchy defining all the subtotals of the report. In this example we’re working with a different setup, using the good old DimAccountsGroups-table. Just one row per account and the columns are coming in pairs, containing the group-criteria and the sort-order for the report:

Individual Report-Layout

We also need a second table (ReportsAccountsLayout) that holds the definitions of the report-layouts like this:

Read more

Incremental Load in Power BI using DAX UNION

This article describes the latest workaround for incremental load in PBI (thx to Taylor Clark for stressing this out!). It’s not very dynamic, as it doesn’t automatically load the difference to the existing data. Instead you have one query that contains your old data (which will be kept) and another query that grabs all data that comes after the last item from your old data. But at least it’s a technique that works without a hack:

Incremental Load Process

 

1 Create “Old Data”: “DontRefresh”

So it’s up to you to split up your long table or web-load activities and load your “old” stuff into one query (“DontRefresh”), perform your transformations and then load into the data model once. Then go back to the query-editor and disable the option “Include in Report Refresh”.

2 Create “New Data”: “Refresh”

Then take the cut-off-filter-criterium and use it to define the load of the new data that will subsequently be refreshed (“Refresh”). Transform your transformations from the first query into a function to make sure both tables have the same structure and load it to the data model (leaving the default load options to refresh).

3 Create table using UNION in DAX

In the data model, you create a new table that appends both tables to each other (and hide both input-tables from client view):

Create table in DAX using UNION

Another reason why this is not ideal is that fact that you cannot perform data transformations in the query-editor that iterate over the whole table. So I really hope that incremental load will once be a native functionality in PBI. Please vote for it here, as Microsoft prioritizes many of its activities on customer feedback: I vote for incremental load

Just picked up a useful tip from Mimoune Djouallah, to use a syntax like this:

union (summarize(table_current, field1,field2),summarize(table_history,field1,field2))

Which highlights the difference between the Append-command from M and the Union from DAX: The Union function requires the columns to have the same order in your table.

4 Why not use Append in the query-editor instead?

Another drawback of the current implementation is a somewhat unintuitive behaviour of queries which have been set to “Don’t include in Report Refresh”: As a standalone-query, they will behave as expected and not refresh. But once you reference them by a separate query or within an append-operation, they will refresh their results. So beware of this potential trap!:

Unexpected Behaviour Warning

 

Link to file: IncrementalLoadDaxUnionV2.zip

csv-sample data: https://www.dropbox.com/s/fovsg2xzclfqrzh/ImportData.csv?dl=0

So don’t forget to vote and stay queryious 😉