The following steps show how to create a new column in a table using existing custom function code. This works in Power BI as well as in Power Query in Excel:
Today I’ll present an adjustment to the Text.SplitAny – function in Power BI’s query editor or Power Query. The native function takes a string as an input and splits the text by every character that is contained in the string. This seems fairly unusual to me and I haven’t used that function very often.
But what I have come across fairly often is the requirement to split a string by a bunch of different (whole) strings (instead of single characters).
Recently I came across an interesting request where someone wanted to un-cumulate their quarterly YTD-figures (green) into their single quarters values (red) like so (“Unravel cumulative totals”):
To retrieve this value, one would have to start with the first value in the year. This is also the value of the first quarter, but for the 2nd quarter, one would have to deduct the value of the first quarter from the cumulative value of the 2nd quarter. So basically retrieving the previous cumulative row and deduct it from the current cumulative row. Do this for every row, unless it’s the start of the year or belongs to a different account code in this example:
I must admit that I had more than one unsuccessful attempt to try to fully understand how the List.Alternate-function works. What helped me at the end, was the function Table.AlternateRows. It pretends to be similar to List.Alternate, but holds some surprises that I will uncover in this blogpost:
How Table.Alternate works
Say I have the table below and want to retrieve just the letters that appear in every 2nd row:
I find the dialogue that appears very helpful and intuitive:
It clearly is a removal operation and here I want to remove the 1st row from my table (“1”), and just one at a time. Also want to keep just one row (“A”) before the next one is removed (“2”) and so on.
In the formula bar, this step will be translated into this M-code:
If you would have expected it to be translated to: Table.AlternateRows(Source,1,1,1) instead, you might have forgotten that the M-language in Power Query starts to count at 0, so the first row to remove is expressed by the 0 here.
List.Alternate should work similar
So if my input is a list instead of a table like below, I should expect a similar result than the sample above if I tweak the code a bit, shouldn’t I?
But hey: What’s wrong here? Not a single element has been removed from the list !!
So let’s have a look into the documentation:
and compare it with the Table.AlternateRows documentation:
Hm – at least we have one match here: The “offset” parameter is included in both functions. But it is the first (number) parameter in the Table-function and is at the last position in the List-function. So let’s move it around then like this:
There we are 🙂
So the order of the function parameters is different here. Also the other parameter names are different and their description. I find them much easier to understand in the Table function and of course, the function dialogue there helps to understand what shall happen as well.
Enjoy & stay queryious 😉
When you merge tables with distinct keys in Power Query you will get the same result than the VLOOKUP-function in Excel returns (if this is new to you, check out this article for example: https://www.myonlinetraininghub.com/excel-power-query-vlookup) .
But how to retrieve only the result of the first row, if the lookup-table has multiple rows with the same key?
Say you have a dimension table for products:
and a transaction table with multiple entries per product:
The task is to create 2 additional columns in your dimension table. One to show the first price at which the product has been sold and the other one the corresponding first date:
If you merge the transactions to the dimension table and expand it, you will end up with as much rows in the dimension table as there are in transaction table.
So how to retrieve only the elements of the first row of the matching tables? I’ll show you 2 different methods:
Solution 1 – Tweak the aggregation code
This is very quick to implement if you just want to return one or a few columns from the lookup-table: In the dialogue where you usually expand the columns, check “Aggregate” instead and click on one of the suggested aggregations for each column that I’m interested in (I simply ignore for a moment that these are not the aggregations that I actually need):
Now I tweak the code in the formula bar like so:
Replacing the default aggregations by what I need (in red: List.First) and adjusting the column names directly in that command (in green: just to save one manual step later).
To avoid long query durations on large tables, you can transform the key column of the dimension table to a real key column, like Chris Webb has described here: https://blog.crossjoin.co.uk/2018/03/16/improving-the-performance-of-aggregation-after-a-merge-in-power-bi-and-excel-power-query-gettransform/
Solution 2 – Add a column that selects the whole desired row
If you want to retrieve many more columns from your lookup table, the method above can become a bit tedious. Then it might be easier to add a column, that grabs the whole first row instead: Table.First would do that job:
Then simply expand out all fields that you need.
You can use many different selection operations with this technique: So List.Last or Table.Last would give you the latest prices for example. This would actually be a more realistic use case here … and is the reason why I didn’t solve the original problem with just removing duplicates 😉 .
Enjoy and stay queryious 😉
When you import Excel sheets who have empty leading or trailing columns and rows (showing null-values), you can substantially improve the complexity and speed of your import process with a simple trick:
Remove the reasons for the empty trailing rows and columns 😉
Usually, when you import data from an Excel sheet, Power Query will automatically detect the used range in a sheet and will just return those rows and columns who have content in it. So how can it come that in some cases, additional rows or columns are returned who have nothing but empty values in them?
The reason for it can be cell formatting of empty cells. They often occur in old workbooks where cells have been deleted. These cells will be returned with a null-value during the import process with Power Query. See this blogpost for more details of potential pitfalls that come with it.
The “Inquire” Excel Add-On lets you clean any excess cell formatting. After you’ve executed this command, Power Query will not import any of those leading or trailing empty rows or columns any more. Often this will reduce the file size of the Excel files dramatically as well.
You will benefit from:
- simpler query logic
- potentially huge improved import speed, due to the reduced file size
Enjoy and stay queryious 😉