Advanced type detection in Power BI and Power Query: Table.ClassifyMixedColumnTypes

This is not a proper blogpost, just a quick share of a function I’ve created today which I think will be very useful for others as well:

Automatic type detection will assign only one type to a column. But sometimes there are multiple types in it, and then the type “any” will be applied, which effectively means no type at all. In these cases, the type of the individual elements/rows in those columns is often the key to the necessary ETL transformations to be performed. So you would like to be able to write statements like this:

Table.AddColumn(PreviousStep, "Event", each if Type.Is(Value.Type([MyColumn]), type text) then ThisEvent else AnotherEvent)

(reading: add a column with a new Event that depends on whether the type of the current row in MyColumn is text: if so, do ThisEvent, else do AnotherEvent)
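As a minimal, self-contained sketch of this per-row type check (the sample table and the event values are made up for illustration):

```
let
    // Sample table with a mixed-type column (text and number values)
    Source = Table.FromRecords({[MyColumn = "abc"], [MyColumn = 42]}),
    // Tag each row depending on the type of its cell value
    Tagged = Table.AddColumn(Source, "Event",
        each if Type.Is(Value.Type([MyColumn]), type text)
             then "ThisEvent"
             else "AnotherEvent")
in
    Tagged
```

The row with "abc" gets "ThisEvent", the row with 42 gets "AnotherEvent".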

My advanced type detection function will identify and assign different data types. It tries a couple of type conversion operations; in case of success it applies the type to the individual cell/record field, and in case of failure it moves on to the next try. If none of the type conversions succeeds, it returns the original state (any). It takes your table and a list of the column names on which it should be applied as input parameters.

(Table, ListOfColumnNames) =>
let
    Table.ClassifyMixedColumnTypes.m = Table.TransformColumns(
        Table,
        List.Transform(
            ListOfColumnNames,
            each {_, each try Date.From(_) otherwise try Number.From(_) otherwise try Text.From(_) otherwise _}
        )
    )
in
    Table.ClassifyMixedColumnTypes.m

So this function will apply different data types (expand as you like/need) within one column on a row-by-row basis – which is what I’ve been looking for for quite a while 🙂
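A quick invocation sketch, assuming the function above has been saved as a query named fnClassifyMixedColumnTypes (that name and the sample table are invented for illustration):

```
let
    // Mixed-type column: a date-like string, a number-like string and plain text
    Source = Table.FromRecords({
        [Mixed = "2016-01-31"],
        [Mixed = "42"],
        [Mixed = "hello"]
    }),
    // Apply the classification to the "Mixed" column only
    Classified = fnClassifyMixedColumnTypes(Source, {"Mixed"})
in
    Classified
```

The first cell should come back as a date, the second as a number and the third as text.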

Thank you so much Bill Szysz for showing me how to use the List.Transform-trick in order to bulk-apply transformations!

If you’re as lazy as me, you could be tempted to pass “Table.ColumnNames” to the “ListOfColumnNames”-parameter – but this might slow your query down! (Guessing – not much practical experience gained yet)

A warning at the end: this is of course error-prone, as some strings like “3.2” for example are ambiguous and can get converted to a date although in the specific context they should be numbers (and vice versa). So you should check properly and, if needed, perform some additional transformations beforehand in order to guarantee the correct outcome. But at least in my case it did exactly what was needed.

Please share your thoughts/experiences with this function & stay queryious 🙂


How to analyse M-functions step-by-step?

This is a quick tutorial on how to analyse M-functions step-by-step.

Queries that have been transformed into a function (1) can be invoked in the query editor (2), and they will then return their output for you to check (3). But they will not show you how they did it. In the query editor, where you normally see the single steps and their individual output when you click on them, you will just see one line for the function and one for the invocation (4):

Step1

So how can we make them reveal all their secrets to us in the quickest way?

Yes, we could edit the code and replace every parameter that has been passed. But this can get a bit tedious, especially if you haven’t written the code yourself and don’t know where and how often the parameters are used. A very quick and safe way is to replicate the parameters within the let-expression by copy-and-pasting like this:

AnalyzeM-Functions

That way you just define them once and don’t have to care about how often they are used within the following query. And it is very quick to do as you can just copy the parameter-expression and comma-separate your inputs.
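A minimal sketch of this debugging technique, using the type-classification function from the first post above as the example (the test values are invented): instead of invoking the function, you temporarily define its parameters as the first steps of the let-expression, so every following step can be inspected in the query editor:

```
let
    // Replicated parameters – just comma-separate your test inputs here
    Table = Table.FromRecords({[Mixed = "2016-01-31"], [Mixed = "hello"]}),
    ListOfColumnNames = {"Mixed"},
    // The original function body follows unchanged and can now be stepped through
    Result = Table.TransformColumns(Table, List.Transform(ListOfColumnNames,
        each {_, each try Date.From(_) otherwise try Number.From(_) otherwise try Text.From(_) otherwise _}))
in
    Result
```

Once you are done analysing, delete the two parameter steps and put the `(Table, ListOfColumnNames) =>` header back in front of the let-expression.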

Step2

I bet you want to check this out on Chris Webb’s latest blogpost here: DynamicDateSelections.zip

Enjoy & stay queryious 🙂

Multiple replacements or translations in Power BI and Power Query

A common task when cleaning data is to perform multiple replacements at once. When dealing with “translation tables” you might have come across Chris Webb’s solution using List.Generate to deal with this. This works fine – but might deliver unexpected results if you don’t consider its implications properly. Say we add lines like shown below; then we might expect the result as highlighted in yellow:

Image2

Multiple replacements using 2 different techniques

So hey – what happened here? Our catamaran turns out as a bearamaran and the chair turns to chwater. Read more
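For context, a common way to chain such replacements (not necessarily the exact technique discussed in the post) is List.Accumulate, which applies one replacement after the other – and thereby shows exactly why order matters: each replacement sees the output of the previous one. The sample translation table is invented:

```
let
    // Translation table: each pair is {old, new}
    Translations = {{"cat", "bear"}, {"air", "water"}},
    ReplaceAll = (text as text) =>
        // Every pair is applied to the result of the previous replacement,
        // which is how "chair" can end up as "chwater"
        List.Accumulate(Translations, text,
            (state, pair) => Text.Replace(state, pair{0}, pair{1})),
    Result = ReplaceAll("catamaran")
in
    Result   // "bearamaran" – "cat" was replaced before "air" could ever match
```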

How to hack yourself in Power BI (and Power Pivot?)

Reading Gerhard Brueckl’s post on how to visualize SSAS calculation dependencies reminded me of my post about a similar technique from December last year.

His solution has features that my version would benefit from as well:

  1. Directly connect to the model to be analysed, without a clumsy export of measures to txt via DAX Studio
  2. Include calculated columns? No: who does calculated columns in DAX in Power BI anyway? Do them in the query editor using M instead (more functions, better compression, easier migration of the model to SSAS once needed)

So wouldn’t it be cool if we could just add a documentation page to our current model – “all in one” so to speak? Here you find how to “hack”-connect with Excel to your current Power BI Desktop-Model.

So what works with Excel should work with PBI as well – we just need to connect via the query editor, using M. And of course: as we’re hacking ourselves here (i.e. the file we’re currently working on), we need to save our changes in order to make them show up.

Read more

PowerBI Best Practice: Views for non-database data

Marco Russo has shared a bunch of very valuable PowerBI best-practice tricks for designing your PowerBI model here: http://www.sqlbi.com/articles/data-import-best-practices-in-power-bi/?platform=hootsuite

One of them is never to import a database table directly, but to use a view instead as a kind of abstraction layer. It separates your business logic (which determines how the tables that will be passed into the data model should be structured) from your actual/incoming data (your source systems, whose table structure you cannot change and which might itself change over time).

This is really important in my eyes, as it is only a question of time until your underlying data changes – and the last thing you want to do is edit all data models or even reports in your (multiple and possibly decentralized) PowerBI files!

But what do you do if your data doesn’t come from databases, or you have no possibility to create views on them? Read more


How to share M-code in PowerBI and Power Query

One of M’s advantages is that you can share M-code in text form and it will run instantly on a different computer – so there is no need to actually exchange a file, provided that you pass the data on along with your code, like you can see in this example.

But it would be a bit tedious to type in all the data manually – so let’s have M do this automatically for us:

1 Automatically transfer a list into text form to share M-code

let
    qList = {"This", "is", "a", "pretty", "short", "list"},
    Source = qList,
    CoreString = Text.Combine(List.Transform(Source, each Text.From(_)), """, """),
    FullString = "= {""" & CoreString & """}"
in
    FullString
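Evaluating this query returns one text value that can be pasted straight into the formula bar of a blank query – for the sample list above it reads:

```
= {"This", "is", "a", "pretty", "short", "list"}
```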

2 Automatically transfer a table into text form to share M-code

let
    qTable = Table.PromoteHeaders(Table.FromColumns({
        {"Column1", "This", "an", "shorter"},
        {"Column2", "is", "even", "table"} })),
    Source = qTable,
    DemoteHeaders = Table.DemoteHeaders(Source),
    ListOfColumns = Table.ToColumns(DemoteHeaders),
    ConvertToTable = Table.FromList(ListOfColumns, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    CoreString = Table.AddColumn(ConvertToTable, "Custom",
        each Text.Combine(List.Transform([Column1], each Text.From(_)), """ ,""")),
    FullString = "= Table.PromoteHeaders(Table.FromColumns({ {"""
        & Text.Combine(CoreString[Custom], """}, {""")
        & """} })),"
in
    FullString
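For the sample table above, this returns the following text (note the trailing comma – the string is built to be pasted as a step inside a let-expression):

```
= Table.PromoteHeaders(Table.FromColumns({ {"Column1" ,"This" ,"an" ,"shorter"}, {"Column2" ,"is" ,"even" ,"table"} })),
```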

Read more

How to handle M-code samples

If you use Power Query or the query editor in Power BI, every step you take will be recorded and translated to M-code. You can share this code and it will run in the new environment – so no need to actually exchange a file.

1 Check out what the M-code does

In order to make this code-sharing easy for beginners, I’m sharing my code in a way that it will run instantly with the test-data provided within the code. So you just copy my code examples into the advanced editor like this (example given for Power BI):

Home -> External Data -> Get Data -> Blank Query -> Home -> Query -> Advanced Editor.

There you replace all the content in the window: select the existing content, paste your code over it and click Done.

PasteCode

That’s all – now check out how it works: on the Home tab, click Close -> Close & Apply. This will load the data to the data model and you can build your reports with it.

2 Check out how the M-code does it

Read more


Select rows that have no empty fields using Expression.Evaluate in Power BI and Power Query

This article describes the use of Expression.Evaluate as a very helpful “swiss-army-knife” method for your Power BI toolbox, as it has many more use cases than the one described below. It lets you perform repetitive tasks without writing functions and can even replace recursive operations in some cases – but that’s a topic for a later blogpost. Today we start with a simple task:

If you want to keep only those rows where none of the fields/columns are empty, you’d either apply not-null filters to every single column (which is tedious and not dynamic), or transpose, or add an index column, unpivot and filter from there (both of which might drag performance down on large tables).

Using Expression.Evaluate creates a dynamic approach that forces you neither to transform your source table nor to specify your column names directly: it takes in a text string and executes it as if you had written it manually into the formula. If you manage to let this text string be created dynamically according to the different tables you’re going to pass in, Expression.Evaluate will make sure that these individual statements are executed accordingly.

So for a table containing Column “A” and Column “B” the filter statement would look like this: Read more
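While the full solution sits behind the link, a minimal sketch of the idea might look like this (the sample table is invented; note that column names containing spaces would need to be quoted as #"…" when building the string):

```
let
    Source = Table.FromRecords({[A = 1, B = "x"], [A = null, B = "y"]}),
    // Build the filter expression as text, e.g. "each [A] <> null and [B] <> null"
    FilterText = "each " & Text.Combine(
        List.Transform(Table.ColumnNames(Source), each "[" & _ & "] <> null"),
        " and "),
    // Evaluate the text against the standard library environment
    FilterFunction = Expression.Evaluate(FilterText, #shared),
    Filtered = Table.SelectRows(Source, FilterFunction)
in
    Filtered   // keeps only the rows where no field is null
```

Passing #shared as the second argument gives the evaluated expression access to the standard library, which is the usual pattern when the generated text calls built-in functions.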