Advanced type detection in Power BI and Power Query: Table.ClassifyMixedColumnTypes

This is not a proper blogpost, just a quick share of a function I’ve created today which I think will be very useful for others as well:

Automatic type detection will assign only one type to a column, but sometimes there are multiple types in it and then the type “any” will be applied, which means no type at all. In these cases, often the type of the individual elements/rows in those columns are the key to the necessary ETL-transformation operations to be performed. So you would like to be able to write statements like this:

Table.AddColumn(PreviousStep, “Event”, each if Type.Is(Value.Type([MyColumn]), type text) then ThisEvent else AnotherEvent)

(reading: Add a column with a new Event that depends on if the type of the current row in MyColumn is text: then do ThisEvent else do AnotherEvent)

My advanced type detection function will identify and allocate different data types. It will try to apply a couple of type conversion operations and in case of success, apply the type to the individual cell/record field and in case of failure move on with the next tries. If none of the type conversions succeeds, it will return the original state (any). It takes the name of your table and a list of the column names on which it should be applied as the input-parameters.

(Table, ListOfColumnNames) =>
let
Table.ClassifyMixedColumnTypes.m = Table.TransformColumns(Table.TransformColumnTypes(Table, List.Transform(ListOfColumnNames, each {_, type text})), List.Transform(ListOfColumnNames, each {_ , each try Number.From(_) otherwise try Date.From(_) otherwise  try Text.From(_) otherwise _}))
in
Table.ClassifyMixedColumnTypes.m

So this function will apply different data types (expand as you like/need) within one column on a row-by-row basis – which is what I’ve been looking for quite a while 🙂

Thank you so much Bill Szysz for showing me how to use the List.Transform-trick in order to bulk-apply transformations!

If you’re as lazy as me, you could be tempted to pass “Table.ColumnNames” to the “ListOfColumnNames”-parameter – but this might slow your query down! (Guessing – not much practical experience gained yet)

A warning at the end, that this is of course error-prone – as some strings like “3.2” for example are not unambiguous and can get converted as a date although in the specific context should be numbers (and vice versa). So you should check properly and – if needed – perform some additional transformations before in order to guarantee the correct outcome. But at least in my case it did exactly what was needed.

Please share your thoughts/experiences with this function & stay queryious 🙂