Transforming multiple columns at once with reference to existing column in Power Query

In this article I’m going to present a method about transforming multiple columns at once in a fast way. This method also allows you to reference columns that exist in your table already. As I have described in a previous article, this cannot be done using the native Table.TransformColumns function that will be applied if you do column transformations using the UI in Power Query. The function I am sharing here allows you to enter a list of column names to be transformed and a function that defines the transformation itself. So you have to be familiar with defining custom functions to use this approach.

Failed attempts

A very easy way to do such a task is to unpivot all columns that shall be transformed, do the transformation (once) and then pivot back. But this will only work with small datasets or datasets with little transformation in a performant way.
So my next approach to was to apply the Table.ReplaceValue-method from the before mentioned article and use it in a List.Accumulate function to be recursively applied to all affected columns. But that turned out to be very slow as well. At the end, I used a “drop and recreate”-approach that runs much faster:

Function for transforming multiple columns at once


let func =
// fnTransformMultipleColumns
let
fnTransformMultipleColumns = (
SourceTable as table,
FieldList,
TransformationFunction,
optional FunctionArgumentsList,
optional newType
) =>
let
FunctionArguments = if FunctionArgumentsList = null then {} else FunctionArgumentsList,
Source = Table.AddColumn(
SourceTable,
"**tempColumn**",
each Record.FromList(
List.Transform(
FieldList,
(l) =>
Function.Invoke(
TransformationFunction,
{Record.Field(_, l)} & {_} & FunctionArguments
)
),
FieldList
)
),
RemoveOldColumns = Table.RemoveColumns(Source, FieldList),
ExpandNewColumns = Table.ExpandRecordColumn(
RemoveOldColumns,
"**tempColumn**",
FieldList
),
RestoreType = Value.ReplaceType(
Table.ReorderColumns(ExpandNewColumns, Table.ColumnNames(SourceTable)),
Value.Type(SourceTable)
),
ApplyNewTypes = Table.TransformColumnTypes(
ExpandNewColumns,
List.Transform(FieldList, each {_, newType})
),
Result =
if Value.Type(newType) = type type then
ApplyNewTypes
else if newType = null or newType = true then
RestoreType
else
ExpandNewColumns
in
Result
in
fnTransformMultipleColumns ,
documentation = [
Documentation.Name = " Table_TransformMultipleColumns ",
Documentation.Description = " Transforms multiple columns of a <code>SourceTable</code> using the specified <code>TransformationFunction</code>. ",
Documentation.LongDescription = " Transforms multiple columns (defined in <code>FieldList</code>) of a <code>SourceTable</code> using the specified <code>TransformationFunction</code>. Optional parameters allow to pass addional function arguments and define a new type. ",
Documentation.Category = " Table ",
Documentation.Source = " www.TheBIcountant.com ",
Documentation.Version = " 1.0 ",
Documentation.Author = " Imke Feldmann ",
Documentation.Examples = {[Description = " ",
Code = " let
SourceTable = #table( type table [Product = Text.Type, Rate = Number.Type, Sales = Int64.Type, CoS = Int64.Type],
// Product| Rate| Sales| CoS|
{//——–|—–|——|—-|
{ ""A"", 1.2, 100, 40 },
{ ""B"", 0.9, 200, 70 } } ),
FieldList = {""Sales"", ""CoS""},
TransformationFunction = (currentItem, _) => currentItem * _[Rate],
FunctionCall = Table_TransformMultipleColumns(SourceTable, FieldList, TransformationFunction)
in
FunctionCall ",
Result = " #table( type table [Product = Text.Type, Rate = Number.Type, Sales = Int64.Type, CoS = Int64.Type],
// Product| Rate| Sales| CoS|
{//——–|—–|——|—-|
{ ""A"", 1.2, 120, 48 },
{ ""B"", 0.9, 180, 63 } } )
"]}]
in
Value.ReplaceType(func, Value.ReplaceMetadata(Value.Type(func), documentation))

Function parameters

  • SourceTable: Is the table you want to apply this function on
  • FieldList: Is a list of column names that shall be transformed
  • TransformationFunction: The function that shall be applied to the columns from FieldList. This function can have as many parameters as you need, but the first the first 2 parameters are mandatory and must hold these 2 items: The first parameter must represent the item to be transformed (one of the columns from FieldList) and the second parameter must represent the current record/row. From there on you are free to add further parameters that you want to use in your function. (Example see next paragraph)
  • optional FunctionArgumentsList: If your TransformationFunction uses additional parameters, you must pass them in here as a list.
  • optional newType: By default, the function will automatically restore the same types for the transformed column that they had before. But if the column transformation shall also change the type, you can define it here. Just specify a type.

Example

Starting from a table that has Sales and CoS, these columns shall be multiplied by the values in column “Rate”:

Multiple columns shall be transformed with a reference to an existing column

So for the first function parameter, you reference this table. For the second function parameter, you would pass in this list of column names: {“Sales”, “CoS”}.

The TransformationFunction for the 3rd argument for this will look like so:

(currentItem, _) => currentItem * _[Rate]

It has 2 arguments: “currentItem” stands for the columns that shall be transformed: “Sales” and “CoS” and the underscore “_” stands for the current record. So every column to be transformed will be multiplied by the value from column “Rate” within the current row.

I you paste the function code from the GitHub sample above into the advanced editor window of a blank query, you will see a sample code for this function that you can paste into a new query and explore all function arguments in separate steps.

Function sample code

Enjoy & stay queryious 😉

Comments (12) Write a comment

  1. Will this preserve query folding?
    I see many cases Table.AddColumn approach kept the query folding, but others dont.

    Reply

    • Hi Tommy,
      I doubt it will.
      Currently only using it on csv files and haven’t tested on SQL Server.
      /Imke

      Reply

  2. Looks really good, I made some alterations to make it behave more like Table.TransformColumns, except with access to the entire record for each row. Have a look at the below (apologies if formatting gets screwed up). Thanks!

    let
    func = // fnTransformMultipleColumns

    let
    fnTransformMultipleColumns = ( SourceTable as table, transformOperations as list ) as table =>
    let
    Transformations
    = // will accept a single transformation or a list of transformations, this standardizes the format

    if transformOperations{0} is list then
    transformOperations
    else
    { transformOperations },
    ColumnsToTransform = List.Transform ( Transformations, each _{0} ), // list of column names to replace
    TempColumnName = Text.NewGuid(), // unique name for new temporary column
    TableWithTempColumn = Table.AddColumn (
    // creates a new temporary column that holds a record of all new column values
    SourceTable,
    TempColumnName,
    each Record.Combine (
    // record of all new column values
    List.Transform (
    Transformations,
    ( Transformation as list ) as record =>
    Record.AddField (
    [],
    Transformation{0},
    Transformation{1}( _ )
    )
    )
    ),
    Type.Union (
    // types of new columns
    List.Transform (
    Transformations,
    each Type.ForRecord (
    Record.AddField (
    [],
    _{0},
    [ Type = try _{2} otherwise Any.Type, Optional = false ]
    ),
    false
    )
    )
    )
    ),
    TableWithoutOldColumns = Table.RemoveColumns (
    // removes old columns
    TableWithTempColumn,
    ColumnsToTransform,
    MissingField.Ignore
    ),
    NewTableUnsorted = Table.ExpandRecordColumn (
    // expands out new columns
    TableWithoutOldColumns,
    TempColumnName,
    ColumnsToTransform
    ),
    Result = Table.ReorderColumns (
    // sorts table in original sort order
    NewTableUnsorted,
    Table.ColumnNames ( SourceTable )
    )
    in
    Result
    in
    fnTransformMultipleColumns,
    documentation = [
    Documentation.Name = " Table_TransformMultipleColumns ",
    Documentation.Description
    = "Applies transformation(s) of the form { column, transform, type }.",
    Documentation.LongDescription
    = "Returns a table from the input <code>table</code> by applying the transform operation to the column specified in the parameter <code>transformOperations</code> (where format is { column name, transformation, optional type }). If the column doesn't exist, a new column with that name will be created.",
    Documentation.Category = " Table ",
    Documentation.Source = " www.TheBIcountant.com ",
    Documentation.Version = " 1.1 ",
    Documentation.Author = " Imke Feldmann, modified by Alex Groberman "
    ]

    in
    Value.ReplaceType ( func, Value.ReplaceMetadata ( Value.Type ( func ), documentation ) )

    Reply

  3. Pingback: Multi-Column Transformations in Power Query – Curated SQL

  4. You can do this in place (without creating new columns and deleting old ones) using Record.TransformFields.

    Here’s the basic approach leaving out types and optional arguments:

    let
    fnTransformMultipleColumns = (
    SourceTable as table,
    FieldList,
    TransformationFunction
    ) =>
    let
    Source = Table.FromRecords(
    Table.TransformRows(
    SourceTable,
    (row) =>
    Record.TransformFields(
    row,
    List.Transform(
    FieldList,
    (field) => {field, each TransformationFunction(Record.Field(row, field), row)}
    )
    )
    )
    )
    in
    Source
    in
    fnTransformMultipleColumns

    Reply

    • Hi Alexis,
      that’s also a nice approach, but I very much doubt that it will be faster than my formula.
      But if you show me that this is actually faster, I will swap the code (giving credits to you, of course) 🙂
      Thanks and cheers,
      Imke

      Reply

      • Yeah, I don’t think it’s faster. I’m curious about your intuition though. What is your basic reasoning about why one would be faster than the other?

        Reply

        • The Table.FromXXX have been slow for me so far, although I mostly use .FromRows or .FromColumns.
          Basically splitting up the table into its components and then re-assembling it.

          Reply

  5. At risk of overdoing it, I’ve cleaned up my above code one more time, I think the below should run optimally:

    let
    func = ( SourceTable as table, transformOperations as list ) as table =>
    let
    Transformations = if transformOperations{0} is list then transformOperations else { transformOperations },
    TransformNames = List.Transform ( Transformations, each _{0} ),
    TransformFunctions = List.Transform ( Transformations, each _{1} ),
    TransformTypes = List.Transform ( Transformations, each [ Type = _{2}? ?? type any, Optional = false ] ),
    TempColumnType = Type.ForRecord ( Record.FromList ( TransformTypes, TransformNames ), false ),
    TempColumnName = "1cb70c9c-8c2e-40ea-b04c-b741ab1bee64",
    TableWithTempColumn = Table.AddColumn (
    SourceTable,
    TempColumnName,
    each Record.FromList ( List.Transform ( TransformFunctions, ( Function as function ) as any => Function ( _ ) ), TransformNames ),
    TempColumnType
    ),
    TableWithoutOldColumns = Table.RemoveColumns ( TableWithTempColumn, TransformNames, MissingField.Ignore ),
    NewTableUnsorted = Table.ExpandRecordColumn ( TableWithoutOldColumns, TempColumnName, TransformNames ),
    Result = Table.ReorderColumns ( NewTableUnsorted, Table.ColumnNames ( SourceTable ) )
    in
    Result,
    documentation = [
    Documentation.Name = " Table_TransformMultipleColumns ",
    Documentation.Description = "Applies transformation(s) of the form { column, transform, type }.",
    Documentation.LongDescription
    = "Returns a table from the input
    table by applying the transform operation to the column specified in the parameter transformOperations (where format is { column name, transformation, optional type }). If the column doesn't exist, a new column with that name will be created.",
    Documentation.Category = " Table ",
    Documentation.Source = " www.TheBIcountant.com ",
    Documentation.Version = " 1.2 ",
    Documentation.Author = " Imke Feldmann, modified by Alex Groberman "
    ]
    in
    Value.ReplaceType ( func, Value.ReplaceMetadata ( Value.Type ( func ), documentation ) )

    Reply

  6. Ok, last one,but THIS ONE FOLDS!

    let
    func = ( SourceTable as table, transformOperations as list ) as table =>
    let
    Transformations = if transformOperations{0} is list then transformOperations else { transformOperations },
    TempPrefix = “1cb70c9c-8c2e-40ea-b04c-b741ab1bee64″,
    ColumnNames = List.Transform ( Transformations, each _{0} ),
    PrefixedColumnNames = List.Transform ( ColumnNames, each TempPrefix & _ ),
    CreateNewColumns = List.Accumulate (
    Transformations,
    SourceTable,
    ( CurrentTable, CurrentColumn ) =>
    Table.AddColumn ( CurrentTable, TempPrefix & CurrentColumn{0}, each CurrentColumn{1}( _ ), CurrentColumn{2}? ?? type any )
    ),
    RemoveOldColumns = Table.RemoveColumns ( CreateNewColumns, ColumnNames, MissingField.Ignore ),
    RenameNewColumns = Table.RenameColumns ( RemoveOldColumns, List.Zip ( { PrefixedColumnNames, ColumnNames } ) ),
    Result = Table.ReorderColumns ( RenameNewColumns, Table.ColumnNames ( SourceTable ) )
    in
    Result,
    documentation = [
    Documentation.Name = ” Table_TransformMultipleColumns “,
    Documentation.Description = “Applies transformation(s) of the form { column, transform, type }.”,
    Documentation.LongDescription
    = “Returns a table from the input table by applying the transform operation to the column specified in the parameter transformOperations (where format is { column name, transformation, optional type }). If the column doesn’t exist, a new column with that name will be created.”,
    Documentation.Category = ” Table “,
    Documentation.Source = ” http://www.TheBIcountant.com “,
    Documentation.Version = ” 1.3 “,
    Documentation.Author = ” Imke Feldmann, modified by Alex Groberman ”
    ]
    in
    Value.ReplaceType ( func, Value.ReplaceMetadata ( Value.Type ( func ), documentation ) )

    Reply

    • Here’s another surprise, when testing against an Oracle DB, this custom function actually folds even when an identical call using the built-in Table.TransformColumns does not fold. Pretty nifty!

      Reply

Leave a Reply