Trimming text with custom characters in Power BI and Power Query

When cleaning dirty data, you might have used the Trim-function (Text.TrimStart or Text.TrimEnd) to delete leading or trailing whitespace from your strings. However, did you know that you could use these functions as well to delete any other characters as well from the start or end of a string? Trimming text with custom characters is pretty straightforward:

Task: Trimming text with custom characters

Say you have a column with values like so

Trimming text with custom characters

and want to delete every number at the end and also every “-” that is directly connected with a number. So that the final output shall look like so:

Trim custom characters at the end of a string.

Optional parameter

By default, you feed just one argument into the Text.TrimStart or Text.TrimEnd function: The string whose whitespace characters shall be removed.

Text.TrimEnd(text as nullable text, optional trim as any) as nullable text

But the second argument lets you define a list of your own characters to be removed respectively. So I can create a list with all the characters that shall be removed from the end like so:

{"0".."9"} & {"-"}

This concatenates 2 lists: The first list contains 10 elements: All numbers as strings. The second list has just one element in it: “-“. I have to put this element into a list as well for being able to use the ampersand (“&”) as an easy concatenator here.

So the full expression for the “Add custom column” dialogue looks like so:

Text.TrimEnd( [MyColumnName], {"0".."9"} & {"-"} )

To see this in action, you can simply paste this code into the advanced editor and follow the steps:

Enjoy and stay queryious 😉

Date.Networkdays function for Power Query and Power BI

Today I’m going to share my custom NETWORKDAYS function for Power Query with you that uses the same syntax than its Excel-equivalent.

NETWORKDAYS function

This function’s first 3 parameters work just like the Excel function and there is a 4th parameter that allows adjusting the day on which the week shall start:

  1. Start as date
  2. End as date
  3. optional holidays as list of dates
  4. optional number to change the start of the week from Monday (default: 1) to any other day (2 would mean that the week starts on Tuesday instead)

The function comes with a UI that lets you first choose a table containing the holidays and then choose the column with the holiday dates. Read more

Dynamic Benford’s Law measures in Power BI and Power Pivot

Benford’s Law compares the frequency distribution of leading digits to its (empirically proven) natural counterpart. This can then be used to detect fraud and errors.

Dynamic Benford's Law measures in Power BI and Power Pivot

Comparison between Benford distribution and actual

The green columns show how often each number should be the first digit in numbers that should follow the Benford-distribution. In black you’ll see the actual distribution of first digits within my table. Lastly, the red line shows the percentual absolute deviations between actual and Benford values.

In this example, there is a relatively high occurrence of numbers starting with 4 and 5. So this could be a sign for fraudulent manipulations.

The Benford Distribution

First you need a table with the Benford-distribution. Just load it as a disconnected table to your data model and name it “BenfordTable”. The “Value”-field from this table will be taken as x-axis for the visualisations. As the Benford-distribution is logarithmic, it can quickly be created with the following DAX-code:

Benford’s law Measure

The measure calculates how often a number starts with one of the BenfordNumbers (1..9) compared to the total number of rows in the FactTable.

If there are blank in the Value-column of the table to be analyzed, the measure has to be adjusted by filtering them out in the VAR CountTotal: ( CALCULATE(COUNTROWS(FactTable), FactTable[Value] <> BLANK()) )

Please note that you can create as many measures as you need in one model. So if you have multiple columns to investigate, just write a measure for each.

Benford’s law Variance Measure

To calculate the difference between target and actual, I use a MAXX-aggregation. This returns the maximum difference there is for a number. I also use this in a card visual if I want to add a data driven alert. So I don’t have to check and eyeball the chart regularly, but can just use this in a card visual in a dashboard. Then I’ll set a threshold value for the alert and will not miss any alarming developments.

 

Why DAX?

I’m using measures here instead of a calculated column (in the Benford-Table) because this allows me to filter and slice my table. This allows for making advanced and flexible analysis like comparing different values against each other or over time against the Benford distribution:

Dynamic Benford's Law measures in Power BI and Power Pivot

Benford’s Law Charts: Comparisons with various dimensions

 

Enjoy & stay queryious 😉

Tidy up Power BI models with the Power BI Cleaner tool

Edit 23th June 2020: Updated version (see download link at the end of the article) to reflect changes with the vpax fileformat in DAX Studio versions V11.0 upwards.

The VertiPaq-Analyzer tool is one of the great community tools that I really cannot live without. It gives me a great overview of all elements in my model and identifies potential performance problems by showing the storage requirements of each column. So when seeing expensive columns, the first question that arises is: “Do I really need this column or could I delete it?”. Luckily, this can now be answered with my new Power BI Cleaner tool. This tool shows the usage of all columns (and measures) within the tables of the VertiPaq Analyzer.

Power BI Cleaner shows unused columns in the VertiPaq-tables

Power BI Cleaner tool

So whenever there is no entry in the column “Where Used” you can go ahead and eliminate the column (or measure) from the model. Well – with two exception actually: Fields used in the definition of incremental load policies are currently not identified as well as used fields in calculation groups. So make sure to consider this before running wild 😉 Read more

Advanced transformations on multiple columns at once in Power BI and Power Query

You can apply simple transformations to multiple columns at once in Power Query using the UI only. In this article I show how you can apply advanced transformations on multiple columns at once instead. You can also use this to use custom functions instead. And lastly for the lazyefficient fans of custom M-functions: You will get a new “TranformAllMyColumnsAtOnceHowILikeIt”-function as well 😉

Background

The Transform-tab in the query editor is sensitive to the columns you select. So if you select multiple number columns for example, some number transformations will be greyed out and are therefore not accessible:

Some symbols are greyed out, Advanced transformations on multiple columns at once in Power BI and Power Query, Power BI, Power Query, Power BI Desktop

Some symbols are greyed out

So how could I then multiply all my columns by 10 for example, as the symbol for multiplication is greyed out? Read more

Dynamically create types from text with Type.FromText in Power Query and Power BI

In this article I’ll show you how to create types from text in Power Query, enabling you to dynamically change types via functions for example. It’ll come out as a custom Type.FromText function which has been asked for in the comments of this article: https://www.thebiccountant.com/2017/01/09/dynamic-bulk-type-transformation-in-power-query-power-bi-and-m.

Problem

To transform a column to type text can be done like so:

Table.TransformColumnTypes(Source,{{"Column1", type text}})

This transforms the “Column1” from table “Source” to type text.  Now, if you want to make the type dynamic and move it to a function parameter like so:

(VariableType as type) =>

Table.TransformColumnTypes(Source,{{"Column1", VariableType}})

This returns a function dialogue as follows:

Read more