Dynamic Benford’s Law measures in Power BI and Power Pivot

Benford’s Law compares the frequency distribution of leading digits to its (empirically proven) natural counterpart. This can then be used to detect fraud and errors.

Dynamic Benford's Law measures in Power BI and Power Pivot

Comparison between Benford distribution and actual

The green columns show how often each number should be the first digit in numbers that should follow the Benford-distribution. In black you’ll see the actual distribution of first digits within my table. Lastly, the red line shows the percentual absolute deviations between actual and Benford values.

In this example, there is a relatively high occurrence of numbers starting with 4 and 5. So this could be a sign for fraudulent manipulations.

The Benford Distribution

First you need a table with the Benford-distribution. Just load it as a disconnected table to your data model and name it “BenfordTable”. The “Value”-field from this table will be taken as x-axis for the visualisations. As the Benford-distribution is logarithmic, it can quickly be created with the following DAX-code:

Benford’s law Measure

The measure calculates how often a number starts with one of the BenfordNumbers (1..9) compared to the total number of rows in the FactTable.

If there are blank in the Value-column of the table to be analyzed, the measure has to be adjusted by filtering them out in the VAR CountTotal: ( CALCULATE(COUNTROWS(FactTable), FactTable[Value] <> BLANK()) )

Please note that you can create as many measures as you need in one model. So if you have multiple columns to investigate, just write a measure for each.

Benford’s law Variance Measure

To calculate the difference between target and actual, I use a MAXX-aggregation. This returns the maximum difference there is for a number. I also use this in a card visual if I want to add a data driven alert. So I don’t have to check and eyeball the chart regularly, but can just use this in a card visual in a dashboard. Then I’ll set a threshold value for the alert and will not miss any alarming developments.

 

Why DAX?

I’m using measures here instead of a calculated column (in the Benford-Table) because this allows me to filter and slice my table. This allows for making advanced and flexible analysis like comparing different values against each other or over time against the Benford distribution:

Dynamic Benford's Law measures in Power BI and Power Pivot

Benford’s Law Charts: Comparisons with various dimensions

 

Enjoy & stay queryious 😉

How to use regular charts on dynamic pivot tables in Excel

There are many good reasons not to use Pivot charts as described in this article i.e. – one I would like to add is the bug when using “invert if negative”: The inverted colour will disappear once you refresh your data.

If you use Power Query to create the pivot for your chart, a regular chart will be created on it: Gone are the restrictions that come with the Pivot-Chart and it will also adjust dynamically to new or deleted rows and/or columns.

How to create your regular charts on dynamic pivot tables:

Read more

Visualize query dependencies in Power BI with Sankey diagram

Edit 7th August 2016: You’ll find an easier “OneInAll”-approach in the PowerBI forum here. Thanks to David Moss for reporting an issue with the solution below.

PBI-file with new query: NewAllInOneCode.zip,

How to use:

Old post:

This week we got a new Sankey diagram. It has some advantages when visualizing dependencies between items/nodes, one being not to cut the labels. Time to share a tool that I use a lot when working with complex M-queries (in Power Query or Power BI). It shows how the queries are connected with each other:

Image1

Sankey diagram default

Left stand the queries who fetch the source data, parameters and/or user defined filters, then you see how they get transformed until to the right you see the queries that deliver your end-results. Especially when working with codes that has been written by someone else I find it very useful to get a quick oversight of what’s happening – as an excellent complement to the documentations 🙂

You can also move the nodes around to improve readability:

Image2

Sankey diagram with adjusted nodes

Read more

Happy Spreadsheet Day! (Or how to escape Excel-hell)

When reading horror-stories about Excel-hell describing how dangerous it is to use Excel in corporate environments, I cannot help but to think of this hilarious video describing the fatal consequences of acting without common sense: Just don’t do stupid things with it.

Although Excel comes nearly for free (in relation to what value it delivers) this doesn’t mean that you don’t need to invest in applying proper techniques (like in any other profession). There’s training and best practices for every different need.

But the best thing about Excel seems largely unknown still: Since the invention of Power Query it has never been easier to be save around Excel than before: A magic tool that can solve many of the problems that cause Excel hell: Repetitive tasks: The little adjustments and extensions that pile up when you use your workbook again and again and are often performed without realizing the (meanwhile complex) context of all the standard-Excel-elements involved: Power Query will prevent this mess. It will help you organize and automize your repetitive tasks in Excel.

Read more

Report Design with Power Query (1): Cascading Time Granularities

Cascading Time Granularities

When the client is not happy with the pivot report layout options on cubes in Excel, my usual reaction is “OK no problem, then we’re going to use cubefunctions instead”. This went well until recently: My client told me that he wanted a report like above: Years totals first, followed by quarters & months totals, but with growing number of years! All nicely close together, no gaps. Normally I don’t mind cubereports with dynamic table length (will blog about this later), but this is basically 3 dynamic reports under each other (yes they are still alive and will probably stay 😉 ):

  Screen_3Years

So it was time to test what Chris Webb has talked about in this blogpost: Using Power Query as a report authoring tool in Excel. The idea was to use Power Query’s append-function to attaching 3 different report: Year, Quarter, Month. They could then have different lengths, the append-operation would seamlessly stitch them together. The connection to the cubedata was easy enough, thanks to his brilliant step-by-step guide. But then there were some challenges to solve:

Read more