Memory efficient clustered running total in Power BI

Today I want to share a scenario where a running total calculation in the query editor saved a model that run out of memory when done with DAX:

Problem

The model couldn’t be refreshed and returned out of memory error with a calculated column in the fact table of over 20 Mio rows (from a csv-file). A running total should be calculated for each “JourneyID”, of which there were over 1 Mio in the table itself. This rose memory consumption during refresh by over 300 % – until it finally errored out:

Besetzung =
CALCULATE (
    SUM ( Fact[Entries] )
– SUM ( Fact[Exits] );
    FILTER (
        ALLEXCEPT ( Fact; Fact[JourneyID] );
Fact[StopId]
<= EARLIER ( Fact[StopId] )
    )
)

Solution

In the query editor, I grouped the fact-table by “JourneyID” and choose “All Rows”:

Group in the query editor to effectively partition your Fact-Table

Due to the fact that the fact table was sorted by the “JourneyID”, I could use GroupKind.Local and manually tweaked the resulting code like so :

Table.Group(#"Changed Type2", {"JourneyID"}, {{"All", each _, type table}}, GroupKind.Local)

Without this option, the calculation would have also errored out.

This effectively created table partitions, with one row for each “JourneyID”:

One table per JourneyID

I then added a custom column where I used my superfast M-code for running totals, referencing the column “All” with the partition in it:

Table.AddColumn(#"Grouped Rows", "Custom", each fnRT([All], "JourneyID", “Movements”))

Function Code

This M-code for running totals is amazingly fast (within M-dimensions 😉 ) and particularly plays its strengths on large sections for aggregation.

Performance

Memory consumption rose from around 1,700 MB to 2,000 MB, all very stable. Of course, refresh duration rose (by 150 % for that table). But for that use case, it was an acceptable price to pay.

Enjoy & stay queryious 😉

Right Aligning Text in Power BI: Format Improvements for Easy Profit&Loss Reports

Edit 14-Dec-2017: Now that we can right-align text measures in PowerBI, a SWITCH-measure like here: http://www.thebiccountant.com/2017/04/24/kpis-in-easy-profit-and-loss-for-powerbi/#comment-719 is the best alternative in my eyes. No need to read on 🙂

As shown in my last part of the Easy P&L-series, Power BI unfortunately still lacks some fundamental formatting options like:

  1. Right aligning text (please vote for it here: Right align text in Power BI – edit 15th Nov: We’re there: Right aligning text is available now: https://powerbi.microsoft.com/en-us/blog/power-bi-desktop-november-2017-feature-summary/)
  2. Display numbers in different formats within one column (either to be implemented as a “neutral” format for Switch-measures, where the referenced measures carry the formatting attributes already, or as a part of a formula-based conditional formatting) (Thanks Matt for the voting-link: Conditional format SWITCH measure)

So for the moment I choose between the following workaround-options:

  1. Display %-values in a separate column
  2. Format numbers as text and fill up with spaces so that all end up right aligned
  3. See the suggestion from Matt Allington in the comments below (very nice)

Right aligning text or percentage figure in new column

For both options the preparations are the same:

Read more

How to open a complex JSON record in Power BI and Power Query

Today I’ll show you a very useful technique how to deal with a JSON record that contains a wild mixture of different elements like this:

If you click on one of the expandable elements, their content will be shown, but you’ll loose all the “surrounding” information (metadata) that is visible now. This is often an issue, regardless if you want to create multiple tables from it to build a star-schema or just need a handful of fields or a denormalized table. But with a little help from M, you’re good to go:

Table.FromRecords( { MyJsonRecord } )

Will returns this:

With this move, every expansion of one of the expandable elements will keep the existing data in place:

Create one big flat table

Simply expand one element after each other to create a denormalized table

Create star schema

For multiple tables, keep this query and reference it to create you (sub-)tables. Always keep the Id-column as the key (!) to combine all the tables in your data model later. (Provided you use this in a function for multiple entities/series)

Best is to play with it, so just past this code into the advanced editor:

 

If your JSON-record has a different structure with “just” header and data in different fields, this technique will be more suitable for you: http://www.thebiccountant.com/2016/04/23/universal-json-opener-for-quandl/

Enjoy & stay queryious 🙂

Bill of Material (BOM) Explosion Part2: Costing in Excel and PowerBI

Following up on the BOM-explosion: A comment reminded me that I had missed to present the costing techniques to calculate the total costs of each (sub)-product.

Reversing the aggregation direction

What I had shown is how to “aggregate” from parent down to child-level to retrieve the total quantity of each component within a BOM (“How many of each (sub-) components do we have to order (or build) for that bike?”) (1).

Now we reverse the aggregation direction and aggregate the total (!) quantities back up to the parents (2).

And, as this doesn’t make too much sense in an economical way, the second aggregation will be their prices (3). This will give us the sum of all part-costs (“How much will the order of all the parts cost us?”). This is also very useful for planning purposes or reconciliation of prices for intermediate products with your master data.

And if your model holds sales-data as well, you can calculate the totals costs of your total sales within each period. (4)

VAR 1: Using classical hierarchies

Read more