How to compare everything with everything using Power BI or Power Query

 

A question in Mr. Excel-Forum about how to design your data model if you want to compare values from 10 different (fact-) tables with each other got me curious: As this simply felt so wrong… and where there are 10 tables today, there will be probably be 12 tables next …?

Time for Power Query to play out its strengths on dynamic approaches then: Key is to create one consolidated table from all the different input-tables with an additional column that contains the name of each source table.

If your data sits in Excel tables, you can find the description on how this works in the new Power Query book from Ken Puls and Miguel Escobar, starting on page 47 (key is to keep the name-column!).

In my example I’m using web-data that already has the format of a consolidated table, comparing inflations rates of different countries with each other. So the data-structure looks like this:

1DataStructure

You then start a new query that refers to your source data and merge it with the source again on the year-column. The year-column shall stay the same, as we don’t want to compare different years against each other, but only different countries (Sources).

This will actually create a crossjoin of all countries against the country in the respective row – but within the same year. Last step is to create an additional column that calculates the difference between the values. This tells you how the inflation rate of the country to compare against is in relation to the source-country:

2Table

& this is the code:

let
Source = SourceData,
SelfMerge = Table.NestedJoin(Source,{"Year"},SourceData,{"Year"},"NewColumn",JoinKind.LeftOuter),
ExpandCrossjoin = Table.ExpandTableColumn(SelfMerge, "NewColumn", {"Source", "Year", "Value"}, {"Crossjoin.Source", "Crossjoin.Year", "Crossjoin.Value"}),
CalcDiff = Table.AddColumn(ExpandCrossjoin, "Diff", each -[Value]+[Crossjoin.Value]),
ChgFormats = Table.TransformColumnTypes(CalcDiff,{{"Value", type number}, {"Diff", type number}})
in
ChgFormats

You can filter, slice&dice as you like and create pretty fancy reports with this technique 🙂

HowToCompareEverythingWithEverything

 

OK – technically at the end it’s just a simple crossjoin – the interesting part is actually what kind of problem we have solved with it: Combine multiple tables to compare against each other. So I’ll take this as the kickoff for my series on consolidation & reconciliation techniques then, so stay tuned!

Enjoy & stay queryious 🙂

Comments (6) Write a comment

  1. Hallo Frau Feldmann,

    vielen Dank für diese wunderbare Fallstudie; durch Zeilen und Spaltensortierung nach Wert in Kombination mit bedingter Formatierung lässt sich in einem weiteren Schritt aus TableAll ganz einfach eine Heatmap zur Clusteranalyse erstellen; wirklich sehr praktisch!

    Reply

    • Thanks for this nice feedback! Do you want to share your workbook? I’d be happy to post this here.

      Reply

  2. Ich kann Ihnen die Datei gerne zur Verfügung stellen. Es ist mir aber nicht möglich, über das Kommentarfeld eine Datei hochzuladen und/oder einen Screenshot einzufügen!?

    Reply

  3. Thank you very much for the workbook – very nice example of the difference colours can make!
    It’s now integrated in the file to be downloaded in the main article.

    Reply

  4. I have a question.

    What do you do if you have missing data but consider them to be zero.

    For example:

    Dataset A
    Q1 Q2
    Pk1 345 85
    Pk2 456 506
    Pk6 58 569
    Pk9 89 304

    Dataset B
    Q1 Q2
    Pk1 349 859
    Pk2 4589 3040
    Pk3 858 595
    Pk9 99 858
    Pk10 45 889

    It should consider zeros where data does not exist or ‘PK’ is not there.

    I can manipulate my data to be like this in Excel

    For Dataset A
    Q1 Q2
    Pk1 345 85
    Pk2 456 506
    Pk3 0 0
    Pk6 58 569
    Pk9 89 304
    Pk10 0 0

    For dataset B

    Q1 Q2
    Pk1 349 859
    Pk2 4589 3040
    Pk3 858 595
    Pk6 0 0
    Pk9 99 858
    Pk10 45 889

    Now i can merge the data in Power BI, unpivot them and compare,
    but how do i get to what i showed above i did in Excel

    Thanks

    Reply

Leave a Reply