Dynamically flatten Parent-Child Hierarchies in DAX and PowerBI

If you use DAX to flatten Parent-Child hierarchies you will end up with a table that has a static number of columns (like described here). If you need a dynamic solution instead, which creates just as many level-columns as there are needed for the current data, you can use DAX’s helper-tool Power Query (or Get Data in Excel) or the query-editor in PowerBI, which uses the language M.

Another advantage of this solution is that you can script the table creation in one step (only flaw: You still need to manually adjust your hierarchy though): But it saves time in creating the table, especially if you have many levels.

2 simple steps

  1. copy the following function,
  2. add a new step to your current table where you call this function, filling in the following parameters:
    • table name (which is the name of the previous step in your M-query)
    • name of the column with the child-key
    • name of the column with the parent-key
    • name of the column who’s values shall be shown in the levels (can also be child-key)

 

Call fnFlattenPCHierarchyFunction

 

And this is the code, which you can also download below:

Downloads:

Download M-code:  Flatten PC Hierarchy M-code

Subscribers can also download the pbix-file, where this function in implemented and there’s also a query which enables you to follow-along the results of every step:

DynamicPCHierarchy.zip

Edit 2017-May-28: New files for download correcting error in “IsLeaf” (thx to FrankT in comments)

Enjoy & stay queryious 😉

 

Comments (38) Write a comment

  1. Hi Imke

    I have been looking for something like this. I’ve tried to use it by copy-paste the code, but I get error when calling the function. The error provided is something along the lines of:

    Expression.Error: Column ‘1190’ in the table was not found.

    1190 is the value in the child column. I have tried both with Number and Text. The error is consistent on every line.
    I am using power query in Excel.

    Any idea what triggered this?

    Thanks and regards
    Nikolaj

    Reply

    • Hi Nikolaj,
      this looks as if you’re trying to call the function in an “Add-column”-s step. Then the reference to the column would actually not return a reference to a whole column, but just a record field. That would explain the error message.
      Pls check out my sample-file: There step “Custom1” looks like this: = fnFlattenPCHierarchy(Source, “NodeKey”, “ParentKey”, “Name”)
      There the function is called “naked” and not within an “Table.AddColumn”-frame.
      Pls let me know if this was the reason.
      Thx & kind regards, Imke

      Reply

    • Thanks for the response Nikolaj, glad it worked.
      May I ask for which kind of task your are using this technique?

      Reply

  2. Hi Imke,

    thanks for this nice piece of code. I really enjoyed going through your formulas.

    I guess, you agree that leaves are nodes that are not parents of any other node. Thus, for instance Julie is a leaf!

    Therefore, I have replaced this slip of the pen

    = Table.AddColumn(HierarchyDepth, “IsLeaf”, each if [HierarchyDepth]=MaxBrowseDepth then true else false)

    with

    = Table.AddColumn(HierarchyDepth, “IsLeaf”, each not List.Contains(HierarchyDepth[ParentKey], [NodeKey]))

    Reply

    • Hi FrankT,
      Are You sure about this formula? When I replace Imke’s code with your part I get expresion error “Cannot find column [ParentKey] in table”.

      Reply

      • I’ve uploaded new files that incorporate Franks corrections – working fine 😉
        Thanks again FrankT!

        Reply

        • Hi Partnercenter,
          I’m very sorry – just recognized that there were still issues with “own” columnnames who caused the trouble. I’ve updated the files again (now also with blanks instead of parent dups).
          Pls let me know if you still run into trouble with it.
          Thx!

          Reply

  3. Great article, quick question I have a Manager Table with the ChildID and ParentID that is a guid D9665EF3-22D4-E611-80F8-5065F38AF871, there are 10K records, when I run the function it runs up to around 22K and then never finishes. I am wondering if the GUID is breaking the function?

    Reply

  4. Thanks for this Imke. This came in very handy recently as my team had to bypass connecting Power BI to a cube and go straight to the underlying tables in SQL. Question: is there a reason why you have the child levels that dont have children filled in with the parent name, instead of just blank?

    Reply

    • Thx David, very pleased to hear!
      No reason for that other than following one of the examples from the referenced article. If you want to keep them blank, replace the “PivotTable”-step with this code (much easier actually):
      PivotTable = Table.AddColumn(HierarchyPath, “PivotTable”, each Table.AddIndexColumn(Table.FromColumns({[PathItems]}), “Level”,1,1)),

      Reply

  5. v3 seems to have a bug. A sample of affected HierarchyPaths looks like
    1
    1|2
    2|1|3
    2|3|1|4
    3|1|2|4
    Interestingly, not all branches of the tree exhibit this behavior.

    Also, it seems to incorporate the above solution for leaving empty children blank without the way to go back to have the latest child duplicated till the latest level.

    Reply

    • Upon some investigation, there’s an assumption, in the ‘PathItems’ step, that the code for each child is greater than the code for any of its parents, so the function breaks for data where this is not true.

      Reply

      • sorry, missed your first comment.
        Busy currently & will look into this later. Thanks for the hint!!

        Reply

        • Yes, you were right, the List.Sort was causing errors, should have been List.Reverse instead. Sorry for that, I’ve updated the code.
          I’ve also adjusted the lookup for multiple parents, which was also not working correctly.
          Thanks again for reporting this error!

          Reply

          • The surprising usefulness of ignorance: might not have bothered to report had I known M enough to fix this for myself 🙂
            Is the below a good enough way to make hierarchy non-ragged?

            PivotTable = Table.AddColumn(HierarchyPath, “PivotTable”, each
            Table.AddIndexColumn(
            Table.FromColumns ( {List.Combine ({
            [PathItems], List.Repeat( {Record.Field(_,ChildKey)}, MaxBrowseDepth-[HierarchyDepth])
            })} )
            , “Level”,1,1)

          • This does look very advanced as it’s heavily nested & unfortunately doesn’t work if I paste it into my code. So would you mind sharing your full code please?

          • Thanks, got it: The closing bracket was missing:

            Table.AddColumn(HierarchyPath, “PivotTable”, each Table.AddIndexColumn( Table.FromColumns ( {List.Combine ({ [PathItems], List.Repeat( {Record.Field(_,ChildKey)}, MaxBrowseDepth-[HierarchyDepth]) })} ) , “Level”,1,1))

            Actually, this is very similar to my version from the query “fnFlattenPCHierarchyFollowAlong”:

            Table.AddColumn(HierarchyPath, “PivotTable”, each Table.AddIndexColumn( Table.FromColumns ( {List.Union ({ [PathItems], List.Repeat( {List.Last([PathItems])}, MaxBrowseDepth-[HierarchyDepth]+1)})}), “Level”,1,1))

            It works, for sure, but I still prefer my compact version:

            = Table.AddColumn(HierarchyPath, “PivotTable”, each Table.AddIndexColumn( Table.FromColumns({[PathItems]}), “Level”,1,1))

  6. Were this a Sales Department dimension hierarchy and were there sales directly attributable to managers (technically, not hierarchy leaves), the non-ragged version would seem to appeal more from the ease of joining to the fact table point of view.
    Besides function choices, one major difference I see is that you add +1 to the Repeat counter – why?

    Reply

    • You mean step “MaxBrowseDepth”? This row is redundant (leftover from a different version that I’ve forgotten to delete). You can ignore or delete it.

      Reply

  7. Ouch, I missed the comment tree! This was meant to continue the discussion of making the hierarchy non-ragged, and referred to the ‘MaxBrowseDepth-[HierarchyDepth]+1’ part.

    Reply

      • I’m referring to the fnFlattenPCHierarchyFollowAlong query you mentioned a couple of comments back

        Reply

        • You mean the step “PivotTable” there?
          = Table.AddColumn(HierarchyPath, “PivotTable”, each Table.AddIndexColumn(Table.FromColumns({List.Union({[PathItems], List.Repeat({List.Last([PathItems])}, MaxBrowseDepth-[HierarchyDepth]+1)})}), “Level”,1,1))

          Good question! This is due to the non-intuitive but very useful behaviour of List.Union: https://msdn.microsoft.com/en-us/library/mt260726.aspx
          This function is supposed to match duplicate values as part of the Union. This will be done in a very special way (this is just my own observation, haven’t seen any official documentation for it yet, so maybe there is actually a different logic behind it): For every element of the first list, it will check if there is a duplicate element in the other list. It will then eliminate the FIRST element of the list and leave the other dups in place. So if you have: List.Union({ {1, 2, 2}, {1, 1, 2} }) = the result will be: {1,2,2,1}. The first 1 of the 2nd list has been matched as a duplicate against the first 1 of the 1st list, but the 2nd 1 remains and stands at the end. The order of the elements is interesting here, as the 1 from the 2nd list comes last. So it seems to take the first list first and then adds all elements from the second list that have not been eliminated as part of the dups-removal.

          In our case we are always filling up with dups (taking the last element of list [PathItems]), of which the first one will be removed from the List.Union. Therefore adding one if it to it again (+1).

          Reply

  8. Hi Imke,
    Somewhat new to the code and query side of Power BI. I thought I had it set up correctly, but I’m getting the following error:

    Expression.Error: There weren’t enough elements in the enumeration to complete the operation.

    My table has about 90K rows of data. Any idea what could be causing this or where to start looking to solve it?
    Mike

    Reply

    • One possible reason for this error-message is duplicates in your table: The column-selection in the parameters must not have duplicates for this to work.

      Reply

  9. Hi, Imke!
    I used your function to deploy the company’s organizational structure. I have a staffing table, in which about 12 thousand lines, 30 columns, the maximum nesting of the hierarchy is 7 levels. Performing your function on my data set takes about 15 minutes. What could I have done wrong? Is it possible to speed up the implementation? Is there a Power Query debugger in which we can see the duration of execution of the query elements and the amount of used memory in each of the steps?

    Reply

    • I’d recommend take just the columns you need for the parent-child-allocation and then merge back the other (27-ish?) columns after the operation.
      Unfortunately currently there is no analysing tool like you’ve mentioned.

      Reply

  10. Thank you for your help, Imke.
    According to your advice after the some necessary transformations of the original table- Tbl1, I used it as the source of the new query Tbl2, left only 3 columns and applied the function. Execution about 3-4 minutes. Then, in a new query, Tbl3 used the Tbl1 as source and added Tbl2 by the key field. Expanding the columns of Table 2 takes about 3-4 minutes. Is it possible to improve such an algorithm for acceleration?

    Reply

    • Hi Denis,
      very sorry for this slow code – in one the latest update I missed a buffer.
      But now I’ve completely reworked it, please test with your data and report back: New Function
      It does the split of the vital columns automatically, so you just have to feed in your original table here.
      Thanks and cheers, Imke

      Reply

      • Imke, this is a great solution. Works 2-4 times faster. Thank you so much!
        P.S. for universality, it is better to remove the hard binding to the column names. Not in all places of the code used function arguments.

        Reply

Leave a Reply