Automatically expand all fields from a JSON document in Power BI and Power Query

If you work with JSON documents that are new to you, it can be very helpful to fully expand JSON to see at a glance what’s in there. The following function has you covered for this task. It returns a table with all values in column “Value” and additional columns describing where that value came from in a hierarchical form, no matter how deep the nesting goes:

Example

See below what this function does for the following JSON :

Automatically expand JSON: Sample JSON

{"A":"A Record", "B":["ListItem1", {"C":"A nested Record", "D":"Another nested Record"}], "E":{"F":["NestedListItem1","NestedListItem2","NestedListItem3"]}}

Automatically expand JSON: Result

 

  • Value: The respective values
  • Level: Main levels.
  • Sort: Sort column with hierarch to display in report
  • SortBy: Sort your “Sort”-column in the data model by this column: It will be filled up with “0” according to the maximum value within the respective position. This effectively allows sorting by number, although it is a text-field.
  • Name-columns: They display the hierarchical location of the value with regards to JSONs record field names.

The Function

 


let
func = (JSON) =>
let
Source = JSON,
ParseJSON = try Json.Document(Source) otherwise Source,
TransformForTable =
if Value.Is(ParseJSON, type record) then
Record.ToTable(ParseJSON)
else
#table(
{"Name", "Value"},
List.Zip({List.Repeat({0}, List.Count(ParseJSON)), ParseJSON})
),
AddSort = Table.Buffer(Table.AddColumn(TransformForTable, "Sort", each 0)),
LG = List.Skip(
List.Generate(
() => [Next = AddSort, Counter = 1, AddIndex = #table({"Sort"}, {{""}})],
each [AddIndex]{0}[Sort] <> "End",
each [
AddIndex = Table.AddIndexColumn([Next], "Index", 0, 1),
MergeSort = Table.CombineColumns(
Table.TransformColumnTypes(
AddIndex,
{{"Sort", type text}, {"Index", type text}},
"en-GB"
),
{"Sort", "Index"},
Combiner.CombineTextByDelimiter(".", QuoteStyle.None),
"Sort"
),
PJson = Table.TransformColumns(
MergeSort,
{{"Value", each try Json.Document(_) otherwise _}}
),
AddType = Table.AddColumn(
PJson,
"Type",
each
if Value.Is([Value], type record) then
"Record"
else if Value.Is([Value], type list) then
"List"
else if Value.Is([Value], type table) then
"Table"
else
"other"
),
AddStatus = Table.AddColumn(
AddType,
"Status",
each if [Type] = "other" then "Finished" else "Unfinished"
),
Finished = Table.SelectRows(AddStatus, each ([Status] = "Finished")),
Unfinished = Table.SelectRows(AddStatus, each ([Status] = "Unfinished")),
AddNext = Table.AddColumn(
Unfinished,
"Next",
each if [Type] = "Record" then {[Value]} else [Value]
),
RemoveCols = Table.RemoveColumns(AddNext, {"Value", "Type", "Status"}),
ExpandNext = Table.ExpandListColumn(RemoveCols, "Next"),
AddIndex2 = Table.AddIndexColumn(ExpandNext, "Index", 0, 1),
MergeSort2 = Table.CombineColumns(
Table.TransformColumnTypes(
AddIndex2,
{{"Sort", type text}, {"Index", type text}},
"en-GB"
),
{"Sort", "Index"},
Combiner.CombineTextByDelimiter(".", QuoteStyle.None),
"Sort"
),
TransformRecord = Table.TransformColumns(
MergeSort2,
{
{
"Next",
each try
Record.ToTable(_)
otherwise
try
if Value.Is(Text.From(_), type text) then
#table({"Value"}, {{_}})
else
_
otherwise
_
}
}
),
FilterOutNulls = Table.SelectRows(TransformRecord, each [Next] <> null),
Next =
if Table.IsEmpty(FilterOutNulls) then
#table({"Sort"}, {{"End"}})
else if Value.Is(FilterOutNulls[Next]{0}, type table) = true then
Table.ExpandTableColumn(
FilterOutNulls,
"Next",
{"Name", "Value"},
{"Name." & Text.From([Counter]), "Value"}
)
else
Table.RenameColumns(FilterOutNulls, {{"Next", "Value"}}),
Counter = [Counter] + 1
],
each Table.AddColumn([Finished], "Level", (x) => _[Counter] – 2)
)
),
Check = LG{2},
Combine = Table.Combine(LG),
Clean = Table.RemoveColumns(Combine, {"Status", "Type"}),
Trim = Table.TransformColumns(Clean, {{"Sort", each Text.Trim(_, "."), type text}}),
// Dynamic Padding for the sort-column so that it sorts by number in text strings
SelectSort = Table.SelectColumns(Trim, {"Sort"}),
SplitSort = Table.AddColumn(
SelectSort,
"Custom",
each List.Transform(try Text.Split([Sort], ".") otherwise {}, Number.From)
),
ToTable = Table.AddColumn(
SplitSort,
"Splitted",
each Table.AddIndexColumn(Table.FromColumns({[Custom]}), "Pos", 1, 1)
),
ExpandTable = Table.ExpandTableColumn(ToTable, "Splitted", {"Column1", "Pos"}),
GroupPos = Table.Group(
ExpandTable,
{"Pos"},
{{"All", each _, type table}, {"Max", each List.Max([Column1]), type text}}
),
Digits = Table.AddColumn(GroupPos, "Digits", each Text.Length(Text.From([Max]))),
FilteredDigits = List.Buffer(Table.SelectRows(Digits, each ([Digits] <> null))[Digits]),
SortNew = Table.AddColumn(
Trim,
"SortBy",
each Text.Combine(
List.Transform(
List.Zip({Text.Split([Sort], "."), List.Positions(Text.Split([Sort], "."))}),
each Text.PadStart(_{0}, FilteredDigits{_{1}}, "0")
),
"."
)
),
FilterNotNull = Table.SelectRows(SortNew, each ([Value] <> null)),
Reorder = Table.ReorderColumns(
FilterNotNull,
{"Value", "Level", "Sort", "SortBy"}
& List.Difference(
Table.ColumnNames(FilterNotNull),
{"Value", "Level", "Sort", "SortBy"}
)
),
Dots = Table.AddColumn(
#"Reorder",
"Dots",
each List.Select(Table.ColumnNames(#"Reorder"), (l) => Text.StartsWith(l, "Name"))
),
// This sort is just to view in the query editor. When loaded to the data model it will not be kept. Use "Sort by column" in the data model instead.
Sort = Table.Sort(Dots, {{"SortBy", Order.Ascending}})
in
Sort,
documentation = [
Documentation.Name = " Table.JsonExpandAll ",
Documentation.Description
= " Dynamically expands the <Json> Record and returns values in one column and additional columns to navigate. ",
Documentation.LongDescription
= " Dynamically expands the <Json> Record and returns values in one column and additional columns to navigate. Input can be JSON in binary format or the already parsed JSON. ",
Documentation.Category = " Table ",
Documentation.Source = " . https://wp.me/p6lgsG-Ur . ",
Documentation.Version = " 1.2: Added column [Dots] (22/02/2019)",
Documentation.Author = " Imke Feldmann: www.TheBIccountant.com . ",
Documentation.Examples = {[Description = " ", Code = " ", Result = " "]}
]
in
Value.ReplaceType(func, Value.ReplaceMetadata(Value.Type(func), documentation))

How the code works

  • The main elements in JSON are records and lists and they can appear in many different combinations. So to handle them equally, I often convert them into a common type so that they can be further processed equally (row 6, 27 and 32).
  • I use List.Generate (row 10-42) to repetitively check whether the returned values are further expandable or not. If they are not expandable, they go into “Finished” (row 25) and if they need further expanding, they land in “Unfinished” (row 26) and will be expanded further.
  • Row 48-56 contain a dynamic padding that will convert the Sort-entry “2.1” into “02.1” in column “SortBy”, if there is also a “10.x” in the column. This will make sure that the (alphabetical) sort returns “2.1” before “10.1”. (And if there is a 100.x it will convert to “002.1” and so on…)

I’m pretty sure that this code can be further improved, as it has been evolved for quite some time and I didn’t re-engineer it. So if you’re up for it, please don’t hesitate to post an improved version!

For large JSONs, the table can get very long and it could be beneficial to view it in a more compact form (actually a form that would provide tables for a relational model). I will show this in an upcoming article, so if you’re interested in it, make sure to subscribe to my blog not to miss it.

Enjoy & stay queryious 😉

Comments (33) Write a comment

  1. Pingback: Flattening JSON In Power BI – Curated SQL

    • Yes, I’m coding directly in Power Query: Either in the formula bar or the advanced editor.

      Reply

  2. Pingback: Power BI Sudoku, Custom fonts, DAX and more... (June 25, 2018) | Guy in a Cube

  3. Hello,

    long time fan – first time post.

    I am experiencing difficulties using this –

    do i need to replace in the script to reference MY json ?

    when the function is built – it won’t let me select a table to run the function – when I type my table into the function input it isn’t running.

    my json table is not expanded at all.

    the function is created – it just wont let me type in the json in the paramater to invoke the function.

    any insight is greatly appreciated.

    Reply

    • Hi Jeff,
      my function doesn’t expect a table as an input, but a JSON-format instead. That’s basically a text-file/string in JSON-format.
      So if you have transformed your JSON to a table already, you have to delete those steps and feed the raw JSON to my function.
      /Imke

      Reply

      • Hi,
        Great Work!!
        I am new to usinf functions in power BI, do you have some steps scrrenshots hoe to give the JSON input to this function (my json is from cosmos db), Thank you

        Reply

  4. I am getting results in rows rather I would prefer to be displayed in columnar structure. Is it possible to get the results like Column – Headers and Rows – values?
    eg:
    Value Name Name1
    1 data id
    aaa data name
    2 data id
    bbb data name

    Expected output:

    Id name
    1 aaa
    2 bbb

    Reply

    • Yes: You can pivot the data to achieve this.
      For this you want to add a column that cuts off the figures after the last . within the Sort-Column.

      But due to the (mostly) nested structure of JSON-files, one table with all the results might not be the ideal solution. So you might want to filter the data according to the different tables you want to create out of the data before.

      Reply

      • Hi,

        Thanks a lot 🙂

        I have tried suggested solution. It is working and able to retrieve data in columnar structure.
        But when it comes to more than one level(if nested structure), not able to achieve solution dynamically.

        eg: Name.1,Name.2,Name.3 etc…

        Any suggestions please?

        Thanks,
        Karthiga

        Reply

    • @karthiga m will you please give some screenshots how it works in your case. I am new to PBI. Thanks

      Reply

  5. Hi Imke Feldmann,

    Any help is much appreciated for the above queries.
    Thanks in advance.

    Regards,
    Karthiga

    Reply

    • Hi Karthiga,
      sorry for late response:
      You need to create a key-column that combines the values from all names-columns dynamically. Just add a columns with this code:
      Text.Combine(List.Transform([Dots], (x) => Text.From(Record.Field(_,x))), ” | “)

      Reply

      • Thanks Imke.
        But I am not quiet understanding that, what do you mean by “Dots”. Is this code to be added inside loop or outside (end)?
        Can you please help?

        Reply

        • Sorry, updated the function code on GitHub. That column is now included – please use that new function code.
          /Imke

          Reply

  6. Hi Imke,

    Yeah it worked finally. Thanks a lot 🙂
    I have added few more steps ‘ToPivot’ table data in a readable format.

    T =Table.AddColumn(#”Added Custom”,”ToPivot”,each Text.BeforeDelimiter([Sort],”.”, {2,RelativePosition.FromStart})),
    #”Removed Columns” = Table.RemoveColumns(T,{“Level”, “Sort”, “SortBy”, “Name”, “Name.1”, “Name.2”, “Name.3”, “Dots”}),
    Final = Table.Pivot(#”Removed Columns”, List.Distinct(#”Removed Columns”[Custom]), “Custom”, “Value”)

    Regards,
    Karthiga

    Reply

    • @Kathiga @Admin @ can you post your code please? I’m trying to follow it and I don’t have the #”Added Custom” step. What did you use for columnGenerator?

      Reply

    • Pretty sure that first line should read:
      = Table.AddColumn(#”Added Custom”, “ToPivot”, each Text.BeforeDelimiter([Sort], “.”, {0, RelativePosition.FromEnd}))

      Reply

  7. Hello,

    After using the function, my data is being organized like this:

    COLUMN 1 – (Names.1) COLUMN 2 – (Values)
    “ID” 1
    “Name” John Doe
    “Rating” 3
    “ID” 2
    “Name” Jane Doe
    “Rating” 4
    (etc.)

    When I pivot COLUMN 1 in Power Query, the values become columns and headers as expected, but only a single row is displayed, and the value for each row cell is a “Error”. When I click on “Error”, I get the following message:

    “Expression.Error: There were too many elements in the enumeration to complete the operation.
    Details:
    List”

    Not sure what I’m doing wrong. Any ideas?

    Thanks.

    Reply

    • If you want to pivot your data, you need to select “Don’t aggregate” in the advanced options of the pivot dialogue.
      But you also need another column that indicates the rowID. Cutting off the last two elements in the column “Sort” might do that trick. – See the conversation with the user “Karthiga” in the comments-section of this post.
      /Imke

      Reply

      • Don’t you mean the “SortBy” column? Also, why the last two instead of the last one like you indicated in the other, earlier comment?

        Reply

  8. I feel kind of dumb but I couldn’t get this to work with any json. I put the code in, it asked me for the json file and no matter what I did it would give me back an error about not being able to convert the list. is there something I am missing here?

    Reply

    • Do you reference your JSON in a different query? You’d get an error message like this if you reference it like so:
      fnJSON(“YourJSONQuery”) – so if you fill in a text-value for your JSON instead of a reference to the query like so: fnJSON(YourJSONQuery). So make sure to eliminate the “””.

      Reply

  9. As i m using cosmos connector to get my data from cosmos database, is this function wil still work?
    I am stuck here, can you please help me in this?

    Reply

  10. Hi. Can you explain exactly how to pass the JSON into the function? I have a query called “sample” containing my JSON loaded into PowerBI; I have your function in another query called ExpandJSON. When I pass sample into the function, it fails. What am I doing incorrectly?

    Reply

    • Hi Andrew,
      you can pass the JSON either as a text or as a binary.
      So if you want to feed the content of a file to it, you have to use File.Contents(“ThePathToYourFile”) as an input.
      /Imke

      Reply

  11. What about if the .json structure was hierarchical (nested tables with the same structure)? I would like to dynamically expand the “same” column for each level by adding the “same” column for each level up to the last one (as long as the column to be expanded contains non-null values). The manual steps are like these:

    #”Expanded collaborators1″ = Table.ExpandListColumn(#”Expanded tree.data1″, “collaborators”),
    #”Expanded tree.data2″ = Table.ExpandRecordColumn(#”Expanded collaborators”, “collaborators”, {“collaboratorCode”, “collaborators”}, {“collaboratorCode1”, “collaborators1″}),
    #”Expanded collaborators2″ = Table.ExpandListColumn(#”Expanded tree.data2”, “collaborators1″),
    #”Expanded tree.data3″ = Table.ExpandRecordColumn(#”Expanded collaborators1”, “collaborators1”, {“collaboratorCode”, “collaborators”}, {“collaboratorCode2”, “collaborators2″}),
    #”Expanded collaborators3″ = Table.ExpandListColumn(#”Expanded tree.data3”, “collaborators2″),
    #”Expanded tree.data4″ = Table.ExpandRecordColumn(#”Expanded collaborators2”, “collaborators2”, {“collaboratorCode”, “collaborators”}, {“collaboratorCode3”, “collaborators3”}), …

    Thanks!

    Reply

  12. Hi ,
    A very good function indeed.
    However I am new to Power Query and wanted to use the function to extract data from a Nested Json which is in a Json file as request Sample. How can I call your function to extract data from this Json File kept in a folder. Also if we can parameterize this path of file much better (Dynamic Path) and how to do that.
    Also there was an error in Line 107 ” each Table.AddColumn([Finished], “Level”, (x) => _[Counter] – 2))), ” . I changed dash to coma as follows “each Table.AddColumn([Finished], “Level”, (x) => _[Counter] ,2)))”. Will that Impact the functionality?

    Reply

    • I’m just a user and I could be mistaken but line 107 is a decremental counter (the “dash” is a minus instructing that the _[Counter] be decremented by 2).

      Also your Power Query needs to first pull your Json file with something like
      Source = File.Contents(“path to your file”).
      Then you pass Source to the expand function. Something like
      expandFile = nameOfFunction(Source).
      The way I have implemented this is put all the above code into it’s own query and named that query funcExpand. Power BI knows it is a function by virtue of the syntax
      let func =
      (JSON) =>
      Then in a seperate query I pull in my Json data and use this function to expand it. I’m using a url to get the Json so my code uses Web.Contents rather than File.Contents like you would need to use.
      let
      Source = Json.Document(Web.Contents(url)),
      #”Expand Table” = funcExpand(Source),

      Reply

  13. Thanks for this.
    I’ve implemented this function for my nested Json data, but now I don’t understand how to navigate it. I feel like the numerous columns are meant to be breadcrumbs, but for whatever reason I can’t follow them. Is this solution compatible with reporting or data analysis? How do I link up all the breadcrumbs to make a report?

    Reply

  14. Hi – I’m working with a large JSON file data set and this was really helpful. I’m trying to apply it to multiple JSON files (and have updated the sources to folder.files) but keep getting the expression error “We cannot convert a value of type Table to type List.” Any idea where I’ve gone wrong?

    Reply

Leave a Reply