How to store tables longer than 1,1 Mio rows in an Excel-sheet using Power Query and JSON

If you are working Power Query, you might come into a situation where you would like to make tables accessible from outside that file, that are longer than the 1,1 Mio rows who fit into an Excel-sheet. You can do that using the JSON-format and compression. That way you create a table in Excel that contains a JSON-text string, that can then be read and decoded again by Excel (or Power BI), using Power Query (or Get&Transform in Excel 2016).

You can download the file where a table with 2 Mio rows is compressed into a table with just 229 rows here: JSON_BreakRefreshChain4.xlsx

Simple idea:

Convert your table into JSON using the Json.FromValue-function. If we could export that JSON-file from here, we would be done already. But I haven’t figured out yet, if/how one could export or copy that from within the query-editor (apart from manual copy& paste, but that has also a limit of our good 1 Mio characters here). So if you have any idea about this, please share.

Workaround:

So as a workaround we convert the JSON to text and load that text into Excel so it can be accessed from outside. The only problem here is, that there is a limit of around 30k signs that can be loaded into a cell in Excel. So we have to chunk up the large text-string into smaller parts. That way the table-structure of the output is created. When converting back, we first combine all fields of the table back to a big string before reverting all the other steps as well.

Function that converts to the small table:

(Input) =>
let

// Debug function
// Source = LongTable,

   Source = Input,
   JSON = Json.FromValue(Source),
   Compress = Binary.Compress(JSON, Compression.Deflate),
   ToText = Binary.ToText(Compress),
   TextLength = Text.Length(ToText),
   ChunkUp = {0..Number.RoundUp(TextLength/30000)-1},
   ConvertToTable = Table.FromList(ChunkUp, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
   StartPosition = Table.AddColumn(ConvertToTable, "Start", each [Column1]*30000),
   Elements = Table.AddColumn(StartPosition , "Custom", each Text.Range(ToText , [Column1]*30000, List.Min({TextLength-[Column1]*30000,30000})))
in
   Elements 

Function to convert it back into its original values:

let
    Source = (LinkbackTable) =>

let
// Debug function
// LinkbackTable= LinkbackTable,

    Source = Excel.CurrentWorkbook(){[Name=LinkbackTable]}[Content],
    CombineText = Text.Combine(Source[Custom], ""),
    ConvertToBinary = Binary.FromText(CombineText),
    CompressBinary = Binary.Decompress(ConvertToBinary, Compression.Deflate),
    OpenJSON = Json.Document(CompressBinary),
    ConvertToTable = Table.FromList(OpenJSON, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    ExpandAllColumns = Table.ExpandRecordColumn(ConvertToTable, "Column1", Record.FieldNames(ConvertToTable[Column1]{0}), Record.FieldNames(ConvertToTable[Column1]{0}))
in
    ExpandAllColumns
in
    Source

If you consider using it to break the refresh chain (for problems like described here), you should be aware, that the transformation to JSON (and back) takes quite some time, so don’t cheer too soon before you’ve tested it.

Enjoy & stay queryious 🙂

Comments (5) Write a comment

    • Yes, this is very special. Actually, I’ve developed this technique for a completely different use case (which I haven’t blogged about yet). But as there has been some discussion around using JSON export recently, I decided to post a contribution here.

      Reply

  1. Imke

    i am in the same situation, but i solved in a different way.

    i receive a new file every Monday with new 200000 rows, i use powerquery to load all all the files from a folder , then load the data to powerpivot, i add an index column, then in the excel tabs, i have many tabs 1,2,3,4,5 then i used dax table to filter the table so it show always less than 1.1 million per tab

    at the same time, i use other tabs to have different “views” of my table, basically different aggregation used for different reports, so i keep the heavy lifting in this file, the report will just consume the results.

    I call it Master Work, it is used by 5 different dashboard and it is saved in a shared folder, my main table now is 8 Millions, the only drawback is the size of the file keep growing, which i find it ugly.

    my IT friend jokes that i manage to make Excel act like a database server 🙂

    Reply

    • Hi Mim,
      yes – great to see other solutions! Yours is probably faster than the JSON. The conversion really eats performance, but it has the advantage that it can deal with a very wide range of table-lengths.
      For a certain size of tables, a good compromise re performance and dynamic would probably be to chunk the long table up into 1,1 Mio row-parts and add them as columns beneath each other. This would probably result in a much better performance.
      Stay queryious, Imke 🙂

      Reply

  2. hopefully one day, Power BI desktop become a real self service BI solution with export to excel without limitations 🙂 who knows everything is possible

    Reply

Leave a Reply