Bulk-extracting Power Query M-code from multiple pbix files in Power BI

Edit: Unfortunately this method isn’t working any more for files with the latest Power BI file formats (including the enhanced metadata format).

If you want to audit or analyse the M-code of multiple Power BI pbix-files at once, you start with either:

  1. a from-folder query where you filter all files of interest or
  2. a table with the full file-path-specification of the files to be analysed in “Column1”.

Then you add a column where you call the function that extracts the M-code:

Function to extract the M-code


(Filename as text) =>
let
// Unz-function from: https://querypower.com/2017/03/22/extracting-power-queries-in-m/
Unz = (binaryZip,fileName) =>
let
//shorthand
UInt32 = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32,ByteOrder.LittleEndian),
UInt16 = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16,ByteOrder.LittleEndian),
//ZIP file header fixed size structure
Header = BinaryFormat.Record([
MiscHeader = BinaryFormat.Binary(14),
CompressedSize = UInt32,
UncompressedSize = UInt32,
FileNameLen = UInt16,
ExtraFieldLen = UInt16]),
//ZIP file header dynamic size structure
FileData = (h)=> BinaryFormat.Record([
FileName = BinaryFormat.Text(h[FileNameLen]),
ExtraField = BinaryFormat.Text(h[ExtraFieldLen]),
UncompressedData = BinaryFormat.Transform(
BinaryFormat.Binary(h[CompressedSize]),
(x) => try
Binary.Buffer(Binary.Decompress(x, Compression.Deflate))
otherwise null)]),
//Parsing the binary in search for PKZIP header signature
ZipIterator = BinaryFormat.Choice(UInt32, (signature) => if signature <> 0x04034B50
then BinaryFormat.Record([FileName=null])
else BinaryFormat.Choice(Header,(z)=>FileData(z))),
ZipFormat = BinaryFormat.List(ZipIterator),
out = List.Select(ZipFormat(binaryZip), each _[FileName]=fileName)
in
out{0}[UncompressedData],
Source = Unz(Unz(File.Contents(Filename),"DataMashup"),"Formulas/Section1.m"),
Custom1 = Lines.FromBinary(Source),
#"Converted to Table" = Table.FromList(Custom1, Splitter.SplitByNothing(), null, null, ExtraValues.Error)
in
#"Converted to Table"

view raw

MQueriesPBIX.pq

hosted with ❤ by GitHub

This code is a variation of Igors function which retrieves the code from an opened pbix-file. So now you can apply it to closed pbix-files as well.

For method 1 you call it like so (as it takes the full string for the file-path as its parameter):

MQueriesPBIX([Folder Path]&[Name])

And for method 2 like so:

MQueriesPBIX([Column1])

This returns a table with one row per code-line.

Function to identify query- & stepnames

The following function processes this further and adds columns with the query- & step-names for further analysis:


(PQTable as table) =>
let
#"Added Index" = Table.AddIndexColumn(PQTable, "Index", 0, 1),
#"Duplicated Column" = Table.DuplicateColumn(#"Added Index", "Column1", "Column1 – Copy"),
#"Split Column by Delimiter" = Table.SplitColumn(#"Duplicated Column", "Column1 – Copy", Splitter.SplitTextByEachDelimiter({"="}, QuoteStyle.Csv, false), {"Part1", "Code"}),
#"Trimmed Text" = Table.TransformColumns(#"Split Column by Delimiter",{{"Part1", Text.Trim}}),
QueryName = Table.AddColumn(#"Trimmed Text", "QueryName", each if Text.Start([Column1], 6) = "shared" then if Text.Start([Column1],8)="shared #" then Text.Range([Column1],9, Text.Length([Part1])-8) else Text.Range([Column1], 7, Text.Length([Part1])-7) else null),
StepName = Table.AddColumn(QueryName, "Stepname", each if [Part1]="in" or [Part1]="let" or QueryName{[Index]-1}[Part1]="in" or [QueryName] <> null or Text.Start([Part1],2) = "//" then "" else Text.Trim([Part1])),
#"Filled Down" = Table.FillDown(StepName,{"QueryName"})
in
#"Filled Down"

view raw

MetaQueriesPBIX

hosted with ❤ by GitHub

You call it within an added column again, with the name of the previously created column containing the code (“Code”) like this:

MetaQueriesPBIX([Code])

This would be much easier, if we had a proper API like requested here: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/7345565-power-bi-designer-api 

That API would also enable us to bulk-retrieve other useful information from the file like everything about the DAX data model like it is currently possible with .bim-files from tabular models hosted on SSAS (blogpost with details to follow). So please vote for that feature if you find it useful as well!!

Enjoy & stay queryious 🙂

Comments (9) Write a comment

  1. Thanks for sharing this! So useful. Have you had any success with extracting all the DAX from a file?

    Reply

    • I didn’t manage to extract the DAX from closed PBIX-files yet.
      If your file is open, you can DMVs via DaxStudio or some M-code in the query editor to extract DAX-definitions to extract the DAX from the model.
      Another option is to save your pbix as a template (pbit) or to migrate it to SSAS in Azure: This will produce a very nice JSON-format with all your DAX.
      Please let me know if you need more Infos on any of the methods mentioned.
      Cheers, Imke

      Reply

  2. Pingback: SSRS APIs, M Queries, Power BI Desktop and more | Guy in a Cube

  3. Oh, that’s great !
    I cannot succeed to make Igor’s solution to work. There is a problem with binary encoding.
    Would you please adapt your solution to work with the opened PBIX file ?
    Thanks a lot

    Reply

    • Hi Didier,
      you can use my function to access the currently opened PBIX as well.
      Just remember that you will see the last saved version then.
      Cheers, Imke

      Reply

  4. Pingback: Bulk-extract Power Query M-code from multiple Excel files at once – The BIccountant

Leave a Reply