RegEx in Power BI and Power Query in Excel with Java Script

From time to time Huang Caiguang sends me some geeky M-code that turns out to be very useful (like this for example). Today it’s utilizing java script code for RegEx (regular expressions): The Web.Page function can execute JS code and you can pass strings from you M-environment to it with simple string concatenation. So escape the js-code and use the ampersand (“&”) to reference the string. As a function it looks like so:

let
fx=(x,y)=>Web.Page(
“<script>
var x=”&x&”;var y=new RegExp(‘”&y&”‘,’g’);var b=x.match(y);document.write(b);
</script>”)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
in fx(“””hello012中国1235″””,”\\d+”)

This code extracts all decimals (“\\d+”) from the string provided as the first argument and concatenates them with a comma. But just try for yourself by pasting the code into the advanced editor.

This doesn’t work in PowerBI service, so if you need it there, you can switch to an R-script that can be used with a personal gateway. But for Excel users, this is the way to go.

Please vote for a native implementation in M here: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/8892295-regular-expressions-support-in-power-query

Enjoy and stay queryious 😉

Comments (32) Write a comment

  1. Too complicated =)
    ====
    let
    fx=(s, reg)=>Web.Page(“document.write(‘”&s&”‘.match(/”&reg&”/));”),
    tbl = fx(“qwer1234asdf”,”\d+”),
    result = tbl{0}[Data]{0}[Children]{1}[Children]{0}[Text]
    in
    result

    Reply

  2. Pingback: Power BI Lingo, RegEx, Data and BI Summit and more... (April 26, 2018) | Guy in a Cube

  3. I am trying to do something similar, but I am too new to Power Query / M to get anything working.

    I have a table with a column called Title with some text. I am trying to find any row that has Visio as a word within the title, but exclude anything such as vision, visionary, …

    I can’t get the example above to work. Is it possible to provide a sample file with a custom function, along with the appropriate parameters ([Field] as text, RegEx), returning true or false (if found or not found).

    Reply

    • Hello Daniel,
      if my understanding is correct, this is fairly easy to implement. Please paste the following code into the advanced editor and see what happens:

      let
      Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCssszsxXSCxOSVPSUUosVorVgYrlwQTTk8CCIJ5CGUgGKFaclQYVzIGJ5hWDDEgtqkpUio0FAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Title = _t, #"Other columns" = _t]),
      #"Filtered Rows" = Table.SelectRows(Source, each List.Contains(Text.Split([Title], " "), "Visio", Comparer.OrdinalIgnoreCase))
      in
      #"Filtered Rows"

      This is a case-insensitive implementation. If you need a case sensitive, then use Comparer.Ordinal instead.
      /Imke

      Reply

  4. Thanks for the quick reply.

    The following row from your code seems to return a table with any row containing visio (case insensitive). I would like to see this in a custom column with a TRUE (contains visio) or FALSE (does not contain visio).

    Table.SelectRows(Source, each List.Contains(Text.Split([Title], ” “), “Visio”, Comparer.OrdinalIgnoreCase))

    I want to provide this to some employees who can then simply filter on a few columns (I will have a column for Visio, Project, Adobe, Antidote, …

    It would be cool, if those could all be in the same column with the software as the result instead of true/false, but that might be a phase 2 – I would be very happy to have the 1st part working. The difficulty with phase 2 is that I may have multiple software in the same title (ie: Install visio and project). In phase 1, I plan on doing this as multiple columns with visio true/false and project true/false). Phase 2, I would like “VISIO,PROJECT” as the result in a custom column.

    Reply

    • Hi Daniel,
      phase 1 is no problem, just take the filter logic to a new column instead: Table.AddColumn(Source, “FilterColumns”, each List.Contains(Text.Split([Title], ” “), “Visio”, Comparer.OrdinalIgnoreCase))
      I don’t have a clear picture what you want to achieve in phase 2. So please provide a link to an Excel file where you’ve outlined the situation as is and the desired outcome.
      Thanks, Imke

      Reply

  5. Hi, I managed to do what I wanted using multiple columns. I have provided a number of rows below to show what I would like.

    Sample Data…Column Title is what we check and Custom_Software_Column would return specific words found in the title column separated by a comma or other delimiter.
    (At the moment I am specifically looking for Visio, Project, Adobe, Antidote but this could change).

    Title Custom_Software_Column
    Install Visio Visio
    Install Visio and Project Visio, Project
    Other issue1
    Install Project Project
    Other issue2
    Remove Antidote Antidote

    I managed a solution by using your code (one column per keyword) and then using another column with “if then else” logic. The is probably a better way that would eliminate the interim columns, but the solution works.

    Reply

  6. I faced an issue with special non printable characters in the input strings, I removed them from the input string first using Text.Clean function to bypass it:

    let
    fx=(InputString,RegexPattern)=>
    let
    CleanText = Text.Clean(InputString),
    Match = Web.Page(

    var x=(‘”&CleanText&”‘);var y=new RegExp(‘”&RegexPattern&”‘,’g’);var b=x.match(y);document.write(b);

    “)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
    in Match
    in fx

    Hope this helps.

    Reply

  7. Pingback: Regexp in Power Query using JavaScript - Tutorial Guruji

  8. Pingback: [SOLVED] Regexp in Power Query using JavaScript – BugsFixing

  9. Has anyone successfully implemented regex in power query (BI) M language using the .replace instead of just the .match

    Reply

  10. I got the .replace to work using grouping and replacing with groups, the code for the function is below:

    et fx=(text,regex,repl)=>
    Web.Page(

    var x='”&text&”‘;
    var z='”&repl&”‘;
    var y=new RegExp(‘”&regex&”‘,’g’);
    var b=x.replace(y, z);
    document.write(b);

    “)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
    in
    fx

    In PQ a M code example of the usage is below:

    #"Added Custom4" = Table.AddColumn(#"Added Custom3", "Clean_First", each Regex_Replace([First], " \\d.\\d+| \\d", " "))

    in
    #”Added Custom4″

    Reply

    • How to overcome this error –> Expression.Error: The name ‘Regex_Replace’ wasn’t recognized. Make sure it’s spelled correctly.

      Reply

  11. That error is because you first have to create function with the code and name the function Regex_Replace
    Then you can call the function as in the example m code

    Reply

  12. I tried this approach, but the result is a piece of code and not the expected string value. 🙁

    Reply

  13. on the first line of the code the “L” in let was omitted when copying the code for the Function

    Reply

Leave a Reply