Import text from pdf files in Power BI

While Power BI will soon provide functions to import tables from PDF files, there may be occasions when you need to import the text of a PDF file in unstructured form. With a little help from R in Power BI you can do exactly that. (And don’t worry, there is no need to learn R here: the necessary R code is already included in my function below. All you need is to have R installed on your machine.) Please also note that, at the time of writing, refreshing these queries in the service is only supported with the personal gateway and not with the enterprise version.


You can use the function below just like a normal M function: just pass the path (URL or file path) to it. All you have to take care of is that a working R installation is available on your machine. If this is new to you, check out Ruth Pozuelo’s video showing all the necessary steps: How to install R for Power BI

There is one package required: pdftools. The video above also shows how to install it.


Import text from PDF files:

You can try calling this function on a PDF file from the internet, for example the M formula language specification:


If you want to import local files from your computer, just paste the full file path instead of the URL. You don’t have to worry about the direction of the slashes: both forward slashes and backslashes are accepted.
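As a minimal sketch of both cases, assuming the function has been saved as a query named ImportPdfText: the URL and the local paths below are hypothetical placeholders, not real files.

```powerquery
let
    // Placeholder URL (hypothetical): replace it with the address of a real PDF file
    FromWeb = ImportPdfText("https://example.com/docs/sample.pdf"),
    // Placeholder local path (hypothetical), written with backslashes
    FromDiskBackslash = ImportPdfText("C:\Docs\sample.pdf"),
    // The same placeholder path written with forward slashes
    FromDiskForward = ImportPdfText("C:/Docs/sample.pdf")
in
    FromDiskForward
```

Only the step named after "in" is evaluated, so change it to whichever variant you want to test.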

How to use

By default, the function returns a table with one row for each page of the PDF file. It has an optional 2nd parameter: if you set it to 1, it returns one row per line of text instead. A page index and a row index help you navigate the result.
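The two modes could be called like this (a small sketch: "MyPdfPath" is a placeholder for a real URL or file path, and the step names are my own choice for illustration):

```powerquery
let
    // Default call: one row per page
    ByPage = ImportPdfText("MyPdfPath"),
    // 2nd parameter set to 1: one row per line of text
    ByLine = ImportPdfText("MyPdfPath", 1)
in
    ByLine
```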

The 3rd parameter is an optional owner password for the PDF and the 4th is the optional user password. If you’re using them, you have to enter null for any preceding optional parameters that you don’t use. The following example shows how to use a user password while leaving the others “empty”:

ImportPdfText("MyPdfPath",null, null, "MyPassword")

Enjoy & stay queryious 🙂

Comments

  2. @Imke – Thanks – this would be a big help.
    Can a password be passed as a parameter if the PDF file is password protected?


  3. This works really well. Thank you, Imke.
    And the youtube video by @Curbal is a nice icing on the cake.


  4. Hi, I tried the function but couldn’t make it run correctly.
    R scripts are enabled and pdftools is installed.
    When the function is invoked, a null reference error is generated.
    Can you please provide guidance?


    • Did you use one of the optional parameters and forget to enter null for the preceding ones that you didn’t use?
      Otherwise I need more details: please post your exact syntax and the full text of the error message.
      Thx, Imke

