Export data from Power BI to csv using Python

In this blogpost I show you my M-Python-function that I use to export data from Power BI to csv files (Export Python).

Why Python?

I prefer it to R mostly because I don’t have to create the csv-file(names) in advance before I import data to it. This is particularly important for scenarios where I want to append data to an existing file. The key for this task is NOT to use the append-option that Python offers, because M-scripts will be executed multiple times and this would create a total mess in my file. Instead I create a new file with the context to append and use the Import-from-folder method instead to stitch all csvs back together. Therefore I have to dynamically create new filenames for each import. So when the M-Python-scripts are executed repetitively here, the newly created file will just be overwritten – which doesn’t do any harm.

Caveats

  • As with R, date-formats will not be recognized correctly when imported into the engines, so my function transforms all columns to text before passing the data frame to the Python-script.
  • Single numbers (like 10) will be converted to decimals (10.0) by the Python-engine. When you import this file back to Power BI and transform the column to number, this decimal place will disappear again. But if you want to use the data somewhere else, you should be aware of this.
  • Leading zeroes (“0001”) will be deleted (“1″). I haven’t found a way around this yet, apart from prefixing it with a string like a single quote (” ‘ “) and remove that when re-importing.

Function code


let func =
(SOURCE as table, PATH as text, optional mode as text, optional index as logical, optional header as logical, optional quoting as text, optional chunksize as number, optional decimal_as_point as text) =>
let
Source = Table.TransformColumnTypes(SOURCE, List.Transform(Table.ColumnNames(SOURCE), each {_, type text})),
Path = PATH,
index = if index = null then "None" else "True",
header = if header = null then "True" else "False",
quoting = if quoting = null then "csv.QUOTE_ALL" else quoting,
chunksize = if chunksize = null then "None" else chunksize,
//Edit by RonaiBertalan (https://gist.github.com/RonaiBertalan):
decimal_as_point = if decimal_as_point = null then "'.'" else "','",
CleanedPath = Text.Replace(Path, "\", "/"),
Custom1 = Python.Execute("# 'dataset' holds the input data for this script#(lf)#(lf)import csv #(lf)import pandas as pd#(lf)path = r'"& CleanedPath &"'#(lf)#(lf)dataset.to_csv(path, mode = 'w', index = " & index & ", header = " & header & ", quoting = " & quoting & ", chunksize = " & chunksize & ", decimal= " & decimal_as_point & ")",[dataset=Source])
in
Custom1 ,
documentation = [
Documentation.Name = " Table.ExportCsvPyhton.pq ",
Documentation.Description = " Exports table to csv using Python-script. ",
Documentation.LongDescription = " Exports table to csv using Python-script. Defaults to: index-None, header-True, quoting-None, chunksize-None. ",
Documentation.Category = " Table ",
Documentation.Source = " ",
Documentation.Version = " 1.1: QUOTE_ALL ",
Documentation.Author = " Imke Feldmann: www.TheBIccountant.com. ",
Documentation.Examples = {[Description = " ",
Code = " ",
Result = " "]}]
in
Value.ReplaceType(func, Value.ReplaceMetadata(Value.Type(func), documentation))

I’ve included some optional parameters, whose default values are this:

  • index: None, which means that no additional index-column will be created during export. Any value that you fill in here will make an index-column be created.
  • header: True, which means that by default the header-row will be created. Again, any value in my function parameter will make that header disappear.
  • quoting: csv.QUOTE_NONE, but you can use other options as well (see the docs: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html) .
  • chunksize: none, but you can use any number to chunk up the export process.

Also the slashes in your filenames will automatically be reverted from “/” to “\” so that you don’t have to take care of this manually.

At the time of writing these scripts can be refreshed in the service with the personal gateway, but not with the enterprise gateway yet.

Wondering how to apply the function? Please check out this video on how to export data from Power BI to csv using the Python script:

Enjoy & stay queryious 😉

Comments (46) Write a comment

  1. Pingback: Combining M and Python To Export Power BI Data To CSVs – Curated SQL

  2. Thank you for this awesome function to write CSV-s! I just want to add that it took some time for me to figure out that the input to the path parameter should look like this: C:\Data\filename.csv
    also I think it can’t work with Anaconda because pandas does not have the right to call numpy
    – “Missing required dependencies {0}”.format(missing_dependencies) –
    so I had to install Python 3.7 and install pandas. Sincerely: Bertalan

    Reply

  3. Pingback: Export large amount of data from Power BI desktop visuals –

  4. I tried this calling a View on my azure sql server but didn’t get through. It gave me: “Formula.Firewall: Query ‘Invoked Function’ (step ‘Source’) references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.”.

    So, I copy-pasted the View code into a DirectQuery in PBI but still gave the same error. I’m assuming it could be because my View source code has multiple sub-queries, unpivots, etc.

    Just wondering if these considerations were made during the design?

    Reply

    • I usually disable the security settings for those applications. Not sure how it works with DirectQuery though.

      Reply

  5. Whenever we try to put this function code above in our python script editor in power bi, we get the syntax error: ”’PythonScriptWrapper.PY” line 20
    let func =
    ^

    Does anybody know how to solve this?

    Thanks in advance!

    Reply

    • Thanks for your reply, however in the video they use the m-code for a blank query/table. In my situation we have a table that is a result from different powerquery’s and because of that we can’t select it as a ”SOURCE” when we need to fill in our parameters. Do you have any idea how to fix this?

      Reply

        • When we want to put in the m-python code in the advanced query editor it doesn’t work because the table we want to export is derived from a numerous of other queries. So when we use the m-python code we get a syntax error.

          Reply

          • If my understanding is correct, an intermediate table would be a good solution in that case:
            Create a table that consolidates what shall be exported “from numerous other queries” to one table before feeding it to the script.
            /Imke

  6. Thanks for replying! If i make an intermediate table using DAX then it isn’t possible to take that table to powerquery editor right?

    Reply

    • Yes, once you’re in DAX, there is no way back to Power Query.
      But if your table is not too large, you could run the Python-script from a visual instead (check out documentation about the current sampling figure !!). You have to use only Python in there. A very simple implementation without any parameters would look like so:
      import csv
      import pandas as pd
      path = r'C:/YourFilePathGoesHere/TestExport.csv'
      dataset.to_csv(path)

      Where “dataset” is referencing the fields that you’ve dragged into the visual.
      But again, sampling might occur!!

      Reply

  7. When we use this as a python visual it works perfectly in the desktop version but when in power bi service we get the error that it cant find our path location. Could you please upload a video or explain how we can get this working in the power bi service because we keep getting

    Reply

  8. I have a simple table in Power bi and i’m trying to export the data in it to excel using this code. I have kept the privacy setting to Public and still receive the error “Formula.Firewall: Query ‘Invoked Function’ (step ‘Source’) references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.” Could you please tell me how to resolve this?

    Reply

  9. Hello This function works very well for me but I have an issue I have in one column this text 12/5-1 that I want to keep as a text, but when opening the .csv this text is automatically convert into a date 12/05/2001 how to solve this issue

    Reply

  10. Hi…I am getting below error after pasting the code in Advanced editor:

    An error occurred in the ‘’ query. Expression.Error: The name ‘Python.Execute’ wasn’t recognized. Make sure it’s spelled correctly.

    Does someone knows how to solve this?

    Thanks in advance!

    Reply

  11. I get this error when I tried to run the query

    Formula.Firewall: Query ‘Invoked Function’ (step ‘Source’) references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.

    I searched the error and it suggested setting up a staging query which I did but it still does not work. Any ideas?

    Reply

  12. This is really useful information, and I got the export to csv functionality working. How would we take this script and customize it to export to Excel (xlsx)? I see there is a function in the Pandas library, but I’m struggling to get the syntax right in the Power BI editor.

    Any help would be very much appreciated. Thank you.

    Reply

    • Hi Riley,
      I don’t have a xlsx-script at hand, so would have to google it as well.
      Are you sure that Export to Excel is what you need? How about a Power Query in Excel that imports the csv-files? (then you could do some consolidation as well.
      /Imke

      Reply

      • Hi Imke,

        I thought about that as well, but that actually increases the complexity of what I’m trying to do. My goal is to run a number of queries in Power BI for populating mail-enabled security groups locally (the syncing of the csv files to AD groups will still take place with a PowerShell script). Some of the query-based security groups require exceptions, and we’re managing that with a small PowerApp that writes to a CDS entity.

        For example, we have a query that pulls all doctoral students from an Oracle database. We have a couple of student examiners that need to be on that list as well. Since they are an exception to the query, we have them added to the CDS entity. The PBI queries bring the CDS data together with the Oracle data to output the finalized CSV files. Lastly, I want an Excel workbook that has all the data appended back together for use as a data source within PowerApps. This is why I’m trying to get the Python script to export to xlsx. I know I could do it with another PQ and point it at the folder containing all the CSV files, however; this process already has a lot of moving parts, and I would prefer to not setup yet another scheduled report in the PBI service.

        Reply

          • I figured it out late yesterday, and that site actually helped me as I was searching for an answer. The main thing that I was missing was that I also needed to have the Openpyxl Python library installed. Once I did that, the to_excel function worked as expected.

            Thanks again for the help on this one, and I hope this helps someone else.

  13. Thank you very much for the useful code snippet. I would very much like to use this in Power BI to automatically back-up some (very large) data tables as the last part of the queries that create them. Then, instead of pulling all data from our API, I would like to load the locally saved data into Power Query again, and merge it with the newest data pulled from the API. So, in effect an incremental data refresh instead of loading everything every time. Could you please guide me in the right direction? Cheers, Mike

    Reply

  14. does this export any amount of rows? or does it have the normal PowerBI export?

    Reply

  15. Hi Imke,
    Have you ever tried to get this to work from the service using schedule refresh?
    It seems there are several issues besides the need for personal gateway to be set up on the same machine. Python is not supported via enterprise gateway currently.
    Cheers
    Pat

    Reply

    • Hi Patrick,
      I have been running it in the service just for testing purposes.
      Yes, the limitation with regards to enterprise gateway is not very encouraging, unfortunately.
      /Imke

      Reply

      • Thanks Imke,
        FYI We have an active case with MS PBI support regarding the loss of credentials to the python data source via the personal gateway. Desktop runs the script locally and writes the csv fine. Looks like they are chasing something that it out of sync between the service and the local machine via the GW.
        Cheers
        Pat

        Reply

  16. Hi Imke,

    Thanks for the code, I’ve implemented it successfully for a couple of months now.

    Now I’ve come to a point that the exported csv file is quite big.

    Is there a way to export the output in several CSV’s based on row count?

    Thanks again,

    sunnysideup

    Reply

    • Hi sunnysideup,
      haven’t tried that.
      One could create a nested table in Power Query and add a column to call this function or try to find a Python-script to do the split.
      This depends on the actual bottleneck/reason that the increasing size brings.
      /Imke

      Reply

  17. Hi All,

    Is it possible to use Sharepoint folder as a path without mapping this Sharepoint as drive? Any advice how to do it correctly?

    Thanks in advance

    Reply

  18. One important issue is that if you find yourself searching for a education loan you may find that you will need a cosigner. There are many circumstances where this is true because you should find that you do not have a past credit score so the loan company will require that you have someone cosign the credit for you. Thanks for your post.

    Reply

  19. Hi, maybe this will help. “Power Bi Exporter” addon seems to do the job without a need to use python.

    Reply

  20. An error occurred in the ‘Query1’ query. DataSource.Error: ADO.NET: A problem occurred while processing your Python script.
    Here are the technical details: [Expression.Error] We cannot convert a value of type List to type Text.
    Details:
    DataSourceKind=Python
    DataSourcePath=Python
    Message=A problem occurred while processing your Python script.
    Here are the technical details: [Expression.Error] We cannot convert a value of type List to type Text.
    ErrorCode=-2147467259
    ExceptionType=Microsoft.PowerBI.Scripting.Python.Exceptions.PythonUnexpectedException

    My dataset having lots of null values as well as have list in one column can we export ignoring the datatype

    Reply

  21. hi there the chunk part does not work as it suppose to. I got the following error when invoking the function

    An error occurred in the ‘Python Extract To Text’ query. Expression.Error: We cannot apply operator & to types Text and Number.
    Details:
    Operator=&
    Left=# ‘dataset’ holds the input data for this script

    import csv
    import pandas as pd
    path = r’C:/Users/236473/Downloads’

    dataset.to_csv(path, sep = ‘|’, index = False, chunksize = ‘

    Any idea?

    Reply

Leave a Reply