Incremental load in PowerBI

Edit 2017-Jan-13: You’ll find a more recent version that doesn’t use hacks for incremental load here: http://www.thebiccountant.com/2017/01/11/incremental-load-in-powerbi-using-dax-union/

If you want to perform incremental load of your data in PowerBI, you should vote for it here.

Please vote first, then read on to see which hacky workaround you can use until this is implemented.

This workaround is a compromise that’s mainly suited for use cases like:
– performing extensive transformations on imported data only once
– saving data in the model that will not be available at the source later on (like many web-imports)

It works on the principle of self-referencing queries, which I’ve described here, in combination with the hack of your current PBI data model, which I’ve described here (and which is now fully automated).

So the data that should “sit in the model” and just be merged with new incoming data will actually be imported into the model again. Just not from its original source, but from itself. As this is often slower than importing from the original source, you will have to look closely at your specific scenario and decide whether this makes sense for you.

These are the steps to perform (a rough sketch of the resulting queries follows the list):

1) Create a 1st query that imports and transforms the first chunk of data that shall not be refreshed afterwards and load it to the data model
2) Go back to the query designer and create a 2nd query that re-imports the 1st query using the function below
3) Create a 3rd query that appends the 2nd query to the 1st and load it to the model
4) Go back to the query designer and choose the 1st or 3rd query to implement a filter or remove-duplicates step that avoids importing data that has already been imported
5) Change the source in the 2nd query from the 1st query to the 3rd query (as you now want to reference everything that has been imported so far).
6) Disable the load of the first two queries
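
To make this more tangible, here is a rough sketch of what the three queries could look like in M once all steps are done. The query names (Query1, Query2, Query3), the CSV source and the key column are only placeholders, and PBI.Model is the function defined further down:

// Query1: imports and transforms the new chunk from the original source
// (the filter from step 4 could also go here)
let
    Source = Csv.Document(File.Contents("C:\Data\NewData.csv"), [Delimiter=",", Encoding=1252]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars=true])
in
    Promoted

// Query2: re-imports what already sits in the model via the PBI.Model function
// (created in step 2 against Query1, switched to Query3 in step 5)
// note: columns returned by the DAX query may come back prefixed with the table name
// and might need renaming before the append
let
    Source = PBI.Model("EVALUATE 'Query3'")
in
    Source

// Query3: appends the fresh import to the data already in the model; this is the query that gets loaded to the model
let
    Appended = Table.Combine({Query2, Query1}),
    RemovedDuplicates = Table.Distinct(Appended, {"KeyColumn"})
in
    RemovedDuplicates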

Function PBI.Model: It connects to your current PowerBI model fully automatically; you will just be prompted to confirm your connection to the SSAS instance every time you reopen the file.

(optional Dax_Query as text) =>

let
    // locate the file in which the local Power BI Desktop SSAS instance writes its port number
    Source_Port = Folder.Files("C:\Users"),
    msmdsrv = Table.SelectRows(Source_Port, each [Name] = "msmdsrv.port.txt"),
    Select_Last_Instance_Active = Table.FirstN(msmdsrv, 1),
    #"Combined Binaries" = Binary.Combine(Select_Last_Instance_Active[Content]),
    #"Imported Text" = Table.FromColumns({Lines.FromBinary(#"Combined Binaries", null, null, 1252)}),
    // extract the port number of the running instance
    PortID = Table.TransformColumns(#"Imported Text", {}, Text.Clean){0}[Column1],
    // retrieve the name of the database that holds the current model
    Database = AnalysisServices.Database("localhost:" & PortID, "", [Query="select * from $SYSTEM.DISCOVER_PROPERTIES"]){0}[Value],
    // run the passed-in DAX query against the model, otherwise drill into the model navigation
    PBI.Model = if Dax_Query <> "" and Dax_Query <> null
        then AnalysisServices.Database("localhost:" & PortID, Database, [Query=Dax_Query])
        else AnalysisServices.Database("localhost:" & PortID, Database)[Data]{0}[Data]{0}
in
    PBI.Model
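
To give an idea of how this function is used: called with a DAX query (as the 2nd query does) it returns the rows of a model table, called without one it drills into the model navigation instead. The table name “Query3” is only a placeholder for whatever you loaded to the model in step 3:

PBI.Model("EVALUATE 'Query3'")  // returns the current content of that table from the model
PBI.Model(null)                 // no DAX query: drills into the first object of the model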

The underlying technique on how to dynamically retrieve the PortID via a file on your local drive is described here.

Actually, I thought that one could also use a different method, taking advantage of disabling “Include in Report Refresh”. But while this option will stop the reload of queries that are loaded directly to the data model, the data will actually be refreshed once you reference such a query from another query (which would normally be your append operation).

RealtimeCharts-1.zip

 

Yes, this is pretty hacky, so please vote & stay queryious 🙂

Short video:

Comments (10)

  1. Hi Imke. In the steps given to achieve incremental data load, I’m unable to figure out which query imports the delta load from the database. You have mentioned that the 2nd query just re-imports the 1st query’s data and the 3rd query just appends query2 to query1.


    • Yes, the 2nd query starts with importing the 1st query. But in step 5 you need to change that source to the 3rd query. That’s the crux.
      You cannot define it like that from the start, because at the time of creating query 2 there is no query 3 yet. On top of that: you need to load query 3 to the model once, otherwise the self-reference to the model will not find it.

      Might put together a video in the evening showing the steps. But you can check the attached file. It’s now updated with figures from this morning. So every time the URL releases new figures, you can catch them in PBI and append that data.


      • A video would be immensely helpful to a lot of people who deal with big corporate databases. Thanks a lot.


  2. To give a standard corporate database example, let us say I’m querying SQL Server tables that are huge in size, so importing the entire data is virtually impossible. Therefore, a date filter needs to be applied. Let us assume that the data requirement is for the last 5 years and refreshing the entire 5 years of data is highly time consuming. So, as per your earlier posts, the first 4 years of data can be imported once and saved locally as CSV using an R script, and for the 5th year a new query extracts the latest data, followed by an append operation. Now the question arises: how can we make this fully automated, so that the latest-data query is aware of the existing CSV data (say, its last date) and extracts only the data which is not yet available in the CSV file?
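
     One way such a date filter could look in M, purely as a rough sketch: read the last date from the existing CSV snapshot and only pull newer rows from SQL Server. The server, database, table and column names as well as the file path are placeholders, not from this post:

     let
         // read the snapshot that was saved to CSV and find the most recent date in it
         OldData = Csv.Document(File.Contents("C:\Data\History.csv"), [Delimiter=",", Encoding=1252]),
         OldTyped = Table.TransformColumnTypes(Table.PromoteHeaders(OldData, [PromoteAllScalars=true]), {{"OrderDate", type date}}),
         LastDate = List.Max(OldTyped[OrderDate]),
         // pull only the rows from SQL Server that are newer than that date
         Sales = Sql.Database("MyServer", "MyDatabase"){[Schema="dbo", Item="Sales"]}[Data],
         NewRows = Table.SelectRows(Sales, each Date.From([OrderDate]) > LastDate),
         // append the new rows to the existing snapshot (columns are assumed to match)
         Combined = Table.Combine({OldTyped, NewRows})
     in
         Combined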


  3. Hello,

    I’m totally new to Power BI. I’m working on a small experiment in the lab. I have a data source from a Google Sheet connected to Power BI. This source is updated daily; unfortunately Power BI doesn’t do incremental data load, as it seems to show only the latest data available at the source. I’ve tried your video, but it does not seem to work. Honestly, I have no idea; I just copied what you did. Can you elaborate more on step 2 above? Which function do you refer to?


    • Hello John,
      you’ve picked quite a demanding topic for a beginner 😉
      The function I’m referring to in step 2 is the code that you see at the bottom of this blog post, beginning with “(optional Dax_Query as text) =>”
      You should be able to see it in the download file as well.


      • Hello Admin,

        I did the second step, but it does not load from the Google Sheet. Did I miss anything? 🙁


        • I’m sorry, but this can be a bit demanding for beginners.
          Hard to tell from a distance what’s actually going wrong.

