Filter SQL-server queries with Excel-tables: Query folding limitations and bug workaround

Power Query provides excellent functions to filter your SQL-Server queries by complete Excel-tables instead of just single parameters (like in all other Excel-based approaches). But you need to take care that the filtering is done at the server – before the data is loaded to Excel. This has a huge impact on performance that you don’t want to miss and is called “query folding”. Koen Veerbeck’s article provides a very good overview, links to other usefull resources on this topic as well as a list of limitations for query folding to happen.

Recently I ran into a problem where folding seemed to stopp randomly: Used a table to filter and with some entry sets it folded and some other entries it didn’t. I received very good help from Microsofts Dev-&Support Team on this issue which I thought was worth sharing: They pointed out a bug in the current heuristic that is used to balance load times. This bug means that queries will stop folding now if the number of rows in the Excel-filtertable exceeds 200 rows:

So if the heuristic would work as planned, query folding would take place, if:

  1. There are between zero and 100 non-SQL rows: So for short lists: Start folding immediately
  2. OR the data sparsely matches the first 200 rows of the SQL side (< 2%): So have a quick look into the first 200 rows of the SQL table and if the number of rows that match with the filter list is less than 2%, also start folding.

This looks very reasonable in my eyes – meaning that query folding would happen in most of the cases. But now there is this bug in the second case: It will stop folding if the non-SQL-filter-list (your Excel-table) is longer than 200 rows. Which is anything but unusual.

As long as the bug is there, you can use my function that overcomes this limitation by:

  • autmatically spliting up the filter table into chunks of 200 lines
  • performing the filter operation in a function and
  • stitching the results back into one table

Query folding will happen, so this is the way to go for filter-lists longer than 200 rows at the moment.

let
// 2 Performing the filter operation in a function
function = (group)=>
let
Quelle = Sql.Databases("SQL_Server_Name"),
Database_Name = Quelle{[Name="Database_Name"]}[Data],
dbo_Data = Database_Name {[Schema="dbo",Item="Daten"]}[Data],
Result = Table.Buffer(Table.NestedJoin(dbo_Data,{"KeyId"},Table.SelectRows(FilterTable, each [Group] = group),{"KeyId"},"NewColumn",JoinKind.Inner))
in
   Result,

// 1 Splitting filter table ("Tabelle1") into chuncks of 200 lines and add index
Quelle = Excel.CurrentWorkbook(){[Name="Tabelle1"]}[Content],
ChgType = Table.TransformColumnTypes(Quelle,{{"KeyId", type number}}),
AddIndex = Table.AddIndexColumn(ChgType, "Index", 1, 1),
FilterTable = Table.AddColumn(AddIndex, "Group", each Number.RoundUp([Index]/200)),

// 3 Main query: Create table with groups to call function from, call function and expand (thereby stitching the results back together)
Source = {1..Number.RoundUp(List.Max(FilterTable[Group]))},
ConvertToTable = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
AddFunctionCol = Table.AddColumn(ConvertToTable, "Custom", each function([Column1])),
ExpandResults = Table.ExpandTableColumn(AddFunctionCol, "Custom", {"KeyId", "Datum", "Wert"}, {"KeyId", "Datum", "Wert"})
in
   ExpandResults

Bold texts need to be replaced by the parameters of your SQL source and Excel-filter-table.

Edit: You’ll find a follow-up article here with an improved code for the workaround: http://www.thebiccountant.com/2017/07/20/sql-query-folding-bug-still-alive-sucking-powerbi-powerquery-excel/

There is currently no interest at Microsoft to change the current behaviour: http://community.powerbi.com/t5/Issues/Bug-Query-folding-not-working-with-non-SQL-datasource/idi-p/225100#M11611 .

I found that this workaround works also pretty well (faster in most cases), as long as your filter-tables are not too long: http://community.powerbi.com/t5/Desktop/Parameterized-SQL-Query-with-query-folding/td-p/171503

Enjoy & stay queryious 🙂

Comments (14) Write a comment

  1. Is this still a problem with the

    Version: 2.33.4337.481 5/11/2016 version of Power Query?

    Reply

  2. It seems that this is still a problem. I’m just getting started in Power Query, and I noticed that folding didn’t work properly if I have a starting table with 250 records and tried to merge that with a different Oracle table containing many thousands of records. If I reduce the number of original records below 200 the query takes only a few seconds to run.

    I haven’t seen any information from Microsoft as to when this might be fixed. Curious, if you (or anyone else) has any more information about when this will be fixed?

    Thanks.

    Reply

    • Thanks for the feedback. Passed the question along and will update here once I got feedback.

      Reply

  3. Any update on this problem? It would be nice to know if MS is actually working to resolve this bug. A few of the projects that I’m working on require the use of Excel and Power Query, and this limitation is continuing to slow us down.

    Any update would be appreciated. Thanks.

    Reply

  4. Hi Imke

    Just found this really useful

    What is the basic way to filter SQL Queries by Excel tables that you refer to at the very begining?

    I was trying Merge Queries with an Inner Join but that’s super slow on bigger data sets

    Reply

    • Thank Wyn,
      sorry for my bad English: I wanted to say that in all other ways to query SQL-tables from within Excel, you have to “hardcode” your filters and cannot (dynamically) refer to a table or range that sits somewhere in your Excel workbook (like MS Query for example )

      Check out the links at the end of the article: They have faster or easier methods than the one described here!

      Reply

Leave a Reply