A typical task when cleaning data is to extract substrings from a string that follow a certain pattern. In this post I’m going to describe a method that uses the List.Accumulate function for it.
Extract a pattern string
I have to extract a payroll key from a description field. The key starts with 8 digits, followed by a “-” and another digit.
So I’m after the 12345678-0.
I plan to approach this by
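The general idea can be sketched like this (a minimal illustration of my own, not necessarily the exact code from the post; the sample text is an assumption):

```m
let
    // assumed sample input; in practice this would come from the description column
    Description = "Payment 12345678-0 for March",
    Digits = {"0".."9"},
    // does the 10-character window starting at position p match
    // "8 digits, a dash, 1 digit"?
    IsMatch = (p as number) as logical =>
        List.AllTrue(
            List.Transform({0..7}, (i) => List.Contains(Digits, Text.At(Description, p + i)))
        )
        and Text.At(Description, p + 8) = "-"
        and List.Contains(Digits, Text.At(Description, p + 9)),
    // List.Accumulate walks over every possible window start position and
    // keeps the first window that matches the pattern
    Result = List.Accumulate(
        {0..Text.Length(Description) - 10},
        null,
        (state, current) =>
            if state = null and IsMatch(current)
            then Text.Range(Description, current, 10)
            else state
    )
in
    Result  // "12345678-0"
```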
Pivot operations are a very handy feature in Power Query, but they can slow down refresh performance. So with some bittersweet pleasure I can say that I found a trick to speed them up. The sweetness comes from the fact that the performance improvement is very significant. The bitterness comes from the fact that I could have used this trick for almost 4 years now, but was too blind to realize it when I first worked with the code.
Trick to speed up a slow pivot table
This might not work everywhere, but for my tests, it worked really well: Don’t use an aggregation function when you want fast pivoting:
Don’t aggregate when you want a fast pivot in Power Query
But if your data isn’t already aggregated on the row and column values, you’ll get this error message:
Error when the values are not adequately aggregated
So to make this work, you have to aggregate the values across the row and column values first.
Let’s walk through the steps:
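In M, the fix boils down to grouping first and then pivoting without an aggregation function. A minimal sketch (the step, table, and column names are my own assumptions):

```m
let
    Source = PreviousStep,  // assumed: a table with columns Product, Month, Amount
    // pre-aggregate so that every Product/Month combination occurs only once
    Grouped = Table.Group(
        Source,
        {"Product", "Month"},
        {{"Amount", each List.Sum([Amount]), type number}}
    ),
    // now pivot WITHOUT passing an aggregation function as the 5th argument:
    // this is the fast variant, and it works because no duplicates are left
    Pivoted = Table.Pivot(
        Grouped,
        List.Distinct(Grouped[Month]),
        "Month",
        "Amount"
    )
in
    Pivoted
```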
When you use an API with a paging mechanism (like the example from this blogpost), you might work with a field that contains the address of the next page. You can use this to walk through the available chunks until you reach the last element. That last element in the pagination will not contain a next-field, or that field will be null.
Paging in Power Query
In Power Query you can use the function List.Generate for this. According to the latest function documentation, it:
Generates a list of values given four functions that generate the initial value (initial), test against a condition (condition), and if successful select the result and generate the next value (next).
So an intuitive implementation would look like so:
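A sketch of that implementation, assuming the API returns a record with `data` and `next` fields (the URL and field names are hypothetical):

```m
let
    // call one page of the (hypothetical) API and parse the JSON response,
    // which is assumed to be a record like [data = {...}, next = "..."]
    GetPage = (url as text) as record => Json.Document(Web.Contents(url)),
    Pages = List.Generate(
        () => GetPage("https://api.example.com/items"),  // initial: fetch the first page
        each [next] <> null,                             // condition: stop when no next link
        each GetPage([next]),                            // next: follow the pagination link
        each [data]                                      // selector: keep only the payload
    )
in
    Pages
```

Because List.Generate stops as soon as the condition fails, the last page, whose next-field is null, never makes it into the output.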
Initial code for paging: Will miss the last element
In the initial step (row 2) the API is called and returns this record:
Today I want to share a quick tip on how to automatically detect and change all column types at once in Power Query.
Very often, when you expand a column in Power Query that contains a table or some records, the expanded columns will lose their types (like Chris Webb has described here for example). Or you might just have accidentally deleted a “Changed Type”-step.
No types on columns
Did you know there is actually a superfast and easy way to do it?
- Click the mouse anywhere in the table
- Press Ctrl + a (select all)
Check the whole table with Ctrl + a
- Go to the Transform-tab and choose: “Detect Data Type”
Transform with 1 click
Voila: All your columns should have types on them.
They have been automatically detected by checking the first 100 rows of your table. So if you know that you have columns with inconsistent values in them, make sure to check the automatically assigned types.
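Behind the scenes, the button simply generates an ordinary “Changed Type” step, i.e. a Table.TransformColumnTypes call. A sketch of what that step could look like (the step and column names are assumed for illustration):

```m
// what the "Detect Data Type" button generates (column names assumed):
= Table.TransformColumnTypes(
    #"Expanded Column1",
    {{"Date", type date}, {"Amount", type number}, {"Customer", type text}}
)
```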
Enjoy & stay queryious 😉
If you’ve been following my blog for a while, you might have noticed my interest in incremental load workarounds. It took some time before we saw the native functionality for it in Power BI, and it was first released for Premium workspaces only. Fortunately, we now have it for shared workspaces / Pro licenses as well, and it is a real lifesaver for scenarios where refresh speed is an issue.
However, there is a second use case for incremental refresh scenarios that is not covered ideally by the current implementation. This is where the aim is to harvest and store data in Power BI that will become unavailable in its source in the future, or where one simply wants to keep a track of changes in a data source. Chris Webb has beaten me to this article here and describes in great detail how that setup works. He also mentions that this is not a recommended setup, with which I agree. Another disadvantage of that solution is that the harvested data is only available as a shared dataset instead of a “simple” table. This limits the use cases and might force you to set up these incremental refreshes in multiple datasets.
Edit October 2020: This feature is now generally rolled out, so there is no way to go back to the old format any more. From what I’ve heard, the problems described in the article below don’t occur any more.
This is just a quick heads up for the new preview feature “Store datasets in enhanced metadata format“. You should definitely think twice before turning this feature on:
With the March release came the feature “Store datasets in enhanced metadata format”. With this feature turned on, Power BI data models will be stored in the same format as Analysis Services Tabular models. This means that they inherit the amazing options that this open-platform connectivity enables.
Limitations and their consequences
But with the current setup, you could end up with a non-working file which you would have to rebuild from scratch in many parts. So make sure to read the documentation in full. Now!
In there you find this warning:
Warning for the new enhanced metadata format