Microsoft Idea

Power BI

New

Make better use of caching to speed up development of models


Pascal Bellerose on 15 Oct 2021 14:40:44

This suggestion comes from past experience using RStudio and practically any other programming language.

Make it so that when I read the data source in the first step, Power Query caches the data it reads and uses that cache for all subsequent steps, instead of downloading a new preview every time.

For example, if I enable data profiling on the full dataset, Power Query should keep a cached copy of the dataset in memory so the profiling is faster to process.

That's how I would do it in RStudio (or any other R environment):
I would load the source on the first line, then transform it by executing any of the steps in the script.

example:
dset <- read_csv("c:/myfile.csv")  # read the file once into memory
head(dset, 10)                     # preview the first 10 rows from the in-memory copy

This stores the data read from "myfile.csv" in a data frame named "dset".
I can later refer to this object and preview its contents with the "head" function,
and I can even tell it how many records I want in the preview.

This is very effective even when reading large files (10M lines).

I think the whole issue comes from the fact that Power Query M re-evaluates the query from scratch: it has to read from the source and reapply all the steps every time I click on a step, and that's what makes it so slow.

This is why I'm taking the time to submit this idea.