Built-in Git support in Power BI Desktop
When you create a Power BI report in Power BI Desktop, it transparently creates or uses a Git repository. You just work normally, and when you hit Publish, it actually pushes to the remote.
Gitignore could be used to exclude data from being published to the repository. PowerBI would have version control which is desperately needed.
This would fit well with Microsoft buying GitHub and owning Azure DevOps, both for promoting good practices such as code version control and for CI/CD, that is, developing in a development environment and promoting to UAT/production.
It would also give IT visibility of the code and allow tests to be run, for example to flag DAX calculated columns that could be moved to M.
This would also solve the other ideas:
Douglas Plumley commented
I am approaching this issue from the context of a long-time user of SSRS. I understand that Power BI was intended for the business audience. I am of the opinion that the source files of Power BI should be readable (XML/JSON), similar to the RDL format of SSRS. This allows for inspection of changes (code review, differencing) and makes source control a viable option.
Definitely - keep the business user in mind. But remember that mission critical reports would need to go through a review process, and this would be very difficult without the ability to do due diligence, a.k.a. source control/code review/pull request.
1) Open and readable format (XML, JSON)
2) Easy publishing through Azure DevOps
3) Visual Studio project template which allows integration with Git (similar to SSRS)
Like many others I have created a semi-automatic process that extracts the content of the .pbit file (and some of its included nested archives) into a git readable structure. It's a good first step, but a native integration would be so much easier. Every time the .pbix gets uploaded to the workspace the .pbit contents should be pushed into git.
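A minimal sketch of the kind of extraction step described above, assuming only that the report file is a zip archive, as the commenters say. The function name `extract_for_diff`, the encodings tried, and the choice to rewrite JSON members as indented UTF-8 are illustrative assumptions, not anything Power BI provides. The output is meant for review and diffing only; nothing here rebuilds a working .pbix or .pbit.

```python
import json
import zipfile
from pathlib import Path


def _try_parse_json(raw):
    """Return the parsed object if raw looks like a JSON object or array, else None."""
    # Assumption: JSON members may be UTF-8 or UTF-16 encoded.
    for encoding in ("utf-8-sig", "utf-16-le", "utf-16"):
        try:
            parsed = json.loads(raw.decode(encoding))
        except (UnicodeDecodeError, ValueError):
            continue
        if isinstance(parsed, (dict, list)):
            return parsed
    return None


def extract_for_diff(archive_path, out_dir):
    """Unpack a zip-based report file into out_dir, pretty-printing JSON members."""
    root = Path(out_dir).resolve()
    written = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            target = (root / info.filename).resolve()
            # Refuse member names that would escape the output folder.
            if root not in target.parents:
                raise ValueError(f"unsafe member name: {info.filename}")
            target.parent.mkdir(parents=True, exist_ok=True)
            raw = archive.read(info)
            parsed = _try_parse_json(raw)
            if parsed is None:
                # Not JSON: keep the bytes exactly as stored.
                target.write_bytes(raw)
            else:
                text = json.dumps(parsed, indent=2, ensure_ascii=False)
                target.write_text(text + "\n", encoding="utf-8")
            written.append(target)
    return written
```

Calling `extract_for_diff("Sales.pbit", "Sales.extracted")` (both names hypothetical) and committing the resulting folder gives Git line-based text to compare between versions.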
Nathan Baird commented
My team does version control of pbix in Git today. We have a git-commit hook that detects when a pbix file is being added, automatically extracts the contents from the pbix (which is a zip), pretty prints some of the files (particularly nested json strings), and automatically adds the auto-generated files as part of the commit. Much of how Power BI stores things internally is still not really comprehensible at code review time (e.g. rather than using enums with friendly string names, they use ints to identify certain operations).
In addition to providing something like the above (rather than us having to build it), it would still be nice to have the publishing connected to SharePoint work automatically. SharePoint APIs are not great - simply moving a 'local' file in git to SharePoint with the APIs + Auth is not straightforward - so we've been unable to set up an Azure DevOps pipeline to publish when a commit is merged to a release branch. Would love to see an Azure DevOps pipeline task that would simply publish pbix files if they make it to master.
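The "nested json strings" step mentioned above could look roughly like this. It is a sketch under the assumption that some string values inside the extracted JSON are themselves serialized JSON objects or arrays; the function name `expand_nested_json` is hypothetical. The expansion is one-way and intended for readable diffs, not for writing back into the report file.

```python
import json


def expand_nested_json(value):
    """Recursively replace strings that contain JSON objects/arrays with parsed data."""
    if isinstance(value, dict):
        return {key: expand_nested_json(item) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_nested_json(item) for item in value]
    if isinstance(value, str):
        stripped = value.strip()
        # Only attempt to parse strings that look like an object or an array.
        if stripped[:1] in ("{", "["):
            try:
                return expand_nested_json(json.loads(stripped))
            except ValueError:
                return value
    return value
```

Running the result through `json.dumps(..., indent=2)` before committing then shows nested settings one per line instead of as a single long escaped string.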
John Bonfardeci II commented
You can't version control Power BI files with Git because they contain binary data. But you can use SharePoint document libraries to store different versions.
Yes please. Version control is a critical feature, and integrating it into Power BI will greatly increase its usability. I've just started implementing PBI reporting with my business, and will definitely need this functionality sooner rather than later.
In our shop, we are actually using TFS for version control within a project that contains datasets, RDLs, data sources, PowerBI reports, and SAS code. We import the PBIX file into the project, and then from there, simply check out the file and open from TFS in order to capture the versions.
Josef Hoenzsch commented
This would be a HUGE update for enterprise customers. I know so many teams who choose not to use Power BI because they can't easily diff changes made to their reports. It's unmaintainable.
Dan Shryock commented
I feel like implementing 2 simple features would solve almost all of these issues:
1. Convert a PBIX file to a PBIT and save it to a folder instead of a zip file. This would make it easy to version and compare changes manually using the desktop tool.
2. Do the above conversion on the command line for the desktop tool. This would make automating the version control process much simpler. We could then use PBIDesktop.exe for either the PBIRS or the Power BI Service to open a report, export the template as mostly plain text json files, and commit them to our version control system of choice.
Aaron C commented
My idea is a "Team explorer" version control window in PowerBI
This would require the ability to export the report (no data, no Power Query*) from the Power BI file as its own git friendly file type. This would then allow full version control.
It may also require a template that includes the "dataset". This would contain everything in the current template method, but in a git friendly format.
*like a Power BI report that connects to a dataset in the Power BI service.
Absolutely vital. We are currently using git (gitea) to store the binary files, super lame that we have no visibility into the complexities of the AAS Model, DAX, json templates, and M Language etc... Power BI is a beast and we need to be able to diff.
Seriously thinking about building a script that automatically breaks the zip apart into a separate location so that git can get a diff. Have to tread carefully because unzipping the file corrupts the pbix file in my experience, so will just break it out separately. Maybe do a filewatcher that looks for a new modified date on the pbix file.
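The filewatcher idea could be sketched as a simple polling class like the one below. The class name `ModifiedWatcher` and the `*.pbix` pattern are assumptions for illustration. It only reads file metadata, so the .pbix itself is never touched.

```python
from pathlib import Path


class ModifiedWatcher:
    """Report files in a folder whose modified time changed since the last poll."""

    def __init__(self, folder, pattern="*.pbix"):
        self.folder = Path(folder)
        self.pattern = pattern
        self._seen = {}  # path -> last observed modification time (ns)

    def poll(self):
        """Return new or modified files; the first poll returns every match."""
        changed = []
        for path in sorted(self.folder.glob(self.pattern)):
            mtime = path.stat().st_mtime_ns
            if self._seen.get(path) != mtime:
                self._seen[path] = mtime
                changed.append(path)
        return changed
```

A script would call `poll()` in a loop with a short `time.sleep` between calls and run its extraction step on a copy of each returned file.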
Power BI put on Blast by SQL Server Central :)
"... not considering the need to provide some text format for the reports, I can't understand why they didn't learn from the Integration Services team and realize that binary versions of programmable items don't make sense. We need a format that can be easily versioned, and maybe more importantly, stored in a VCS and diff'd in a way humans can understand.
When building a format for storing code, please consider the need to work in a team and version the changes made. This means any format should not be a) binary and b) difficult to decode. Separate visual elements from logical elements and ensure a text version of this can be examined by developers. You can use XML, JSON, YAML, or any other text based format, but choose something that makes sense. Even if you add your own extension, ensure that standard tools that work with code can use this.
I do know the PBIX format is a ZIP file, but zip files don't easily integrate with a VCS. We could use hooks to extract/rebuild out files on commit/checkout, but that's cumbersome and silly. I'd rather that the PBIX were a folder with the files inside. Users, including my Mom, can zip a folder and email that if needed. To me, that would have been a better structure from the beginning.
Microsoft is supposed to be a company providing platforms that we build upon and use in our work. The decisions for the Power BI service seem to be poorly thought through with that in mind. I'd urge them to create a baseline set of rules for future products that consider DevOps, teams, and the need to track code."
Simon Hobman commented
At the very least, the DAX, M scripts and ideally the models should be able to be stored in Git. We have a complex model and complex DAX scripts, and we have a need to share that code within various PBIX files. As of now we do all of this manually, but integration directly into Power BI Desktop would be a huge help.
Jonathan Wilson commented
I'd suggest a publish template button for the Git integration. This way everything except the actual data will publish, maybe with the option to remove even the small amount of data resident in template files.
Alexis Olson commented
This would be amazing. Currently, diffing measures requires extracting and saving them using a 3rd party tool like Power BI Helper. Being able to natively diff PBIT files would be a major step forward.
Bob L commented
Fantastic write up Matt. Totally agree with the github angle too - it makes too much sense not to pursue.
Matt Smith commented
Just to expand on the "why version control \ git" from a newbie's point of view.
Version control is essential for any developer, regardless of level, because it allows you to track changes. Even if you are an expert, it's highly unlikely that you will develop a perfect solution first time that never needs any changes. Tracking changes lets you quickly identify WHEN something changed, WHAT changed (you can "diff" the old file against the new one) and, if you have added a useful commit message, WHY it changed.
Git is a distributed version control system. This means there is no spoon. I mean there is no server. Every copy has everything and is equal. Because of this, commits are performed locally and you "push" them to remote servers, which means you can develop offline and push when you are back online. Pushing basically means you have a backup, and it allows others to "clone" your work (permissions willing). Git has become the de facto standard for code version control.
The current problem is that Power BI Desktop creates PBIX or PBIT files, which are binary .zip archives. This means you can't "diff" the files, so you can't track changes - you can't see that you've changed the [Price] measure to [Price - Exc VAT]. In addition, because you can't "diff", you can't merge changes - which you need to do if you have more than one developer working on a project. So if Dev-A changes [Price] to [Price - Exc VAT] and Dev-B adds [Price - Inc VAT], you can't work independently and then merge the two together. This limits Power BI development to a single developer.
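To make the "diff" point concrete, here is a small sketch of what a text diff of the [Price] rename would show if measures were stored as plain text. The one-measure-per-line layout, the DAX expressions, and the file name `measures.dax` are hypothetical; Power BI does not export this format.

```python
import difflib


def diff_text(old_text, new_text, name="measures.dax"):
    """Return unified diff lines between two versions of a text file."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
            lineterm="",
        )
    )


# Hypothetical measure definitions, one per line.
OLD = "[Price] = SUM(Sales[Amount])\n[Quantity] = SUM(Sales[Units])"
NEW = "[Price - Exc VAT] = SUM(Sales[Amount])\n[Quantity] = SUM(Sales[Units])"

if __name__ == "__main__":
    print("\n".join(diff_text(OLD, NEW)))
```

The removed line is prefixed with `-` and the added line with `+`, while the untouched [Quantity] line appears only as context. That line-level view is what lets Git merge Dev-A's and Dev-B's independent changes, and it is not available for a zipped binary.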
If this were to be implemented it would truly raise the level of Power BI development. This combined with the upcoming XMLA endpoint exposure would make proper collaborative enterprise level work possible.