Avoid redundant datasets (same source, 2 datasets)
Each time you pull/publish a PBIX file, a dataset is created with that PBIX file's name. If you create 3 PBIX files that all use the same Tabular model, you end up with 3 datasets in the Service instead of 1 data source for the Tabular model. In my opinion, this is a problem today.
Any enterprise-level solution that requires version control or backups has to use PBIX files, because they are currently the only way to get either. I am also assuming that a large-scale deployment would want to use Tabular models (and now multidimensional models as well).
Ideal behavior, in my opinion, would be that I deploy the reports (PBIX files) and PBI would understand that it's the same SSAS Tabular model in direct connect mode and not create another dataset, but just use the single Tabular model connection.
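To make the ideal behavior concrete, here is a minimal Python sketch of the grouping I have in mind. It is not how the Service works today and it calls no Power BI API. The `Dataset` record, its `server` and `database` fields, the helper functions, and the sample names are all hypothetical and only there for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical record for one dataset created by publishing a PBIX file."""
    name: str      # dataset name, taken from the PBIX file name
    server: str    # SSAS server the report connects to (assumed field)
    database: str  # Tabular model (database) name (assumed field)


def model_key(ds):
    """Normalize the connection so the same model compares equal."""
    return (ds.server.strip().lower(), ds.database.strip().lower())


def group_by_model(datasets):
    """Map each distinct Tabular model to the datasets that point at it."""
    groups = defaultdict(list)
    for ds in datasets:
        groups[model_key(ds)].append(ds.name)
    return dict(groups)


def redundant_datasets(datasets):
    """Keep only the models that have more than one dataset."""
    return {key: names for key, names in group_by_model(datasets).items()
            if len(names) > 1}


# Made-up example: three PBIX files on one model, one on another.
published = [
    Dataset("Sales Overview", "ssas01", "SalesModel"),
    Dataset("Sales by Region", "SSAS01", "SalesModel"),
    Dataset("Sales Forecast", "ssas01", "salesmodel"),
    Dataset("HR Headcount", "ssas01", "HRModel"),
]
duplicates = redundant_datasets(published)
```

In this made-up example, `duplicates` has a single entry for the sales model that lists all three dataset names. Ideally the Service would keep one connection for that entry instead of three datasets.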
jerry deng commented
Here is how we reuse one golden dataset across different apps, as I commented on one of Matt Arlington's blogs:
In my scenario I make this Golden Dataset available in multiple App Workspaces, to get the most out of a “single source of truth”. This is done by using SharePoint to house the Golden Dataset (PBIX), then in each of the App Workspaces using Get Data –> Get Files –> SharePoint Team Sites and pointing to where the PBIX file is uploaded.
The setup is pretty straightforward: SharePoint is the cloud storage for the PBIX file, and once each App Workspace has loaded it with the "Get Files from SharePoint" method above, the Golden Dataset is available in that App Workspace and, subsequently, in its production Apps. Future updates to the Golden Dataset (data refresh, measures, relationships) are reflected in each App Workspace.
Of course this doesn't solve the issue of connecting to multiple datasets at the same time from the same report, so that remains on the wishlist.
Jonathan Hunter commented
Hope MS sees this one, as no one really seems to be talking about it. This is a major problem that results in duplicate datasets, which can't be shared between workspaces. I suppose, though, that when you bill by storage usage this is by design! =P (Please prove me wrong, MS.)
Today we have data refreshes that pull the same data as other reports. We also have near-duplicate datasets, holding basically the same data, that are each scheduled for refresh.
Collaborating from a shared dataset is very difficult because you can only have one Power BI Dataset loaded at a time.
I think we'd all like to be able to define "Master Datasets" with scheduled refreshes that everyone can draw from. At present, however, it looks like you can only work with one dataset at a time in Power BI Desktop or the cloud, and datasets are not shareable between workspaces, which is a problem.
This is an annoying problem, but nobody seems to be looking at it. Each time you deploy a report from Power BI Desktop, it creates a new dataset, even when the reports have identical Tabular sources.