Tables in PDF files
I have come across so many public data repositories that hold data in PDF format. Other websites have tables within documents such as annual reports etc., also in PDF format. A data source for PDFs or tables from PDFs would be awesome!
Thanks everyone for your feedback/votes!
We’re actively working on this new connector. You can find an early demo in the recording of the Power Query session at the Microsoft Business Applications Summit: https://www.microsoft.com/en-us/businessapplicationssummit/video/BAS2018-2167
Please stay tuned to the Power BI blog for further announcements.
This will be so amazing.
yes, that would be very useful
This feature (adding data from PDFs) would be very useful, saving a great deal of input time.
Fantastic, would make life so much easier
Apie Heunes commented
That will be an awesome utility. Please... Please... Please
That would be fantastic
Would be a great help
Linda McLaren commented
The ability to retrieve data from pdf files, for use in Excel, would be an excellent additional utility.
Terry B commented
Getting data from a PDF would be a great help to most everyone I would guess. Certainly me.
That would be absolutely fantastic
Hazel McLaren commented
This would save me so much time ... please please make it happen Microsoft
Great Idea, can't wait
Stan Bond commented
Please add the ability to take data directly from a PDF file into Excel.
Any updates on this? It is important to enable a PDF export option in the table sorter visual.
I routinely import PDF data with Excel's Power Query and then edit the data. In my experience, the easiest approach is to convert the file to HTML with Adobe Acrobat DC's export function and import it through the web connector. The problem is that the editing after import is difficult to automate. The rest of this comment assumes you are editing a PDF table. I do not know exactly what triggers it, but fields end up joined or split, so the joined columns have to be decomposed. Furthermore, the pattern of joined and split columns varies from table to table, so you have to take the data apart while inspecting its state. This presumably depends on how the PDF's table conversion is configured before import. I think it is necessary to ask Adobe to output table data without unintended joining or splitting that differs from what is displayed.
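For anyone who wants to script that cleanup outside Power Query, here is a minimal Python sketch of the idea. It is not what Acrobat or Power Query do internally. It reads the tables out of an exported HTML file using only the standard library and re-splits cells that were joined during conversion. The names `TableExtractor` and `normalize_row` and the sample table are made up for illustration, and the re-split rule assumes that no genuine cell value contains whitespace, which will not hold for every table.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect each <table> as a list of rows, each row a list of cell texts."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table>
        self._row = None   # row being built, or None outside <tr>
        self._cell = None  # text fragments of the open cell, or None

    def _flush_cell(self):
        # Close the open cell (if any) and store its whitespace-normalized text.
        if self._cell is not None:
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._flush_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._flush_cell()
        elif tag == "tr" and self._row is not None:
            self._flush_cell()
            self.tables[-1].append(self._row)
            self._row = None


def normalize_row(row, width):
    """Re-split joined cells on whitespace if that yields exactly `width` cells.

    Assumption: real values never contain whitespace. Rows that still do not
    fit are returned unchanged so they can be inspected by hand.
    """
    if len(row) == width:
        return row
    cells = [part for cell in row for part in cell.split()]
    return cells if len(cells) == width else row


# Hypothetical export in which "Revenue" and "Cost" were joined in one row.
sample = """
<table>
  <tr><th>Year</th><th>Revenue</th><th>Cost</th></tr>
  <tr><td>2016</td><td>120 80</td></tr>
  <tr><td>2017</td><td>150</td><td>95</td></tr>
</table>
"""

parser = TableExtractor()
parser.feed(sample)
header, *body = parser.tables[0]
rows = [normalize_row(r, len(header)) for r in body]
print(header)
print(rows)
```

Rows that cannot be repaired by the simple rule are left untouched on purpose, because the join/split pattern varies and guessing silently would hide the problem.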
This would be a great feature indeed. The amount of data held in formats like this that are not directly readable is huge, and often the data inside is pretty well structured!
Your Office 'Word' program can already do it, apparently. It opens PDF documents that have been produced with different PDF software, just by applying the PDF standard in reverse.
This enables a workaround, by the way:
- Download the PDF
- Open it with MS Word (not all files are readable, e.g. documents that were scanned or printed as images)
- Copy the table and paste it into Excel
- Re-shape it appropriately and point the query to the Excel file
Needless to say, refreshing is manual and painful! I guess that for large tables that aren't updated too often it may be worth the work, though.
Unfortunately, some database holders are wary of you querying their data without using their user interface and are reluctant to offer plain CSV or another directly readable format. This is typically done by governmental 'open data' websites!
Looking forward to having this in place :)
You rock PBI guys!
Takahiko Doi commented
Tableau finally supports this capability.
Please add this ASAP!!
Basic & very important feature - please add it ASAP