How can we improve Power BI?

Tables in PDF files

I have come across so many public data repositories that hold data in PDF format. Other websites have tables within documents such as annual reports etc., also in PDF format. A data source for PDFs or tables from PDFs would be awesome!

3,012 votes

    Gogula Aryalingam shared this idea

    269 comments

      • Kalle commented

        This feature (adding data from PDFs) would be very useful, saving a great deal of input time.

      • Linda McLaren commented

        The ability to retrieve data from PDF files, for use in Excel, would be an excellent additional utility.

      • Terry B commented

        Getting data from a PDF would be a great help to most everyone I would guess. Certainly me.

      • Anonymous commented

        Any updates on this? It is important to enable a PDF export option in the Table Sorter visual.

      • kigayo083 commented

        I routinely import PDF data with Excel's Power Query and then edit it. In my experience, the easiest approach is to convert the file to HTML with Adobe Acrobat DC's export function and import it through the Web connector. The problem is that the editing after import is difficult to automate. The rest of this comment assumes a table in a PDF. I am not sure what triggers it, but fields end up merged or split, so the merged columns have to be broken apart again. The pattern of merging and splitting also varies, so you have to inspect the result and untangle it case by case. This presumably depends on how the table conversion is specified when the PDF is exported, before import. I think we need to ask Adobe to output table data without unintended merging or splitting that differs from what is displayed.
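
        The cleanup described above might look roughly like the sketch below. It uses Python rather than Power Query, so it is an illustration of the idea and not the commenter's actual workflow. It assumes the exported HTML contains a plain table built from tr, td and th tags, and that a merged cell can be split on whitespace. The names TableExtractor, extract_rows and split_merged_cells, and the sample data, are made up for this example.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return all table rows found in an HTML string."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


def split_merged_cells(row, width):
    """Assumed heuristic: if a row is short, split its cells on whitespace.

    The split is kept only when it yields exactly `width` cells;
    otherwise the row is returned unchanged for manual review.
    """
    if len(row) >= width:
        return row
    pieces = [part for cell in row for part in cell.split()]
    return pieces if len(pieces) == width else row


# Made-up sample: the last row has "2017" and "South" merged into one cell.
sample = """
<table>
  <tr><th>Year</th><th>Region</th><th>Sales</th></tr>
  <tr><td>2016</td><td>North</td><td>120</td></tr>
  <tr><td>2017 South</td><td>95</td></tr>
</table>
"""

rows = extract_rows(sample)
header, body = rows[0], rows[1:]
cleaned = [split_merged_cells(row, len(header)) for row in body]
```

        A whitespace split is only one possible rule. As the comment notes, the merge and split patterns vary, so real files would likely need a different rule per pattern, and rows that no rule fits are best left untouched and checked by hand.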

      • Anonymous commented

        This would be a great feature indeed. The amount of data locked in formats like this that are not directly readable is huge, and the data is often quite well structured!

        Your Office program Word can apparently already do it. It opens PDF documents produced with different PDF software, just by applying the PDF standard in reverse.

        This allows a workaround, by the way:

        - Download the PDF
        - Open it with MS Word (not all files are readable, e.g. scanned documents or pages saved as images)
        - Copy the table and paste it into Excel
        - Reshape it as needed and point the query at the Excel file

        Needless to say, refreshing is manual and painful! For large tables that aren't updated too often it may be worth the work, though.

        Unfortunately, some data holders are protective of their data, do not want you querying it outside their own user interface, and are reluctant to offer plain CSV or another directly readable format. This is typical of governmental 'open data' websites!

        Looking forward to having this in place :)
        You rock PBI guys!
