How can we improve Power BI?

Tables in PDF files

I have come across so many public data repositories that hold data in PDF format. Other websites have tables within documents such as annual reports etc., also in PDF format. A data source for PDFs or tables from PDFs would be awesome!

3,075 votes

Gogula Aryalingam shared this idea

280 comments

  • Mike Honey commented

    A heads-up that while this feature kind-of worked in the November 2018 release, it was totally wrecked in the December 2018 update - the table and row detection broke to the point it was unusable.

    The good news is the February 2019 update has fixed those problems and improved the detection of similar tables on subsequent pages - they are combined into one table. An odd new "feature" is that after detecting a table they leave you to manually add a Promoted Headers step - ideally that would be generated.

    I've worked on many projects with multiple similar technologies over the years (Tabula was my previous fav). It's always a little messy due to the limitations of the PDF format. But I feel more confident tackling them in Power Query, with all its glorious data transformation power at my fingertips.

    Thanks Amanda and team!

  • Anonymous commented

    We're looking for sharing features to be available to non-designers, so that report consumers/viewers can use them when embedded, regardless of embed method.

    This feature would be most useful (for us) when it can be applied outside the Power BI app and the Power BI service.

    In the same light, export to PDF is most useful when it's embedded, i.e. beyond the Power BI app and beyond the Power BI service.

  • DirkV commented

    When will this be available in Get & Transform in Excel?
    Urgently needed.

  • Mattia Russo commented

    Hello everybody,

    The PDF connector works only in PBI Desktop. When I try to use a Gateway on a dataset that uses the PDF connector, the Gateway doesn't work! Are you working on it? When will this bug be fixed?
    Thanks in advance!
    Mattia

  • Tiffany commented

    This is great, but I've come across cases where the data becomes corrupted and produces errors in the editor.

    My source is a folder and I have 2 PDFs sent to me daily that I drop in that folder. The PDFs are identical except for the dollar amounts. Intermittently, BI will corrupt one or a few of the documents when I refresh the dashboard.

  • DirkV commented

    Feature is working fine in my applications.
    When will this be shipped with Excel Get & Transform since I prepare my data in Excel and I have to log the data imported?

  • Terry commented

    It would be good if this could also read the data from form fields within the PDF. I believe they may be in FDF format. But as they are named fields, it should be relatively easy to show a list of fields by column and let you import them like other files. Currently the data is not imported at all.

  • Sharon Maxon commented

    Beyond just importing a chart from a PDF, we need to be able to import a chart from a collection of PDFs with a consistent format in a folder. For example, a report in a standardized format is received on a weekly basis. We need to be able to save the PDFs to a SharePoint Online folder and then let Power BI find each chart and append them together. This is a powerful feature that works for multiple Excel files in a folder, so replicate the same with PDFs.
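
The folder workflow requested above (promote the first row to headers, then append same-shaped tables from every file) can be sketched in Python. This is only an illustration of the idea, not how Power BI implements it. It assumes each PDF's table has already been extracted into a list of rows by some other tool. The function names and the `source_file` column are hypothetical.

```python
def promote_headers(rows):
    """Treat the first row as column names; return the other rows as dicts."""
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


def append_tables(tables_by_file):
    """Append tables from several files into one list of records.

    tables_by_file maps a file name to its extracted table (a list of rows,
    header row first). Every file must share the same header row.
    """
    combined = []
    expected_header = None
    for filename in sorted(tables_by_file):
        rows = tables_by_file[filename]
        if not rows:
            continue  # nothing was detected in this file
        header = rows[0]
        if expected_header is None:
            expected_header = header
        elif header != expected_header:
            raise ValueError(f"{filename}: unexpected header {header!r}")
        for record in promote_headers(rows):
            # Keep track of which file each row came from (hypothetical column).
            combined.append({"source_file": filename, **record})
    return combined


# Hypothetical weekly reports, already extracted from their PDFs.
weekly = {
    "week02.pdf": [["Region", "Sales"], ["North", "120"]],
    "week01.pdf": [["Region", "Sales"], ["North", "100"], ["South", "80"]],
}
records = append_tables(weekly)
```

Failing loudly on a header mismatch is a deliberate choice here: a silently misaligned append is harder to notice than an error on refresh.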

  • Niko Suomi commented

    Can this read hand-written tables if they are scanned and then saved as a PDF file?

  • KrisW commented

    SEPT 2018 UPDATE: I am testing the PDF Import/Connector & have already found minor issues. Who/How/Where do I report?

    IN BRIEF - I have a 51-page "sample data" G/L ledger. PBI is not bringing in column headers, which is not a big deal, but in skipping the headers it is merging any data where there is only 1 space between columns. EXAMPLE: PERIOD & SOURCE of 1 PJ became 1PJ, & ACCOUNT_NUM & ACCOUNT_DESC of 21200 TRADE COLLECTORS became 21200TRADE COLLECTORS - these 2 are easy enough to fix with "split columns".

    HOWEVER, AMOUNT & DESCRIPTION were also merged, so instead of -409.09 Pre-conversion purchase, I have -409.09Pre-conversion purchase. Not every amount has a decimal & the total number of digits in the amount can vary. While it is highly unlikely that our company will connect to PDFs on a regular basis, we feel that this is an important feature for PBI.

    Our own software has no problems with this sample file. Tableau merges fields the same as PBI, but at least it leaves the space so that the fields can be "split".
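
One possible workaround for the merged AMOUNT & DESCRIPTION cells described above is to split at the point where the leading number ends. The sketch below shows the idea in Python with a regular expression. The pattern and function name are hypothetical, and it assumes amounts look like an optional minus sign, digits with optional thousands commas, and an optional decimal part. It cannot disambiguate a description that itself begins with a digit.

```python
import re
from decimal import Decimal

# Hypothetical pattern: a leading signed number, then the rest of the cell.
MERGED_CELL = re.compile(r"^\s*(-?\d[\d,]*(?:\.\d+)?)\s*(.*)$")


def split_amount_description(cell):
    """Split a merged cell such as '-409.09Pre-conversion purchase'."""
    match = MERGED_CELL.match(cell)
    if match is None:
        raise ValueError(f"no leading amount found in {cell!r}")
    # Drop thousands separators before converting to an exact decimal.
    amount = Decimal(match.group(1).replace(",", ""))
    description = match.group(2)
    return amount, description


amount, description = split_amount_description("-409.09Pre-conversion purchase")
```

The same split also works when the space survives extraction, so it can be applied to every row without first checking whether the cell was merged.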

  • Sam commented

    @Miguel
    The summit is over and it is mid-2018 - when is the PDF connector scheduled?

  • Randy commented

    This is obviously a much-needed data source; any update on its release would be appreciated.
