Answered

PDF extraction Flow no longer works

Posted by BrianS (2,407, Super User 2024 Season 1)

I have been using a Flow to split multipage PDFs into single pages for over a year. Since the update a few weeks ago, it has become unreliable. The PDF contains multiple invoices; I use the Flow to create single-page invoices, which are then uploaded to a SharePoint document library and run through RPA to extract data. Now, most of the time, the Flow extracts only a handful of the pages. There are often 50-70 pages, but the extraction only pulls 12-15 of them out. I may be using the wrong method, but here is the relevant part of the Flow:

BrianS_0-1706798670995.png

 

After the update three weeks ago, I added a wait after the extract step, and it seemed to solve the problem. After the update this morning, it is once again pulling out only a portion of the individual PDF pages.

Is there a better way to accomplish this?

  • Gidi (601)

    Hi BrianS, 

    Can you show the error handling that you use for the Extract action?

     

  • BrianS (2,407, Super User 2024 Season 1)

    I really only set up the "Page out of bounds" rule so that, when it reached the last page, it would go to the next step.

    BrianS_0-1706891617776.png

     

  • Agnius Bartninkas (Most Valuable Professional)

    That's not what you did. You applied a rule to go to the end of the loop whenever any error occurs. And then you have a special rule to go to the next action when "Page out of bounds" error occurs. You need to switch around the steps to be taken when the exceptions happen.

     

    Also, is the directory where you're saving the extracted files local to your machine, or is it a network directory, like OneDrive or something like that?

  • BrianS (2,407, Super User 2024 Season 1)

    I'm saving to a synced SharePoint library. I'm not sure what you mean by switching around the steps.

  • WillSG (352, Moderator)

    Hi @BrianS  I hope you are doing well.

     

    Here is a quick workaround that you can use to overcome this issue (see image below):

     

    1. Create a list to hold the page numbers that were not saved to the folder.
    2. To capture those numbers when the bot fails, set up the error handler of the Extract PDF action to set a new variable and then continue to the next action.
    3. The If action then checks whether the variable contains YES and, if so, adds the exact page number that was not created to the list.
    4. Action #7 in the image is optional, but it lets you see exactly what happened during the run and what the issue was in that iteration. You could also save it to a document, add it to a separate issue list, or email it to yourself.
    5. Finally, you can add a few more steps to handle the list: if the list has items, use a For each action with the Extract PDF pages action to create the pages that are in the list.
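
    The steps above can be sketched outside PAD as follows. This is plain Python that only illustrates the control flow, not PAD code: `extract_page` is a hypothetical stand-in for the Extract PDF pages action, and the simulated failures on pages 3 and 5 are made up for the demo.

```python
from typing import Callable, Dict, List, Tuple

Failure = Tuple[int, str]  # (page number, error message)


def extract_all(page_count: int, extract_page: Callable[[int], None]) -> List[Failure]:
    """Try every page once and record each failure instead of leaving the loop."""
    failed: List[Failure] = []
    for page in range(1, page_count + 1):
        try:
            extract_page(page)
        except Exception as exc:  # plays the role of an "All errors" rule that continues
            failed.append((page, str(exc)))
    return failed


def retry_failed(failed: List[Failure], extract_page: Callable[[int], None]) -> List[Failure]:
    """Second pass over the failed pages only; returns whatever still fails."""
    still_failed: List[Failure] = []
    for page, _message in failed:
        try:
            extract_page(page)
        except Exception as exc:
            still_failed.append((page, str(exc)))
    return still_failed


# Hypothetical extractor: pages 3 and 5 fail on their first attempt only.
attempts: Dict[int, int] = {}


def flaky_extract(page: int) -> None:
    attempts[page] = attempts.get(page, 0) + 1
    if page in (3, 5) and attempts[page] == 1:
        raise OSError("file is locked")


first_pass = extract_all(6, flaky_extract)
second_pass = retry_failed(first_pass, flaky_extract)
print(first_pass)   # pages that failed on the first pass
print(second_pass)  # pages that still fail after the retry
```

    The point of keeping the failures in a list is that the main loop always runs to the end, and the retry pass only touches the pages that actually need it.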

     

    So my recommendation is to use the error-handling options this way; you never have to use an error to get out of a loop.

     

    Also, if the file is in OneDrive or another synced directory, this could affect the bot. It won't happen all the time, but I have seen Excel files that could not be processed or saved because of OneDrive.

     

    So again, my recommendation in those cases is to move the file to a local directory first, perform the PDF extraction, and finally move all the extracted files to the final folder. That way you don't have to deal with OneDrive synchronizing each file as it is written.
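
    As a rough illustration of that "stage locally, then move" pattern, here is a small Python sketch (not PAD code). The file names, the placeholder bytes, and the `synced_library` folder are all assumptions made for the demo; in PAD the equivalent would be extracting to a local folder and then using Move file(s).

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict, List


def write_pages_then_move(pages: Dict[str, bytes], final_dir: Path) -> List[Path]:
    """Write every page into a local staging folder, then move them in one batch."""
    final_dir.mkdir(parents=True, exist_ok=True)
    moved: List[Path] = []
    with tempfile.TemporaryDirectory() as staging:
        staged: List[Path] = []
        for name, data in pages.items():
            target = Path(staging) / name
            target.write_bytes(data)  # stand-in for the PDF extraction step
            staged.append(target)
        # Only touch the synced folder after all local writes have finished.
        for path in staged:
            destination = final_dir / path.name
            moved.append(Path(shutil.move(str(path), str(destination))))
    return moved


# Demo with placeholder bytes and a throwaway destination folder.
demo_root = Path(tempfile.mkdtemp())
result = write_pages_then_move(
    {"invoice_1.pdf": b"page 1", "invoice_2.pdf": b"page 2"},
    demo_root / "synced_library",
)
print(sorted(p.name for p in result))
```

    The sync client then sees a short burst of finished files rather than files that are still being written.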

     

    PS: in my test I was extracting pages from an 11-page PDF, hence the list shows 9 errors, from 12 to 20, all of them with the error "Pages out of bounds". I have also tested the action creating more than 200 new PDF pages, and it didn't fail.

     

    Hope this helps,


    Will SG


    WillSG_Screenshot 2024-02-05 232123.png

  • Agnius Bartninkas (Most Valuable Professional)

    The way you have set up your rules here will result in the flow going to the next action if the 'Page out of bounds' error occurs, but going to the end of the loop if any other error occurs. From what you mentioned previously, I assume your plan was to do the opposite.

     

    What this means is that you should rebuild the rules in the exact opposite way:

    Agnius_0-1707285947638.png

     

    I.e.: On "All errors":

    1. Continue flow run
    2. Go to next action

    And on "Page out of bounds" error:

    1. Continue flow run
    2. Go to label EndOfLoop
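
    To make the effect of that rule order concrete, here is a small Python sketch of the same logic (not PAD code). `PageOutOfBounds` and `fake_extract` are hypothetical stand-ins for the PAD error and the Extract PDF pages action, and the 5-page document with a failure on page 2 is made up.

```python
from typing import Callable, Dict, List


class PageOutOfBounds(Exception):
    """Hypothetical stand-in for PAD's 'Page out of bounds' error."""


def split_pages(extract_page: Callable[[int], None], max_pages: int = 100) -> Dict[str, List[int]]:
    done: List[int] = []
    skipped: List[int] = []
    for page in range(1, max_pages + 1):
        try:
            extract_page(page)
            done.append(page)
        except PageOutOfBounds:
            break  # specific rule: go to label EndOfLoop
        except Exception:
            skipped.append(page)  # "All errors" rule: carry on with the next page
    return {"done": done, "skipped": skipped}


def fake_extract(page: int) -> None:
    # Made-up 5-page document in which page 2 hits a transient error.
    if page > 5:
        raise PageOutOfBounds(page)
    if page == 2:
        raise OSError("transient write failure")


outcome = split_pages(fake_extract)
print(outcome)  # → {'done': [1, 3, 4, 5], 'skipped': [2]}
```

    With the two rules the other way around, the transient error on page 2 would end the loop and pages 3 to 5 would never be extracted, which matches the symptom of only getting a handful of pages.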

     

    On the other hand, your flow might be failing sometimes because you are writing those files to a directory that is auto-synced. I would recommend saving them locally first and then using Move file(s) to move them to the SharePoint library.

  • BrianS (2,407, Super User 2024 Season 1)

    I am away from the office today, so I will try these ideas tomorrow. I originally had the files being written to a local folder, and that was when the failures started, but I'll try all this and report back.

  • BrianS (2,407, Super User 2024 Season 1)

    Will - this looks a little more elaborate than I think I need, but I could see the use. I am having a problem - I do not see how to set the "LastError" variable. I do not see any error data available in the PDF Extraction step. I'll be the first to admit I'm no expert here.

  • Verified answer
    WillSG (352, Moderator)

    Hi @BrianS  I hope you are doing well.

     

    Yes, I totally agree with you: it is a little more elaborate, but it is the smarter way to use the error handler and to overcome situations like this one.

     

    The image below shows how to set up the error handler of the Extract PDF action.

    1. First, create a new rule.
    2. In this rule you will handle All errors.
    3. Set a variable, named something like "PDFExtractionError".
    4. Set its value to YES.
    5. Then, in the If action, check whether the variable %PDFExtractionError% contains YES.
    6. If it does, there was an error.
    7. If not, keep looping.

     

    Here is the code snippet:

     

    Variables.CreateNewList List=> List
    LOOP LoopIndex FROM 1 TO 20 STEP 1
        Pdf.ExtractPages PDFFile: $'''D 2024.pdf''' PageSelection: LoopIndex ExtractedPDFPath: $'''D:\\User\\wsanchez\\Course RPA\\PDF Multiple Pages\\PAD Comunnity Responses 2024_%LoopIndex%.pdf''' IfFileExists: Pdf.IfFileExists.Overwrite ExtractedPDFFile=> ExtractedPDF
        ON ERROR
            SET PDFExtractionError TO $'''YES'''
        END
        IF Contains(PDFExtractionError, $'''YES''', True) THEN
            ERROR => LastError Reset: True
            SET ErrorToBeAdded TO $'''%LoopIndex%-%LastError.Message%'''
            Variables.AddItemToList Item: ErrorToBeAdded List: List
            SET PDFExtractionError TO $'''NO'''
        END
    END
    Display.ShowMessageDialog.ShowMessage Message: List Icon: Display.Icon.None Buttons: Display.Buttons.OK DefaultButton: Display.DefaultButton.Button1 IsTopMost: True ButtonPressed=> ButtonPressed

     

     

    Hope this helps to clarify any doubts,


    Will SG


    WillSG_Screenshot 2024-02-09 092759_2.png
