Crawl a webpage NOT using Flow Desktop

Hi

 

Since I don't have a gateway, I was wondering if there is any alternative way to crawl the content of a page using Flow, but not the Desktop version. Essentially, I need to extract some text from a webpage and then format that text to put it into an Excel file.

 

Thanks

 

Ezequiel

  • Verified answer
    Paulie78 (Moderator, 8,422)

    Hi @ezequiel ,

     

    Yes, you can do it with the HTTP action (which is a premium action). You can then parse the HTML to get the data that you want. I actually did a video on it here:

     

    https://www.youtube.com/watch?v=EfnBlg0ip3A

     

    See how you get on and let me know if you need any help.

     

    Blog: tachytelic.net

    YouTube: https://www.youtube.com/c/PaulieM/videos

    If I answered your question, please accept it as a solution 😘
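
As an illustration of the fetch-then-parse idea in the reply above, here is a minimal sketch in Python. It is not the method shown in the linked video, and Power Automate itself would use the HTTP action plus expressions rather than Python. The names `TextExtractor`, `extract_text`, `fetch_html`, and the sample HTML are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects visible text chunks, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the non-empty text chunks found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def fetch_html(url):
    """Fetch a publicly readable page (assumption: no login is required)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical sample page, used instead of a live request.
sample_html = (
    "<html><body><h1>Prices</h1><p>Widget: 10</p>"
    "<script>var x = 1;</script></body></html>"
)
rows = extract_text(sample_html)
print(rows)
```

Each extracted chunk could then become one row or cell in a spreadsheet. Pages that require sign-in, as discussed in the replies below, will not work with a plain unauthenticated request like `fetch_html`.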

  • Verified answer
    ezequiel

    Thanks @Paulie78. The solution looks to be what is needed. The only challenge is that the page is under a https://microsoft.sharepoint.com/teams site that requires credentials. The flow is not taking my credentials by default. Do you know if this can be solved by having the flow use my own credentials? Thanks.

  • Paulie78 (Moderator, 8,422)

    It is possible, but not easy. What is the data that you're trying to get to? Is it not available with an API action?

  • ezequiel

    It is not...it is under a https://microsoft.sharepoint.com/teams/ site. That is the tricky part. When accessing the page directly in a browser it reads my credentials and goes through, but the flow is failing 


    [Screenshot: ezequiel_0-1630081633685.png]

  • Paulie78 (Moderator, 8,422)

    So you could probably access that without setting up any separate authentication by using the "Send an HTTP request to SharePoint" action instead, since that action runs under your SharePoint connection. Not sure, but it might work.

  • ezequiel

    Thanks for the suggestion. There is a "Uri" field. What exactly is this, and how can I get it? Thanks!

  • eliotcole (Moderator, 4,363)

    @ezequiel it might be possible to get the data that you want from the Microsoft Graph API. Have you tried sifting through some of that?

  • ezequiel

    Hi @eliotcole, I thought of a simpler approach. Is it possible to "Save As" the SharePoint page? After that I can manipulate the strings and HTML within the page to extract the data I need. I couldn't find any connector that allows me to do that. Is there one? Thanks

  • eliotcole (Moderator, 4,363)

    Actually, it is. If you set a flow to run when a file is created or modified, and point it (for example) at a SharePoint blog site's Site Pages library, then later in the flow you can call the item and get its 'CanvasContent1' content. In the URI below I've used item #2, but you can make that dynamic:

    _api/web/lists/GetByTitle('Site%20Pages')/items(2)/CanvasContent1

    By definition it should grab only what is in the CanvasContent1 section of the page. This may need to be broadened if that's not the only "canvas" that is modified, but it's a logical starting point.
     
    I won't go too far into it here, because I haven't really looked at this one for a while, but it comes from one of my old flows, where I was essentially building my own little WordPress-style setup within SP. 😉
     
    If the site weren't a Microsoft one, you could also work out how to do a JavaScript login, and then I might have ways around the premium connector part ... but you're Microsoft, I don't want you telling the boss!
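
To make the "Uri" field and the dynamic item number from this thread more concrete, here is a minimal Python sketch of how such a relative URI could be assembled. In the "Send an HTTP request to SharePoint" action, the Uri is the REST path relative to the chosen Site Address. The helper names `build_canvas_uri` and `build_full_url`, and the `contoso` site address, are hypothetical and used only for illustration; inside a flow the item ID would come from dynamic content rather than Python.

```python
from urllib.parse import quote


def build_canvas_uri(list_title, item_id):
    """Build the relative REST URI for an item's CanvasContent1 field."""
    # Percent-encode the list title so "Site Pages" becomes "Site%20Pages".
    encoded_title = quote(list_title, safe="")
    return (
        f"_api/web/lists/GetByTitle('{encoded_title}')"
        f"/items({item_id})/CanvasContent1"
    )


def build_full_url(site_address, relative_uri):
    """Join a site address and a relative URI with exactly one slash."""
    return site_address.rstrip("/") + "/" + relative_uri.lstrip("/")


# Item 2 matches the example in the reply above; the site is a placeholder.
uri = build_canvas_uri("Site Pages", 2)
full_url = build_full_url("https://contoso.sharepoint.com/sites/blog/", uri)
print(uri)
print(full_url)
```

Only the relative part (`uri`) would go into the action's Uri field; the full URL is shown just to make clear which address the request ends up targeting.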
