Using Regex to extract values from a web site


Hello,

 

I am trying to extract data from a website with Power Automate Desktop. As the CSS selector is not unique, I want to use regular expressions. I am used to writing them (I coded a lot of Perl in the past), but I'm not sure how to apply them here.

I searched for just the word "Unternehmenssitz", which is the heading I am looking for (the text I want follows in the next tag). When I search for (Unternehmenssitz), everything is fine. But as soon as I add anything to it, the regex returns nothing, even when I add just the < of the closing tag.
I tried both

  (Unternehmenssitz)< 

and the escaped version 

  (Unternehmenssitz)\< 
and with a \s* in between:

  (Unternehmenssitz)\s*< 
  (Unternehmenssitz)\s*\< 
All four return nothing.


The final regex should be something like this (it should return "Halbergmoos, Germany" in the example below):

Unternehmenssitz</td><dd[^>]+>([^<]+)<\/dd>
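
For illustration, here is a minimal Python sketch of what a pattern of this shape does when it is run against raw markup. The markup in SAMPLE_HTML is an assumption, since the screenshot is not reproduced here; it uses a dt/dd pair as in the selector suggested later in this thread, so the pattern closes the label with </dt>. If the page really closes the label with </td>, the original pattern applies instead. Python's re module stands in for the regex engine in Power Automate Desktop, and the function name is hypothetical.

```python
import re

# Hypothetical markup; the real page structure is only visible in the screenshot.
SAMPLE_HTML = (
    '<dl><dt>Unternehmenssitz</dt>'
    '<dd class="value">Halbergmoos, Germany</dd></dl>'
)

# Label, closing </dt>, optional whitespace, a <dd ...> opening tag,
# then capture everything up to the next "<".
SEAT_PATTERN = re.compile(r"Unternehmenssitz</dt>\s*<dd[^>]*>([^<]+)</dd>")


def extract_seat(html):
    """Return the text of the <dd> after the label, or None if nothing matches."""
    match = SEAT_PATTERN.search(html)
    return match.group(1).strip() if match else None


print(extract_seat(SAMPLE_HTML))
```

This only works if the regex is applied to the page source rather than to the rendered text.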


Here is a screenshot of the web page code (from the developer tool):

 

code.png

 

Thanks for your help!
Jan

  • JanTheofel

    After thinking about it, I guess the attribute setting might be the problem. It is set to "Own text", which probably gives access only to the text of the web page and not to its code. But I can't find a reference for the possible values, and leaving it empty does not solve the issue.
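
If that guess is right, the behaviour described in the question follows directly: text without markup contains no "<", so any pattern that requires one cannot match. The sketch below illustrates this in Python under that assumption. SAMPLE_HTML is made-up markup, and stripping tags with html.parser is only a stand-in for whatever "Own text" actually returns.

```python
import re
from html.parser import HTMLParser

# Hypothetical markup standing in for the real page.
SAMPLE_HTML = (
    '<dl><dt>Unternehmenssitz</dt>'
    '<dd class="value">Halbergmoos, Germany</dd></dl>'
)


class TextOnly(HTMLParser):
    """Collect only the text nodes, dropping every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(part.strip() for part in parser.parts if part.strip())


text = visible_text(SAMPLE_HTML)

# The bare label is present in the text, so this matches.
label_only = re.search(r"(Unternehmenssitz)", text)
# The text has no "<" left, so this cannot match.
label_then_tag = re.search(r"(Unternehmenssitz)\s*<", text)
# The same pattern does match when it sees the markup.
label_then_tag_in_html = re.search(r"(Unternehmenssitz)\s*<", SAMPLE_HTML)

print(label_only is not None, label_then_tag is not None,
      label_then_tag_in_html is not None)
```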

  • Pavel_NaNoi

    I'm not 100% sure what you're doing, but I did a bit of testing and I think this might work:

    I'm using a UI "Extract data from window" action and pointing it at the pane; any part of the text will do. (If you want the window to open automatically before the text is extracted, just send an F12 hotkey.) Then I simply make the selector generic, like so:

    Pavel_NaNoi_0-1624545353854.png

    This will extract all the text in there. From there, you can use the following regex in a Parse text action on the variable that stores the text, to find the text you need:

    Unternehmenssitz(.)*(\s|\n)?(.)*

    This extracts everything after the word for one line. You can keep parsing from there to filter the text down to the words you want.
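
As a rough illustration of this approach, the Python sketch below applies a slightly simplified variant of the pattern to extracted text. Both the sample text and the function name are hypothetical, and Python's re module is used as a stand-in for the Parse text action. The variant uses a single capture group, because in Python a repeated one-character group such as (.)* only retains its last character, so the value is easier to read from one group that spans the line.

```python
import re

# Hypothetical text as it might come back from the extraction step.
SAMPLE_TEXT = (
    "Name\nExample GmbH\n"
    "Unternehmenssitz\nHalbergmoos, Germany\n"
    "Website\nexample.com"
)

# \s* skips the line break (or spaces) after the label;
# (.*) captures the rest of that line, since "." stops at a newline.
LABEL_PATTERN = re.compile(r"Unternehmenssitz\s*(.*)")


def value_after_label(text):
    """Return the line that follows the label, or None if the label is absent."""
    match = LABEL_PATTERN.search(text)
    return match.group(1).strip() if match else None


print(value_after_label(SAMPLE_TEXT))
```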

     

    I think this is what you kind of wanted,

    hope it helps!

  • tkuehara

    Hi,

     

    You could try editing your CSS selector. If you are looking for a fixed value (in your case, "Unternehmenssitz"), you could set up your CSS selector as follows:

     

    dt:contains(Unternehmenssitz) + dd

    The selector above can be read like this: get a dt element that contains the string "Unternehmenssitz", then retrieve its adjacent sibling dd element (through the "+" combinator). This way you get the tag immediately after the tag with the "Unternehmenssitz" text.
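
To make the selector logic concrete, here is a small Python sketch that emulates "a dt containing the label, followed directly by a dd" with the standard library's html.parser. It is not how Power Automate Desktop evaluates selectors; the class, function, and sample markup are all hypothetical. Note that :contains() is a jQuery extension rather than standard CSS, so it is emulated here with a plain substring check.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for the real page.
SAMPLE_HTML = """
<dl>
  <dt>Branche</dt><dd class="value">Software</dd>
  <dt>Unternehmenssitz</dt>
  <dd class="value">Halbergmoos, Germany</dd>
</dl>
"""


class AdjacentDdFinder(HTMLParser):
    """Find the text of the <dd> directly following a <dt> that contains `label`."""

    def __init__(self, label):
        super().__init__()
        self.label = label
        self.current = None   # "dt" or "dd" while collecting its text, else None
        self.buffer = []
        self.armed = False    # True right after a <dt> containing the label closed
        self.result = None

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self.current = tag
            self.buffer = []
        elif self.current is None:
            # Some other element sits between the <dt> and the next <dd>,
            # so that <dd> is no longer the adjacent sibling.
            self.armed = False

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self.current:
            return
        text = "".join(self.buffer).strip()
        if tag == "dt":
            self.armed = self.label in text
        else:  # closing </dd>
            if self.armed and self.result is None:
                self.result = text
            self.armed = False
        self.current = None


def text_of_dd_after(html, label):
    finder = AdjacentDdFinder(label)
    finder.feed(html)
    finder.close()
    return finder.result


print(text_of_dd_after(SAMPLE_HTML, "Unternehmenssitz"))
```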

    tkuehara_0-1624968890855.png

     
