Hello everyone
I am new to Power Automate and I am trying to scrape information from a list of websites. I have been using the Extract data from web page action, and it has been working. On this particular website (https://www.cnsf.com.pt/divulgacao), however, I can get the text of each item in each dropdown menu, but I cannot extract the Href for each one the way I did for the text. No Href option is offered, as it was on other websites, even when I select different sections. I just want to select each item and extract its Href to a Datatable.
In the pictures I show the items I want to extract the Href from and what I see when I try to do it.
If anyone knows how to help with this specific case, I would really appreciate it. Thanks in advance.
Thank you so much, Agnius. You have been amazing, and everything was really well explained for someone who is just starting.
Once again thank you, you are a lifesaver 🙏
You need to use the Live web helper to extract all the values into a list first like this:
Once you've done that and clicked on Advanced settings, you'll see it is being extracted as a list:
You then need to modify the CSS selector appropriately by either deleting the extra element, or adding it, depending on what you captured.
I actually captured the paragraph parent (<p>) as it seemed to be easier to do. The parent still does not have any href, but it also returns an empty CSS selector field. If you captured the child <div> you might see something like this in the CSS selector field:
But the result you want is actually only the <a> element in the CSS selector field, and "href" as the attribute you want to extract:
Here's a copy of the action that you can simply paste into PAD to have it created for you:
WebAutomation.ExtractData.ExtractList BrowserInstance: Browser Control: $'''html > body > div:eq(0) > div > div:eq(1) > div > main > div:eq(3) > div:eq(0) > div:eq(1) > div:eq(1) > div:eq(1) > div > div > div > div > div:eq(0) > p''' ExtractionParameters: {[$'''a''', $'''Href''', $''''''] } PostProcessData: False TimeoutInSeconds: 60 ExtractedData=> DataFromWebPage
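For readers who want to see the idea outside PAD, here is a minimal Python sketch of the same approach: walk the rows and read the `href` attribute from each `<a>` element into a list. It is only an illustration under assumptions. The `SAMPLE_HTML` fragment, its file paths, and the `AnchorCollector` class are hypothetical and are not taken from the actual site.

```python
from html.parser import HTMLParser

# Hypothetical markup: each row is a <p> wrapping an <a> that holds the date and text.
SAMPLE_HTML = """
<p><a href="/docs/report-1.pdf"><span>01/02/2024</span> <span>Report 1</span></a></p>
<p><a href="/docs/report-2.pdf"><span>15/03/2024</span> <span>Report 2</span></a></p>
"""


class AnchorCollector(HTMLParser):
    """Collects (href, visible text) for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []       # one (href, text) tuple per anchor
        self._href = None     # href of the anchor currently being read, if any
        self._text = []       # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside an anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


collector = AnchorCollector()
collector.feed(SAMPLE_HTML)
hrefs = [href for href, _ in collector.links]
```

The `hrefs` list plays the role of the extracted list in PAD: one entry per row, read from the `<a>` element rather than from the elements inside it.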
-------------------------------------------------------------------------
If I have answered your question, please mark it as the preferred solution. If you like my response, please give it a Thumbs Up.
I also provide paid consultancy and development services using Power Automate. If you're interested, DM me and we can discuss it.
I did make it work, but only for the first menu. I am having trouble with the second and the remaining ones, since I cannot find an element that offers the Href. As I said before, it works if I go to Advanced options and manually change the CSS selector to the corresponding element, but that still doesn't let me select more than one: once I select a second instance using the same method, it overrides the first. You have been extremely helpful so far, and thank you so much once again, but are you able to help me with this problem?
Cool. Glad to hear you got it to work.
As for using the Advanced options to get a list of values, you would need to change the data that you are trying to capture from a single value to a list/table. You would then get extra CSS selectors to define.
If you capture it as a list first and then click on Advanced to modify, it's a bit easier.
Hello Agnius, thank you for your answer.
It worked! You were right: I was able to find an anchor (<a>) element and from there select the Href option. I also tried using the Advanced options as you explained and, although it sort of worked, I was only able to extract one link. Whenever I tried to select a second one, it would just replace the first and leave me with only one. Do you know how I could select all of them through the Advanced options, as you said? I would really appreciate it.
Thanks once again,
Telmo Ferreira
This is because you're targeting the wrong element. The <span> element that holds the text does not have a href. It's the parent <a> element that includes the date AND the text that has the href behind it:
Try capturing the entire row and then the href option will appear. If you are not able to capture that via the helper, try to capture the <span> element, but then go to the Advanced options and remove the span part at the end of the selector to focus on the <a> element.
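The selector edit described above (dropping the trailing span part so the selector ends at the `<a>` element) can be sketched as a small string operation. This is a rough Python illustration, not something PAD provides: `trim_to_anchor` is a hypothetical helper, and the selectors in the usage lines are made-up examples in the same `>`-separated style.

```python
def trim_to_anchor(selector: str) -> str:
    """Cut a '>'-separated selector so that it ends at its last <a> step.

    Returns the selector unchanged if it contains no <a> step.
    """
    parts = [part.strip() for part in selector.split(">")]
    # Walk backwards to find the last step whose tag name is exactly 'a'.
    for index in range(len(parts) - 1, -1, -1):
        tag_name = parts[index].split(":")[0].strip()  # 'a:eq(1)' -> 'a'
        if tag_name == "a":
            return " > ".join(parts[: index + 1])
    return selector


# Hypothetical selectors, for illustration only.
captured = "div:eq(0) > p > a > span"
fixed = trim_to_anchor(captured)
```

Here `fixed` ends at the `a` step, which is the element that actually carries the href.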