I feel I may have hit a dead end with what I am trying to achieve, but thought maybe some clever person out there has a solution.
I am trying to automate (it has to be runnable in the cloud) the retrieval of a file from one of our third-party suppliers' websites, a link to which is sent to me in an email. The website requires a login, so I need a few steps to get to the download:
- HTTP GET on the url in the email
- The server responds with a redirect to the login page with the original url encoded as a parameter
- HTTP GET on the login page (i.e. carry out the redirect)
- HTTP POST on the login page with credentials added to the payload
- HTTP GET on the original url to retrieve the file
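A minimal sketch of the steps above, in Python with only the standard library. The language, the function name `fetch_protected_file` and the form field names are illustrative assumptions, not my actual setup; the real field names would have to be copied from the supplier's login form.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, build_opener


def fetch_protected_file(file_url, credentials, opener=None):
    """Follow the login redirect, post credentials, then fetch the file.

    `credentials` maps form field names to values (hypothetical names).
    `opener` is optional so that a stub can be injected for testing.
    """
    if opener is None:
        # One opener with a shared cookie jar: cookies set at any step
        # are sent back automatically on the later requests.
        opener = build_opener(HTTPCookieProcessor(CookieJar()))

    # GET the emailed URL. urllib follows the redirect on its own,
    # so geturl() reports the login page we ended up on.
    with opener.open(file_url) as resp:
        login_url = resp.geturl()
        if login_url == file_url:
            # No redirect happened, so the file was served directly.
            return resp.read()

    # POST the credentials to the login page (passing data= makes it a POST).
    body = urlencode(credentials).encode("utf-8")
    with opener.open(login_url, data=body) as resp:
        resp.read()

    # GET the original URL again, now carrying the session cookies.
    with opener.open(file_url) as resp:
        if resp.geturl() != file_url:
            raise RuntimeError("still redirected to " + resp.geturl())
        return resp.read()
```

Called as `fetch_protected_file(url_from_email, {"username": "...", "password": "..."})`, this raises `RuntimeError` when the last request is bounced back to the login page, which is the failure I am seeing.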
I'm extracting and reusing cookies at the various steps, but the problem is that the final GET on the file fails with another redirect to the login page. After quite a bit of digging I have worked out that the website is using Cloudflare. From what I can see when analysing the requests that happen if I do this manually in Chrome, the server injects extra JavaScript into the login page, which then does another POST to some challenge endpoint on the server (which I think is installed as part of using Cloudflare).
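One way to confirm this from the automated side, rather than from Chrome, is to inspect the response headers. The sketch below is a rough Python check with helper names I made up for illustration. It assumes Cloudflare's usual `Server: cloudflare` and `CF-RAY` headers are present, and that challenged responses carry a `cf-mitigated: challenge` header; I have not verified that this particular site sends them.

```python
def _normalise(headers):
    """Lower-case header names and values so lookups are case-insensitive."""
    return {str(k).lower(): str(v).lower() for k, v in headers.items()}


def served_by_cloudflare(headers):
    """True if the response headers suggest Cloudflare is in front of the site."""
    h = _normalise(headers)
    return h.get("server", "") == "cloudflare" or "cf-ray" in h


def looks_like_cloudflare_challenge(headers):
    """True if the response is flagged as a challenge page.

    Assumption: a challenged response includes 'cf-mitigated: challenge'.
    """
    h = _normalise(headers)
    return "challenge" in h.get("cf-mitigated", "")
```

With `urllib`, `dict(resp.headers.items())` gives a mapping that can be passed straight to these helpers.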
So I think I'm stumped because, as far as I know, there is no way to get a plain HTTP client (or any other action) to run JavaScript. Has anyone managed to overcome anything similar to this?