Web scraping to Excel: cleaning the data

Dear all,
 
Attached you can find my flow, but I cannot clean the data properly:
This is the result that I need:
 
Thank you for your support.
Regards
 
 
  • BoredFish
    Try this:
     
    WebAutomation.LaunchChrome.LaunchChrome Url: $'''www.bde.es''' WindowState: WebAutomation.BrowserWindowState.Maximized ClearCache: False ClearCookies: False WaitForPageToLoadTimeout: 60 Timeout: 60 PiPUserDataFolderMode: WebAutomation.PiPUserDataFolderModeEnum.AutomaticProfile BrowserInstance=> Browser
    Excel.LaunchExcel.LaunchUnderExistingProcess Visible: True Instance=> ExcelInstance
    SET CabeceraColumnas TO {['Concepto', 'Fecha', 'Valores'] }
    Excel.WriteToExcel.WriteCell Instance: ExcelInstance Value: CabeceraColumnas Column: $'''A''' Row: 1
    WebAutomation.ExtractData.ExtractList BrowserInstance: Browser Control: $'''html > body > section > section:eq(7) > div > div > div:eq(0) > div > div:eq(1) > div > article''' ExtractionParameters: {[$'''div > div > p:eq(0)''', $'''Own Text''', $''''''] } PostProcessData: False TimeoutInSeconds: 60 ExtractedData=> ConceptosActual
    SET Fila TO 2
    LOOP FOREACH Concepto IN ConceptosActual
        Text.Trim Text: Concepto TrimOption: Text.TrimOption.Both TrimmedText=> Concepto
        Excel.WriteToExcel.WriteCell Instance: ExcelInstance Value: Concepto Column: 1 Row: Fila
        Variables.IncreaseVariable Value: Fila IncrementValue: 1
    END
    WebAutomation.ExtractData.ExtractList BrowserInstance: Browser Control: $'''html > body > section > section:eq(7) > div > div > div:eq(0) > div > div:eq(1) > div > article''' ExtractionParameters: {[$'''div > div > p:eq(2)''', $'''Own Text''', $''''''] } PostProcessData: False TimeoutInSeconds: 60 ExtractedData=> FechaActual
    SET Fila TO 2
    LOOP FOREACH Fecha IN FechaActual
        Text.ParseText.RegexParseForFirstOccurrence Text: Fecha TextToFind: $'''(\\d{1,2}\\/\\d{1,2}\\/\\d{2,4})|(\\d{1,2}\\\\\\d{1,2}\\\\\\d{2,4})|(\\d{1,2}-\\d{1,2}-\\d{2,4})|(\\d{1,2}\\.\\d{1,2}\\.\\d{2,4})''' StartingPosition: 0 IgnoreCase: True Match=> Match
        **REGION Date Format Handler
        Text.ConvertTextToDateTime.ToDateTime Text: Match DateTime=> TextAsDateTime
        ON ERROR
            GOTO EuroDateFormat
        END
        GOTO NAdateFormat
        LABEL EuroDateFormat
        Text.ConvertTextToDateTime.ToDateTimeCustomFormat Text: Match CustomFormat: $'''dd/MM/yyyy''' DateTime=> TextAsDateTime
        LABEL NAdateFormat
        **ENDREGION
        Text.ConvertDateTimeToText.FromDateTime DateTime: TextAsDateTime StandardFormat: Text.WellKnownDateTimeFormat.ShortDate Result=> FormattedDateTime
        Excel.WriteToExcel.WriteCell Instance: ExcelInstance Value: FormattedDateTime Column: 2 Row: Fila
        Variables.IncreaseVariable Value: Fila IncrementValue: 1
    END
    WebAutomation.ExtractData.ExtractList BrowserInstance: Browser Control: $'''html > body > section > section:eq(7) > div > div > div:eq(0) > div > div:eq(1) > div > article''' ExtractionParameters: {[$'''div > div > p:eq(1)''', $'''Own Text''', $''''''] } PostProcessData: False TimeoutInSeconds: 60 ExtractedData=> ValorActual
    SET Fila TO 2
    LOOP FOREACH Valor IN ValorActual
        Text.Trim Text: Valor TrimOption: Text.TrimOption.Both TrimmedText=> Valor
        Text.Replace Text: Valor TextToFind: $'''%' '%''' IsRegEx: False IgnoreCase: False ReplaceWith: $'''%''%''' ActivateEscapeSequences: False Result=> Valor
        Text.Replace Text: Valor TextToFind: $''',''' IsRegEx: False IgnoreCase: False ReplaceWith: $'''.''' ActivateEscapeSequences: False Result=> Valor
        IF Contains(Valor, $'''%'$'%''', False) THEN
            Text.Replace Text: Valor TextToFind: $'''%'$'%''' IsRegEx: False IgnoreCase: False ReplaceWith: $'''%''%''' ActivateEscapeSequences: False Result=> Valor
            SET Valor TO $'''$%Valor%'''
        END
        Excel.WriteToExcel.WriteCell Instance: ExcelInstance Value: Valor Column: 3 Row: Fila
        Variables.IncreaseVariable Value: Fila IncrementValue: 1
    END
    # Start: current date and time
    DateTime.GetCurrentDateTime.Local DateTimeFormat: DateTime.DateTimeFormat.DateAndTime CurrentDateTime=> FechaHoraActual
    Text.ConvertDateTimeToText.FromCustomDateTime DateTime: FechaHoraActual CustomFormat: $'''ddMMyyyyHHmmss''' Result=> FechaHoraActual
    # End: current date and time
    # Start: file path
    Folder.GetSpecialFolder SpecialFolder: Folder.SpecialFolder.DesktopDirectory SpecialFolderPath=> RutaEscritorio
    Text.CropText.CropTextBetweenFlags Text: RutaEscritorio FromFlag: $'''C:\\Users\\''' ToFlag: $'''\\''' IgnoreCase: True CroppedText=> User
    SET RutaDownLoads TO $'''C:\\Users\\%User%\\Documents\\%FechaHoraActual%'''
    # End: file path
    Excel.CloseExcel.CloseAndSaveAs Instance: ExcelInstance DocumentFormat: Excel.ExcelFormat.OpenXmlWorkbook DocumentPath: RutaDownLoads
    WebAutomation.CloseWebBrowser BrowserInstance: Browser
     
  • LT-22071656-0
    Dear BoredFish,
     
    Thank you for your help.
    How can I maintain the dd/MM/yyyy format (Spanish format) in the Excel file?
    Thanks in advance.
     
    Regards
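
One possible way to keep dd/MM/yyyy in the workbook, sketched against the date loop in the flow above. It assumes the scraped dates are always day-first, so the locale-dependent ToDateTime attempt and the ShortDate output (both of which presumably follow the machine's regional settings) are swapped for explicit custom formats. Every action used here already appears in the flow; only the choice of format string is new, and the sketch has not been run against the bde.es page.

```
# Assumption: Match holds a day-first date such as 31/12/2024
Text.ConvertTextToDateTime.ToDateTimeCustomFormat Text: Match CustomFormat: $'''dd/MM/yyyy''' DateTime=> TextAsDateTime
# Format explicitly instead of relying on ShortDate
Text.ConvertDateTimeToText.FromCustomDateTime DateTime: TextAsDateTime CustomFormat: $'''dd/MM/yyyy''' Result=> FormattedDateTime
Excel.WriteToExcel.WriteCell Instance: ExcelInstance Value: FormattedDateTime Column: 2 Row: Fila
```

These lines would stand in for the Date Format Handler region, the FromDateTime step, and the WriteCell that follows it. Excel may still reinterpret the written text as a date and display it using its own regional format; if that happens, applying a dd/MM/yyyy number format to column B, or storing the value as text, is worth trying.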
     
     
