Power Automate - Power Automate Desktop
Unanswered

Web Scraping to Excel: Cleaning the Data

Dear all,
 
Attached you can find my flow, but I cannot clean the data properly:
This is the result that I need:
 
Thank you for your support.
Regards
 
 
  • BoredFish
    Try this:
     
    WebAutomation.LaunchChrome.LaunchChrome Url: $'''www.bde.es''' WindowState: WebAutomation.BrowserWindowState.Maximized ClearCache: False ClearCookies: False WaitForPageToLoadTimeout: 60 Timeout: 60 PiPUserDataFolderMode: WebAutomation.PiPUserDataFolderModeEnum.AutomaticProfile BrowserInstance=> Browser
    Excel.LaunchExcel.LaunchUnderExistingProcess Visible: True Instance=> ExcelInstance
    SET CabeceraColumnas TO {['Concepto', 'Fecha', 'Valores'] }
    Excel.WriteToExcel.WriteCell Instance: ExcelInstance Value: CabeceraColumnas Column: $'''A''' Row: 1
    WebAutomation.ExtractData.ExtractList BrowserInstance: Browser Control: $'''html > body > section > section:eq(7) > div > div > div:eq(0) > div > div:eq(1) > div > article''' ExtractionParameters: {[$'''div > div > p:eq(0)''', $'''Own Text''', $''''''] } PostProcessData: False TimeoutInSeconds: 60 ExtractedData=> ConceptosActual
    SET Fila TO 2
    LOOP FOREACH Concepto IN ConceptosActual
        Text.Trim Text: Concepto TrimOption: Text.TrimOption.Both TrimmedText=> Concepto
        Excel.WriteToExcel.WriteCell Instance: ExcelInstance Value: Concepto Column: 1 Row: Fila
        Variables.IncreaseVariable Value: Fila IncrementValue: 1
    END
    WebAutomation.ExtractData.ExtractList BrowserInstance: Browser Control: $'''html > body > section > section:eq(7) > div > div > div:eq(0) > div > div:eq(1) > div > article''' ExtractionParameters: {[$'''div > div > p:eq(2)''', $'''Own Text''', $''''''] } PostProcessData: False TimeoutInSeconds: 60 ExtractedData=> FechaActual
    SET Fila TO 2
    LOOP FOREACH Fecha IN FechaActual
        Text.ParseText.RegexParseForFirstOccurrence Text: Fecha TextToFind: $'''(\\d{1,2}\\/\\d{1,2}\\/\\d{2,4})|(\\d{1,2}\\\\\\d{1,2}\\\\\\d{2,4})|(\\d{1,2}-\\d{1,2}-\\d{2,4})|(\\d{1,2}\\.\\d{1,2}\\.\\d{2,4})''' StartingPosition: 0 IgnoreCase: True Match=> Match
        **REGION Date Format Handler
        Text.ConvertTextToDateTime.ToDateTime Text: Match DateTime=> TextAsDateTime
        ON ERROR
            GOTO EuroDateFormat
        END
        GOTO NAdateFormat
        LABEL EuroDateFormat
        Text.ConvertTextToDateTime.ToDateTimeCustomFormat Text: Match CustomFormat: $'''dd/MM/yyyy''' DateTime=> TextAsDateTime
        LABEL NAdateFormat
        **ENDREGION
        Text.ConvertDateTimeToText.FromDateTime DateTime: TextAsDateTime StandardFormat: Text.WellKnownDateTimeFormat.ShortDate Result=> FormattedDateTime
        Excel.WriteToExcel.WriteCell Instance: ExcelInstance Value: FormattedDateTime Column: 2 Row: Fila
        Variables.IncreaseVariable Value: Fila IncrementValue: 1
    END
    WebAutomation.ExtractData.ExtractList BrowserInstance: Browser Control: $'''html > body > section > section:eq(7) > div > div > div:eq(0) > div > div:eq(1) > div > article''' ExtractionParameters: {[$'''div > div > p:eq(1)''', $'''Own Text''', $''''''] } PostProcessData: False TimeoutInSeconds: 60 ExtractedData=> ValorActual
    SET Fila TO 2
    LOOP FOREACH Valor IN ValorActual
        Text.Trim Text: Valor TrimOption: Text.TrimOption.Both TrimmedText=> Valor
        Text.Replace Text: Valor TextToFind: $'''%' '%''' IsRegEx: False IgnoreCase: False ReplaceWith: $'''%''%''' ActivateEscapeSequences: False Result=> Valor
        Text.Replace Text: Valor TextToFind: $''',''' IsRegEx: False IgnoreCase: False ReplaceWith: $'''.''' ActivateEscapeSequences: False Result=> Valor
        IF Contains(Valor, $'''%'$'%''', False) THEN
            Text.Replace Text: Valor TextToFind: $'''%'$'%''' IsRegEx: False IgnoreCase: False ReplaceWith: $'''%''%''' ActivateEscapeSequences: False Result=> Valor
            SET Valor TO $'''$%Valor%'''
        END
        Excel.WriteToExcel.WriteCell Instance: ExcelInstance Value: Valor Column: 3 Row: Fila
        Variables.IncreaseVariable Value: Fila IncrementValue: 1
    END
    # Start: current date and time
    DateTime.GetCurrentDateTime.Local DateTimeFormat: DateTime.DateTimeFormat.DateAndTime CurrentDateTime=> FechaHoraActual
    Text.ConvertDateTimeToText.FromCustomDateTime DateTime: FechaHoraActual CustomFormat: $'''ddMMyyyyHHmmss''' Result=> FechaHoraActual
    # End: current date and time
    # Start: file path
    Folder.GetSpecialFolder SpecialFolder: Folder.SpecialFolder.DesktopDirectory SpecialFolderPath=> RutaEscritorio
    Text.CropText.CropTextBetweenFlags Text: RutaEscritorio FromFlag: $'''C:\\Users\\''' ToFlag: $'''\\''' IgnoreCase: True CroppedText=> User
    SET RutaDownLoads TO $'''C:\\Users\\%User%\\Documents\\%FechaHoraActual%'''
    # End: file path
    Excel.CloseExcel.CloseAndSaveAs Instance: ExcelInstance DocumentFormat: Excel.ExcelFormat.OpenXmlWorkbook DocumentPath: RutaDownLoads
    WebAutomation.CloseWebBrowser BrowserInstance: Browser
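
    To keep the dates as dd/MM/yyyy rather than the machine's short-date format, the ShortDate conversion inside the date loop could be swapped for the custom-format conversion that the flow already uses for the file name. This is a minimal sketch, not a tested fix. It reuses TextAsDateTime, FormattedDateTime, ExcelInstance and Fila from the flow above. Whether Excel then keeps the value as written or re-parses it under its own regional settings is an assumption to verify on your machine:

    ```
    # Sketch: replaces the ShortDate conversion and the WriteCell call inside the FechaActual loop
    # Assumption: TextAsDateTime already holds the parsed date from the Date Format Handler region
    Text.ConvertDateTimeToText.FromCustomDateTime DateTime: TextAsDateTime CustomFormat: $'''dd/MM/yyyy''' Result=> FormattedDateTime
    Excel.WriteToExcel.WriteCell Instance: ExcelInstance Value: FormattedDateTime Column: 2 Row: Fila
    ```

    If Excel still reformats the cell, setting column B to a text format before writing may be worth trying.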
     
  • LT-22071656-0
    Dear BoredFish,
     
    Thank you for your help.
    How can I maintain the format dd/MM/yyyy (Spanish format) in the Excel file?
    Thanks in advance
     
    Regards
     
     
