web
You’re offline. This is a read only version of the page.
close
Skip to main content

Announcements

News and Announcements icon
Community site session details

Community site session details

Session Id :
Power Platform Community / Forums / Power Automate / Find duplicates in lar...
Power Automate
Answered

Find duplicates in large Sharepoint list with more than 5000 items and update column

(0) ShareShare
ReportReport
Posted on by 21

Hi there,

I am strugling to find a way to build a working flow for below scenario.

I have to build a flow which finds duplicates in a Sharepoint list that have more than 5000 items based on Concatenate value between 3 columns (Title, Request type, Title2) and then update column Duplicate with value Yes.

Thank you for your help!

 

Categories:
I have the same question (0)
  • schwibach Profile Picture
    2 Moderator on at

    There are a few ways to go about this.

     

    One way would be to use the get items action to get all the items of the list.

    Then loop through the items and use a filter action on the list with title, then filter that for request type and then filter that for title2. Then check if length is greater than 1, if yes, update item column duplicate yes, in not,then set the duplicate column to no.

    It will take some time, but it'll do the trick.

  • royg Profile Picture
    on at

    Hi @SuperPowerUser1,

    As mentioned by @schwibach, this is a complicated scenario where you may have to iterate all the items in your list. If that flow runs too often (for example on a scheduled basis, it can get tagged as causing poor performance and eventually stopped.

    what you could do to improve performance is use XPATH to count the occurrences of items with the same key (the concatenation of Title, Request type and Title2) but different ID. With that array filtered to only items with duplicates greater than 0, you can iterate probably far less items and modify them, or better yet, use Batch action to update them as duplicates. Note that you need to implement logics to locate items that were tagged as duplicates but are no longer (since they were either changed or their duplicates were removed) and tag them as non duplicates. Again, it's better to use Batch updates for that.

    You can also consider handling the On item created/modified event to count duplicates of the current item and set its field respectively. If you do that than most of your items will have the correct value, but you'll need to fix it using a daily run to handle deleted items. 

  • SuperPowerUser1 Profile Picture
    21 on at

    Hi @royg 

    Sounds like a very complicated task 🙂

    I have never used XPATH or batch processing before so i will have to do more research on these actions.

    The flow will run When an item is created so this will be quite often.

    I was hoping there is a more simple way to achieve this. Will also try the solution suggested by @schwibach to see what is the performance.

     

    Thank you!

  • Verified answer
    schwibach Profile Picture
    2 Moderator on at

    @royg  I'm curious how Xpath would help with that. Maybe you can elaborate.

     

    For performance I would concatenate the strings in a select action and then filter the array that the select action returns and count those rows.

    That'll be faster than the cascading filter actions.

     

    @SuperPowerUser1 
    I thought you would have to handle all the existing items in the SharePoint list.

    If that is not the case you could (and should) do an odata filter on the get items action and only return the items that match all three fields. If that array has more than one items, you should update the row and mark it as a duplicate. However, that will not have any impact on your exisiting rows.

  • royg Profile Picture
    on at

    Again, I agree with @schwibach,

    When handling the Create item event, you definitely need to use OData filter to query all items with the same 3 values + unequal ID. The returned items + the newly created item need to be updated as duplicates. 

    As @schwibach mentioned, this won't update existing duplicates and moreover, this doesn't handle the changes in any of the 3 fields in existing items (this can create duplicates or remove them) and removal of items (which can also remove duplicates).

     

    Using XPATH will allow creating a filtered array of all items that currently have duplicates without iterating the items:

    royg_0-1706473637934.png

    concat(item()?['Title'],item()?['Requesttype'],item()?['Title2'])

    royg_1-1706473681523.pngroyg_2-1706473717691.png

    xml(outputs('Compose_XMLPrep'))

    royg_3-1706473768798.png

    xpath(outputs('Compose_XML'), concat('count(//root/array[not(ID="', item()?['ID'], '") and VALUE="', item()?['VALUE'], '"])'))

    royg_4-1706474138136.png

     

    The Filter array results are all the duplicates in your list. Now you can use the Batch update process as documented here:

    https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/Batch-Update-Create-and-Upsert-SharePoint-Lists/td-p/1365410

     

    But this is half of the process. You still need to get all items that were marked as duplicates and are no longer duplicates (that means they have IsDuplicate = TRUE but they don't exist in the filter results). I won't add that here but if you need help with that, reach out and I'll try to assist.

     

    Hope this helps.

  • takolota1 Profile Picture
    4,980 Moderator on at

    There are ways to find duplicates without any xpath or looping:

    https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/Find-and-Remove-Duplicates/td-p/2191403
    (I have recently built even more efficient processing steps than those in the video. It’s included in the solution package import.)

     

    _____________________________________________________________________________

    Subscribe to my YouTube channel

  • Chriddle Profile Picture
    8,672 Super User 2026 Season 1 on at

    Two ways of checking for duplicates:

    Chriddle_0-1706541164202.png

    Select creates an array of 5000 random items

     

    1. (Left branch)

    Select 2

    From:

     

    range(0, length(body('Select')))

     

    Map:

     

    addProperty(
    	body('Select')[item()],
    	'Duplicate',
    	or(
    		and(
    			not(equals(item(), 0)),
    			contains(take(body('Select'), sub(item(), 1)), body('Select')[item()])
    		),
    		contains(skip(body('Select'), add(item(), 1)), body('Select')[item()])
    	)
    )

     

     

    2. (Right branch)

    Compose

     

    xml(json(concat('{"Root":{"Item":', body('Select'), '}}')))

     

    Select 2

    From:

     

    body('Select')

     

    Map:

     

    addProperty(
    	item(),
    	'Duplicate',
    	greater(
    		xpath(
    			outputs('Compose'),
    			concat('count(//Item[Title="', item()['Title'], '" and RType="', item()['RType'], '" and Title2="', item()['Title2'], '"])'
    			)
    		),
    		1
    	)
    )

     

     

    Result

    Both branches result in the original array with an additional property that indicates a duplicate

     

     

    {
     "body": [
     {
     "Title": "Title-19",
     "RType": "RType-213",
     "Title2": "Title2-9",
     "Duplicate": false
     },
     {
     "Title": "Title-5",
     "RType": "RType-179",
     "Title2": "Title2-6",
     "Duplicate": false
     },
     {
     "Title": "Title-22",
     "RType": "RType-303",
     "Title2": "Title2-2",
     "Duplicate": false
     },
     {
     "Title": "Title-11",
     "RType": "RType-106",
     "Title2": "Title2-7",
     "Duplicate": false
     },
    ...

     

    where the left branch is slightly faster

    Chriddle_1-1706541711716.png

     

  • SuperPowerUser1 Profile Picture
    21 on at

    Thank you all! 

    I have build the flow today as @schwibach  suggested.

    Odata filter on Get items matching the 3 columns.

    Then check if the lenght of the output array is greater than 1.

    If it is, then update the column Duplicate to Yes.

    It is updating only the newly created item but it is what i was trying to achieve anyway.

     

Under review

Thank you for your reply! To ensure a great experience for everyone, your content is awaiting approval by our Community Managers. Please check back later.

Helpful resources

Quick Links

Introducing the 2026 Season 1 community Super Users

Congratulations to our 2026 Super Users!

Kudos to our 2025 Community Spotlight Honorees

Congratulations to our 2025 community superstars!

Congratulations to the March Top 10 Community Leaders!

These are the community rock stars!

Leaderboard > Power Automate

#1
Haque Profile Picture

Haque 605

#2
Valantis Profile Picture

Valantis 340

#3
11manish Profile Picture

11manish 284

Last 30 days Overall leaderboard