I have a datasource I'm using to load data into a Dataverse table, but I'm finding that it creates duplicate records with each import since the original records still exist in the source table. What is the best way to prevent the creation of duplicate records?
Here's an example with some sample data:
I know I could do a standard Merge step that only includes items from the source not present in the Dataverse table, but the issue is that a value for an existing record could theoretically change between imports, so I want to validate the data in the most performant way possible.
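To make what I mean concrete, here's a rough sketch (plain Python, not an actual dataflow step) of the comparison I'm describing; the "order_id" key and the sample values are made up purely for illustration:

```python
# Rough sketch of the check I'm describing -- not a dataflow step.
# "order_id" and the sample values are made up purely for illustration.
source_rows = [
    {"order_id": "A-1", "status": "Open"},    # new row -> should be created
    {"order_id": "A-2", "status": "Closed"},  # existing row whose value changed -> should be updated
]
target_rows = [
    {"order_id": "A-2", "status": "Open"},
]

target_by_key = {row["order_id"]: row for row in target_rows}

to_create = [r for r in source_rows if r["order_id"] not in target_by_key]
to_update = [r for r in source_rows
             if r["order_id"] in target_by_key and r != target_by_key[r["order_id"]]]

print(to_create)  # [{'order_id': 'A-1', 'status': 'Open'}]
print(to_update)  # [{'order_id': 'A-2', 'status': 'Closed'}]
```

Doing that row-by-row comparison myself on every import is what I'd like to avoid, hence the question about the most performant approach.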
Also, @MischaSz and @romanduda: in the future when you have issues like this, I suggest you create a new forum post and just give people a link back to the old one. This post is already resolved, so most people on the forums will just ignore it. By creating a new post, you will get the attention of more users who can help you out faster.
If you want the dataflow to always yield an exact copy, inclusive of delete operations, then what you are looking for is a FULL refresh instead of an INCREMENTAL refresh. Full will completely sync the target table with the source so they match exactly every time; Incremental will only push incremental changes. You will also need to check the box in your dataflow designer that says "Delete rows that no longer exist in the query output."
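If it helps to picture it, here is roughly what a full refresh with that box checked amounts to. This is just an illustrative Python sketch, and the "order_id" key column is made up; the dataflow engine handles all of this for you:

```python
# Illustrative only: the net effect of a FULL refresh with
# "Delete rows that no longer exist in the query output" checked.
# "order_id" is a made-up key column for the example.
source = {"A-1": {"status": "Open"}, "A-2": {"status": "Closed"}}
target = {"A-2": {"status": "Open"}, "A-3": {"status": "Open"}}  # A-3 no longer exists in the source

# Rows whose key is gone from the source get deleted from the target...
for key in list(target):
    if key not in source:
        del target[key]

# ...and every source row is upserted (updated if present, created if not).
for key, row in source.items():
    target[key] = row

print(target)  # the target now matches the source exactly
```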
See this old forum post here for more info on refresh behavior and deleting: Solved: Dataflows - Refresh behavior - Power Platform Community (microsoft.com)
I have the same question. My table should be an exact copy of the datasource (multiple price lists merged together). The app I want to create is a lookup tool for those merged price lists. I'm currently getting duplicates with every refresh. I understand that I can solve this by creating keys from some columns. However, I can't figure out how to automatically remove rows from the table once they no longer exist in the datasource.
Hey cchannon, the row updates work fine: after a refresh only the new rows were added, but the refresh ended in an error. It also does not delete rows that were deleted in the datasource. Do you have any advice?
Best regards Roman
That's awesome! Unfortunately, I'm currently using a Dataverse for Teams table, which apparently doesn't allow users to define keys in tables. 😢
Great to know this is possible, though. Much appreciated.
Yes, it is different from the default Primary Column because the default does not enforce uniqueness, whereas a configured Key does.
To set one up, just navigate to the table you want to configure in the make.powerapps UI and you'll find Keys in the left nav. Select that, create a new key, and then choose all the columns you want to use as a composite to form a unique key. For example, in the screenshot below, I have a table called Assignment and I am creating a key that enforces uniqueness by a combination of Name, Due, and Type.
As far as linking them together with your import, that's actually the easy part! Dataflow automatically detects that there is a Key associated with the table you are importing into and uses the field mapping you provide to build that key and validate it against what is already in the table. If the key for an imported row already exists in the table, Dataverse does an Update; otherwise it Inserts (and you don't need to set up any of the logic yourself - it just works!)
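If you're curious what that upsert looks like under the hood, it's the same key-based addressing the Dataverse Web API exposes. Here's a hedged sketch in Python using requests; the org URL, token, and the table/column logical names (cr123_assignments, cr123_name, cr123_type, cr123_notes) are placeholders, and the dataflow issues the equivalent calls for you, so you never have to write this:

```python
import requests

# All names below are placeholders for illustration: your environment URL, token,
# and the logical names of the table and columns will differ.
ORG_URL = "https://yourorg.crm.dynamics.com"
ACCESS_TOKEN = "<OAuth access token for Dataverse>"

headers = {
    "Authorization": f"Bearer {ACCESS_TOKEN}",
    "OData-MaxVersion": "4.0",
    "OData-Version": "4.0",
    "Content-Type": "application/json",
}

# Addressing a row by its alternate key (here a composite of two text columns):
# if a row with this key exists it is UPDATED, otherwise it is CREATED.
url = (f"{ORG_URL}/api/data/v9.2/cr123_assignments"
       "(cr123_name='Essay1',cr123_type='Homework')")

response = requests.patch(url, headers=headers, json={"cr123_notes": "Imported row"})
response.raise_for_status()
```

By default that PATCH behaves as an upsert; you can add an If-Match: * header if you ever want to block creates and only allow updates.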
Thanks for the reply, @cchannon. Very interesting. Is that different from the default key that is created for Dataverse tables (e.g., Name)? Also, how do these map onto items being imported? If, say, the data is coming from an Excel spreadsheet, how would those rows be identified, or is that handled behind the scenes by the dataflow?
Make sure your target table has a Key set up that uniquely identifies each row (maybe just the name, or a composite of name and type?). Dataflow will pick up on the key on the table and, for all the rows being imported, it will use the key to upsert instead of just inserting.