Power Apps - Microsoft Dataverse
Unanswered

Why aren't dataflows atomic?

Posted on by 19

Hello,

 

I am using a dataflow to load data from a bunch of JSON files into CDS. The dataflow contains multiple queries. Most of them are marked as "Do not load"; only a few produce the final data that gets loaded into my entities (really, the CDS tables containing those entities). This works fine most of the time.

 

The problem is that sometimes one of these queries fails. My expectation was that when this happens, none of the queries would load its result into CDS. However, this is not the case: every query except the failing one loads its data into CDS. This makes little sense, for example with a set of 4 queries where 1 query is for a parent entity and the other 3 are for child entities. Why load the children into CDS if the parent could not be loaded?

 

Is this the expected behavior? Will this ever be corrected?

 

Regards,

Levente

  • EricRegnier
    8,714 Most Valuable Professional
    Re: Why aren't dataflows atomic?

    Hi @Levente, with CDS Dataflows, you can only update one target CDS entity per flow. So if you sync the Parent entity A and then Child entity B, that's 2 entities which you'll need 2 dataflows...

  • Levente
    19
    Re: Why aren't dataflows atomic?

    Hello Eric,

     

    You wrote "Also, make sure the run schedule between the parent and child dataflows are far apart and include buffer to give the parent dataflow a chance to complete, before the child flows kicks in...".

     

    Wait, what? Why would I have different dataflows to load the parent and child entities? I am using different queries (and sometimes a lot of them), part of the same dataflow to do this. One dataflow loads the parent and all its children. If you think I am doing something wrong, please elaborate.

     

    Regards,

    Levente

  • Levente
    19
    Re: Why aren't dataflows atomic?

    Hello Kris,

     

    I do not need a condition per se. I would be content if either all or none of the queries mapped to load data into CDS entities succeeded.

     

    I understand that I can achieve this by going with some third-party stuff (although that gets monster complicated, and expensive, for my projects), or by doing my ETL with Azure Data Factory (although in this case I have no idea how to make the loaded data end up in CDS), but a simple solution that is part of PowerApps is what I would prefer.

     

    Regards,

    Levente

  • Levente
    19
    Re: Why aren't dataflows atomic?

    Ok, I understand what you are suggesting. This is exactly how I have my child entities set up. But this only solves the all-succeeding-queries-load-to-CDS issue for this particular case. Sometimes there are valid reasons why the entities are only logically related, without a lookup field between them. Let me give you another example.

     

    I have a project that is loading a lot of data from JSON files. To avoid processing all the JSON files in monster queries every time the dataflow is refreshed, I log the files I processed in a ProcessedFiles entity. One of the first queries in the dataflow then compares this processed-files table with the list of files to import from, and the following queries only load data from the new files. However, if any of the queries that load the data from those files into CDS fails, I do not want the query that loads data into the ProcessedFiles entity to load either, because that would mark the files as processed, although I will probably want to retry them once they get corrected. And no, I cannot add a parent-child relationship here, because for older data I no longer have the files I loaded them from.

     

    The atomicity of the dataflow would be a useful thing to have. I created a new PA idea for this, please vote if you think it would be useful.
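
As an illustration of the ProcessedFiles scenario above, here is a minimal Python sketch of the desired logic (not Power Query, and not how dataflows behave today): skip files that are already logged, and log the new files only when every one of them loaded without error. The names `select_new_files`, `run_import` and `load_file` are hypothetical, and the ProcessedFiles entity is modelled as a plain in-memory set.

```python
from __future__ import annotations

from typing import Callable, Iterable


def select_new_files(available: Iterable[str], processed: set[str]) -> list[str]:
    """Return the files not yet logged as processed, keeping input order."""
    return [name for name in available if name not in processed]


def run_import(
    available: Iterable[str],
    processed: set[str],
    load_file: Callable[[str], list[dict]],  # hypothetical per-file loader
) -> tuple[list[dict], set[str]]:
    """Stage rows from all new files; log the files only if every load worked.

    On any failure, no rows are returned and the processed set is returned
    unchanged, so the same files are picked up again on the next run.
    """
    new_files = select_new_files(available, processed)
    staged_rows: list[dict] = []
    try:
        for name in new_files:
            staged_rows.extend(load_file(name))
    except Exception:
        return [], set(processed)
    return staged_rows, set(processed) | set(new_files)
```

With this shape, a malformed file leaves the processed list untouched, which is exactly the retry behaviour asked for in the post.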

  • v-xida-msft
    Re: Why aren't dataflows atomic?

    Hi @Levente ,

    I hope I understand your scenario correctly here: you do not want the query that writes the new files to the ProcessedFiles entity to be executed when the query that loads the data from the files fails. Is that right?

     

    If that is your issue, I am afraid there is currently no way to achieve this in PowerApps. There is no way to add a condition in a dataflow to control the execution process.

     

    If your issue is with mapping child records to parent records when the dataflow executes, I largely agree with @EricRegnier's suggestion. Please refer to his solution and check whether it helps in your scenario.

     

    Regards,

  • EricRegnier
    8,714 Most Valuable Professional
    Re: Why aren't dataflows atomic?

    Hi @Levente,

    To make sure I understand your scenario correctly: you have 4 dataflows, 1 for the parent entity and the others for the child entities? And you don't want the child records to be created if the parent does not exist in CDS? If so, you can achieve that behavior with dataflows if you configure the mappings correctly. Here is what you'll need to do:

    1. Make sure your child entity has a lookup field that relates to your parent entity.
    2. Have an alternate key defined on your parent entity. This will allow you to map the parent lookup field in your child dataflow's field mappings.
    3. Modify your child dataflow's field mappings to map the parent lookup field.

    So when the child dataflows run and try to create child records, any row for which the system cannot find the parent record to set the lookup will fail, and you won't be left with orphan child records. You can see which rows succeeded and which failed by looking at the history of the last refresh (click the little warning icon).
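
To make the effect of the lookup mapping concrete, here is a small in-memory Python sketch. It only simulates the outcome described above (child rows whose parent key cannot be resolved are rejected); it does not call Dataverse, and `resolve_child_rows`, `parent_ids_by_key` and the `parent_key` field are hypothetical names chosen for the example.

```python
def resolve_child_rows(child_rows, parent_ids_by_key, lookup_field="parent_key"):
    """Split child rows into accepted and rejected by parent lookup.

    parent_ids_by_key maps a parent's alternate-key value to its record id.
    A child row is accepted only if its lookup value matches a known parent.
    """
    accepted, rejected = [], []
    for row in child_rows:
        parent_id = parent_ids_by_key.get(row.get(lookup_field))
        if parent_id is None:
            # No matching parent: the row fails instead of becoming an orphan.
            rejected.append(row)
        else:
            accepted.append({**row, "parent_id": parent_id})
    return accepted, rejected
```

Note that this is a per-row check: rows with a valid parent still load even if other rows are rejected, so it prevents orphans but does not make the whole load all-or-nothing.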

    For more info, here's a nice article that explains it in more detail: https://dynamicscitizendeveloper.com/2020/04/21/how-to-map-a-lookup-field-in-a-power-platform-dataflow/

     

    Also, make sure the run schedule between the parent and child dataflows are far apart and include buffer to give the parent dataflow a chance to complete, before the child flows kicks in...

    Hope this helps a little...

  • Joel CustomerEffective
    3,224
    Re: Why aren't dataflows atomic?

    If you want an automatic data load approach that gives you the ability to not load the child records if the first step fails, consider using SSIS with the KingswaySoft adapter (www.kingswaysoft.com) or Azure Data Factory. Or Logic Apps. Or Power Automate if it is a smaller integration.

  • Levente
    19
    Re: Why aren't dataflows atomic?

    Stopping the remaining queries in a dataflow after one query fails is not my main concern, although sure, why waste time with them if the result would not be used? My problem is that if a query fails, I don't want any change made to CDS: either all the queries I designated to load data into CDS do so, or none of them does.

     

    Let me give you another example. I am loading data from JSON files. To avoid my dataflow getting slower and slower as files pile up, I keep track of the files I processed (loaded data from) and will not process them again. I have a ProcessedFiles entity in CDS where I store the list of files. One of the queries at the beginning of the dataflow will filter out from the input files the ones I already processed, then subsequent queries will only load data from the new files. Finally the dataflow loads new entities to CDS for both the data loaded and the files I loaded them from. However, if the query that loads the data from the files fails, the query that writes the new files to the ProcessedFiles entity still executes. 

     

    I will add a feature suggestion for this. Thanks!
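
The difference between the behaviour reported in this thread and the requested one can be sketched in a few lines of Python. This is a toy model under stated assumptions: queries are plain callables, the target is an in-memory dict of tables, and `load_independently` and `load_all_or_nothing` are hypothetical names. A real system would also need a transaction around the write phase, which this sketch does not model.

```python
def load_independently(queries, store):
    """Reported behaviour: every query that succeeds writes its rows."""
    failed = []
    for table, query in queries.items():
        try:
            rows = query()
        except Exception:
            failed.append(table)
            continue
        store.setdefault(table, []).extend(rows)
    return failed


def load_all_or_nothing(queries, store):
    """Requested behaviour: evaluate everything first, write only if all pass."""
    staged = {}
    for table, query in queries.items():
        try:
            staged[table] = query()
        except Exception:
            return False  # one failure: the store is left untouched
    for table, rows in staged.items():
        store.setdefault(table, []).extend(rows)
    return True
```

Staging every result before writing anything is what keeps a failed data query from still marking its files as processed.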

  • v-xida-msft
    Re: Why aren't dataflows atomic?

    Hi @Levente ,

    Do you want to stop the child queries from executing when the parent query fails in your data flow?

     

    If so, I am afraid there is currently no way to achieve this in PowerApps.

     

    At present, a dataflow has no option to stop the child queries from executing when the parent query fails. If you would like this feature to be added to PowerApps, please consider submitting an idea to the PowerApps Ideas Forum:

    https://powerusers.microsoft.com/t5/Power-Apps-Ideas/idb-p/PowerAppsIdeas

     

    Best regards,
