Hi SpaceAnalytics,
Glad to hear that you were able to establish the connection successfully!
I would say that it depends on a few things:
1. Will you be using the Dataverse tables that are also included by default?
2. How many Dataverse tables do you have?
3. How many Power BI users do you have?
4. How many layers will your solution have (bronze, silver, gold)?
The reason I'm asking is that if you have a lot of Dataverse tables you're not planning to use, you'll have to design your entire development around explicitly declaring which Lakehouse objects you want to work with and ignoring the rest.
Also, I've noticed that a Lakehouse containing 800+ tables causes heavy latency when working in the Lakehouse and the SQL endpoint (I've only tried this on an F2 capacity).
If you're in this scenario, my suggestion is to create a new Lakehouse and use notebooks or pipelines to replicate only the tables you actually need into it.
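To illustrate, here's a minimal PySpark sketch of that replication idea. It assumes the Dataverse-linked Lakehouse is attached to the notebook, the new (curated) Lakehouse is set as the notebook's default Lakehouse, and the Lakehouse/table names are placeholders for your own:

```python
# Minimal sketch: copy only the tables you need into a smaller, curated Lakehouse.
# Assumptions: the Dataverse-linked Lakehouse is attached to the notebook, the curated
# Lakehouse is the notebook's default, and all names below are placeholders.
# `spark` is the SparkSession that Fabric notebooks provide out of the box.

tables_to_replicate = ["account", "contact", "salesorder"]  # only the tables you actually need

for table_name in tables_to_replicate:
    # Read the Delta table from the Dataverse-linked Lakehouse...
    df = spark.read.table(f"Dataverse_Lakehouse.{table_name}")

    # ...and write a copy into the default (curated) Lakehouse as a Delta table.
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)
```

A pipeline copy can achieve the same result without spinning up Spark compute, which may matter on smaller capacities.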
I also strongly recommend that you use separate workspaces for your solution layers, as RBAC is currently handled at the workspace level.
I have also described in this thread how to remove Dataverse tables from the "Microsoft OneLake" Synapse Link that gets created when you use the "Link to Fabric" functionality.
However, since I have not yet received confirmation from Microsoft on whether this is intended functionality, I do not recommend moving forward with it right now.
All of that being said, the CSV approach with a separate ADLS Gen2 account certainly does require a bigger development and maintenance effort.
During a Fabric AMA session with the Azure FastTrack engineers, I raised this question with them.
The response was that they don't officially advise against this approach, but their recommendation in this specific case was to avoid using Spark notebooks for converting from CSV to Delta, as this is compute intensive (use pipelines instead).
This approach certainly also entails more compute usage, as you're no longer getting your data into Delta Parquet format "for free" as you do with the Link to Fabric approach.
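For context, this is roughly what the CSV-to-Delta step looks like if you were to do it in a Spark notebook anyway; the ADLS path and table name are placeholder assumptions. Every file has to be read, parsed and rewritten, which is why the recommendation was to use a pipeline instead:

```python
# Minimal sketch of a CSV-to-Delta conversion in a Fabric notebook.
# `spark` is the SparkSession provided by the notebook; the path and table name are placeholders.
csv_path = "abfss://<container>@<storageaccount>.dfs.core.windows.net/dataverse-export/salesorder/*.csv"

df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")  # schema inference makes an extra pass over the files
      .csv(csv_path))

# Rewrite the data as a Delta table in the Lakehouse.
df.write.format("delta").mode("overwrite").saveAsTable("salesorder")
```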
In short: Yes, I would agree that option 1 is the overall best solution, and it's clearly the desired architecture from Microsoft's point of view. However, I would recommend that you take your data volume, ETL process and number of end users into consideration: if you have a lot of data with heavy ETL processes and a lot of end users, you might want to consider creating two separate Fabric capacities.
For example, instead of using an F64 for your entire solution, you might want to consider using an F32 for ETL and an F16 for Power BI datasets.
On the option 1 note, I've received another reply from the Microsoft Dynamics / Finance and Operations support engineers:
"
We have the following updates from the engineering team.
We do have work items planned to reduce the sync latency. However we are unable to provide an estimated time of completion.
Kindly note that, currently, there isn't much we can do to reduce the overall latency.
Appreciate your understanding and cooperation in this matter.
"
My follow-up question was:
"
Thank you for the update and information, very much appreciated!
Can I interpret the response from the engineering team as meaning that the current latency depends solely on the size of the source table?
Or is there any risk that the latency could increase depending on the number of tables in our “Microsoft OneLake” Synapse Link?
On this subject, according to your documentation, all Dataverse tables which have “Track changes” enabled are included by default when using “Link to Fabric”.
Is it possible to manually remove individual tables from the Dataverse source, if we go to Manage Tables in the “Microsoft OneLake” Synapse Link?
"
Let me know if you have more questions.
Best of luck!