
Hello people,
I have a source dataset that is generated from user inputs and therefore needs to be validated before I can pass it on for analysis.
There are mostly two possible types of errors:
- logical errors ([Field1] = 0, [Field2] <> 0)
- lookup errors ([Field3] value not in a lookup list)
I wrote a Power Query M script that validates the data in three steps.
1. "if then" statements to find logical errors, and "Table.Contains" calls to find data that is not in the lookup tables, e.g.
E20 = if [Field1] = 0 and [Field2] <> 0 then 20 else null,
E21 = if [Field3] = x and [Field4] <> y then 21 else null,
E22 = if Table.Contains(tab_categories, [category = [out_category]]) = false then 22 else null,
2. Create a result list from all the conditions/if clauses and remove the null values:
results = List.RemoveNulls({E20, E21, E22})
3. Translate those results into error messages by looking up each element of the result list in tab_errorcodes and combining the texts.
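The three steps above could be sketched roughly as follows in a single query. This is only an illustration, assuming a source table "Source", a lookup table tab_categories with a "category" column, and a tab_errorcodes table with "code" and "message" columns (the exact column names are assumptions, not taken from the post):

```m
let
    // Buffer the lookup column once so it is not re-evaluated per row
    CategoryList = List.Buffer(tab_categories[category]),

    // Steps 1 + 2: evaluate all checks per row and drop the nulls
    AddErrors = Table.AddColumn(Source, "ErrorCodes", each
        List.RemoveNulls({
            if [Field1] = 0 and [Field2] <> 0 then 20 else null,
            if [Field3] = x and [Field4] <> y then 21 else null,
            if not List.Contains(CategoryList, [out_category]) then 22 else null
        })),

    // Step 3: translate each code via tab_errorcodes and combine the texts
    AddMessages = Table.AddColumn(AddErrors, "ErrorText", each
        Text.Combine(
            List.Transform([ErrorCodes], (c) =>
                tab_errorcodes{[code = c]}[message]),
            "; "))
in
    AddMessages
```

Using List.Contains on a buffered column list instead of Table.Contains on the full lookup table avoids rescanning the lookup source for every one of the 2k rows.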
The script only has to handle approx. 2k data rows, but it is very slow and at some point produces timeout errors.
I am looking for tips/hints on new approaches and best practices to get the task done, and would love to hear some feedback on my approach.
Greetings,
S
Hi Gao,
thank you for your feedback and the input regarding table buffering.
I will look into implementing table buffering in my solution and try to improve my overall flow setup.
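For other readers: a minimal sketch of the buffering idea mentioned above, assuming the same tab_categories lookup and Table.Contains check from the original post. Table.Buffer pins the lookup table in memory once, so repeated Table.Contains calls do not re-query the underlying source for every row:

```m
let
    // Load the lookup table into memory once, up front
    BufferedCategories = Table.Buffer(tab_categories),

    // The per-row check now scans the in-memory copy only
    Checked = Table.AddColumn(Source, "E22", each
        if Table.Contains(BufferedCategories, [category = [out_category]])
        then null
        else 22)
in
    Checked
```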
Greetings,
S