
Challenge 031 | Failed Flow Runs

During the summer break I had some time to improve some of our governance work. I built something to easily monitor failed flow runs, and improved my Power BI skills in the process. I want to share my findings and focus on the part that might be of interest to many Power Platform lovers.

Challenge Objectives

🎯 Understand the flow run table

🎯 Learn different ways to retrieve Dataverse data in Power BI

🎯 Improve monitoring your solutions

Introduction

Each Power Automate flow shows its run history on the details page. You also receive an e-mail when a flow fails. This is helpful functionality for the average maker, but department- and enterprise-scale solutions are deployed across environments as part of a healthy ALM approach. We regularly use service accounts as the primary owner of these solution flows, which results in the failure emails being sent to that mailbox.

Earlier this year Microsoft announced that the flow run history will be stored in Dataverse. Because of the summer break I thought it was a good moment to improve our monitoring of Center of Excellence kit flows and flows of solutions we maintain. We also added additional metrics (e.g. Azure resource recommendations), but for this challenge I will only focus on the failed flow runs.

How it started

My first thought was: I want to monitor some data, so let's create a Power BI report. I know there is a Dataverse connector that can query Dataverse data. Some simple filtering, et voilà.

Well, that didn't go as planned. If you open Power BI Desktop and use that Dataverse connector, you will not find the flowrun (the system name) table.

The reason for this is that the flow run table is an elastic table. These are technically different from Standard type Dataverse tables, as they are based on Azure Cosmos DB.

The reason Microsoft opted for an elastic table is probably that flow runs are not part of a complex relational schema, that an elastic table can handle a spike of many flow runs, and that it can use the Time-to-Live functionality of Azure Cosmos DB, which is great for short-lived historical data (28 days for flow runs).

Take two

So we need to find another way to query the data. Power BI also has an OData connector, and Dataverse supports OData. I had to find out whether this is the case just for standard tables, or for all tables. Fortunately there is a Tools section on the details page of every table. There I found the API link to table data. If you click on it, you will see an OData output of the table. The URL is just the query of the table, and by default it will show the top 10 items ($top=10).
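
For reference, that link follows the Dataverse Web API pattern and looks roughly like this (the environment URL is a placeholder):

https://YOURENVIRONMENTURL.crm4.dynamics.com/api/data/v9.2/flowruns?$top=10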

You might know OData from the filter queries in Power Automate. I can barely create these queries myself. My go-to tool for this is FetchXML Builder by Jonas Rapp, available in XrmToolBox. If you don't have these installed already, you should do that right away; the instructions are clear on the linked websites. As you can see, you are able to query the flowrun table. Nice!

For my scenario, I want to fetch the failed flow runs of the last 3 days, so my team can review them each day (including weekend days on a Monday) during our daily stand-up. For now we can just select a date; we will make this dynamic at a later stage.

You can quite easily create the query with the interface that FetchXML Builder offers. I created a query as shown above, which you can then run. You might not have any failed flows at the moment; you could create a simple flow like the one below to get some results. Be aware that it should be a solution-aware flow and that it takes some time before the failed flow run actually becomes available in the elastic table. Another good reason to add some overlap (3 days) in your monitoring.

Cool! We now need to get the OData query into Power BI. On the Power BI side it is pretty straightforward: just click the OData connector. You need to enter a URL, and you can use FetchXML Builder to convert the query you built into an OData URL.

After pasting the URL you only need to specify the credentials you want to use to retrieve the data. Select Organizational account here and sign in with your account (or a service account if you are in the same situation as I am).

After signing in you can connect and your data will be available in Power BI.

Make it Dynamic

The next phase is to make things a bit more dynamic. We want the date to be dynamic, but also the environments. I expected that some UtcNow and DateTime functions would be available in M (the Power Query Formula Language), so I wanted to focus first on making the call dynamic across environments.
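
As a quick preview, M does indeed have the date functions I needed. The expression used further down to get the start of the day three days ago, formatted as text, looks like this:

// start of today in UTC, minus 3 days, formatted as yyyy-MM-dd
DateTime.ToText(DateTimeZone.RemoveZone(Date.AddDays(Date.StartOfDay(DateTimeZone.FixedUtcNow()), -3)), [Format="yyyy-MM-dd"])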

For another metric we wanted to monitor, I had to get data directly from an API. The authentication method of the API was OAuth client credentials. In practice this means I have to retrieve a token with an expiration time of 60 seconds before I can use the actual API to retrieve data. This is when I learned how to create a function in Power BI. Pretty cool stuff actually. I learned a lot from PBI Guy and Chris Webb's BI Blog. You can do this by going into the advanced editor.

Adding a ()=> before the let will transform the Power Query formula into a function that you can refer to in another query.
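
A minimal sketch of what that looks like (the query body is just a placeholder with a hypothetical environment URL):

// the ()=> on the first line is the only change; the rest of the query stays as it was
()=>
let
	Source = OData.Feed("https://YOURENVIRONMENTURL.crm4.dynamics.com/api/data/v9.2/flowruns")
in
	Source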

The idea was to create one function with just the EnvironmentURL as an input parameter. You can try it yourself by adjusting the function from the sample above. You will see in the Queries section that your query has changed into a function. If you want to retrieve the data using the function, all you have to do is call the function and enter the input parameter.
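
A sketch of what that could look like, using a hypothetical function name GetFailedFlowRuns:

(EnvironmentURL)=>
let
	// the environment URL is now supplied by whoever calls the function
	Source = OData.Feed(EnvironmentURL & "/api/data/v9.2/flowruns")
in
	Source

Calling it from another query is then a single step:

GetFailedFlowRuns("https://YOURENVIRONMENTURL.crm4.dynamics.com")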

The advantage of this is that when we want to adjust something in our queries, we only need to update the function; all steps will be exactly the same for all the different environments. If we add another environment, we only need to duplicate the query, update the environment URL, and add it to an appended query. In the image below you can see I added a Production environment and created a FailedFlowRuns table by combining tables. Super easy. You can see that we need to sign in again when we connect to another environment, which makes sense.

There is a catch to this, and quite a big one, I found out. Once you are done with your data model and have created the report you want, you need to publish it to a workspace. My goal was to automatically refresh the report daily, half an hour before our daily stand-up, so we don't have to fiddle with updating datasets. You can set this up in the Power BI service.

Issue #2

You can see there is some sort of error. After some research I found out that the Power BI service cannot schedule refreshes for dynamic data sources. Because we build our OData feed URL from a variable, it gives this error.
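
To illustrate, the function sketched earlier builds its source URL at refresh time, which is exactly the kind of dynamic data source the service refuses to schedule (EnvironmentURL is the placeholder parameter from that sketch):

// the full URL is only known once EnvironmentURL is evaluated,
// so the service cannot validate the data source up front
OData.Feed(EnvironmentURL & "/api/data/v9.2/flowruns")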

As mentioned, I also added data to our report based on an API. Here the resolution was to add the base URL statically, and to pass the relative path and query parameters as options. This way the service can already retrieve the schema and you can provide your credentials for the service you want to connect to. The anatomy of the M expression is shown below. Note that because this connects to an API I use the Web.Contents function, but the thing I want to highlight is the relative path and the query parameters.

Web.Contents(
	"https://your.base.url/api",
	[
		// the base URL stays static; the path and query parameters are passed as options
		RelativePath = "yourTable",
		Query =
		[
			endtime = DateTime.ToText(DateTimeZone.RemoveZone(DateTimeZone.FixedUtcNow()))
		]
	]
)

Because of this Relative Path functionality, I was able to make the API data work. That means we are on the right track. So now we need to find out how we can implement a similar approach to our OData.Feed() function. Luckily there are some great Power BI blogs out there and I found this blog.

The OData URL I created was like the one below. The $select only retrieves the data I am interested in. The $expand links to another table (Workflow) to fetch the name and GUID of the Power Automate flow. The name is what we want to see, and the GUID is needed to create a direct link to the failed flow run. More on that later.

ODataSource = OData.Feed(
    "https://YOURENVIRONMENTURL.crm4.dynamics.com/api/data/v9.2/flowruns?$select=name,errormessage,status,endtime&$expand=Workflow($select=workflowidunique,name)&$filter=(status ne 'Succeeded' and endtime ge 2024-08-01T00:00:00)"
)

Because the base URL is a required field, we cannot make it dynamic. But the date filtering should be dynamic. So if we follow the instructions from the blog post, we end up with the following M function.

ODataSource = OData.Feed(
    "https://YOURENVIRONMENTURL.crm4.dynamics.com/api/data/v9.2/flowruns?$select=name,errormessage,status,endtime&$expand=Workflow($select=workflowidunique,name)&$filter=(status ne 'Succeeded' and endtime ge @endtimeInput)",
    null,
    [
		Implementation="2.0",
		Query = [
			#"@endtimeInput"= DateTime.ToText(DateTimeZone.RemoveZone(Date.AddDays(Date.StartOfDay(DateTimeZone.FixedUtcNow()), -3)), [Format="yyyy-MM-dd"])
		]
	]
)

As you can see, the change we made is that we now refer to @endtimeInput in our URL and specify it in the Query section. We add some DateTime functions to get all the failed flow runs of the last 3 days. This works like a charm. As you can see, I now get the one failed flow run I created with the simple Power Automate flow.

We still need to do some Power Query work to make it as functional as we want. We need to expand the Workflow column, add the environment name (because we will append the tables into one), create a link to the failed flow run, and parse the error message, because when there is an error message it is returned as JSON. A fully working query looks as follows.

let
	EnvironmentName = "Miguel Verweij's Environment",
	EnvironmentID= "c660f971-ea48-4359-96ee-41c4f7f24831",
	ODataSource = OData.Feed(
		"https://orge3e5e136.crm4.dynamics.com/api/data/v9.2/flowruns?$select=name,errormessage,status,endtime&$expand=Workflow($select=workflowidunique,name)&$filter=(status ne 'Succeeded' and endtime ge @endtimeInput)",
		null,
		[
			Implementation="2.0",
			Query = [
				#"@endtimeInput"= DateTime.ToText(DateTimeZone.RemoveZone(Date.AddDays(Date.StartOfDay(DateTimeZone.FixedUtcNow()), -3)), [Format="yyyy-MM-dd"])
			]
		]
	),
	RenameFlowrun = Table.RenameColumns(ODataSource,{{"name", "Flowrun"}}),
	ExpandWorkflow = Table.ExpandRecordColumn(RenameFlowrun, "Workflow", {"workflowidunique", "name"}, {"workflowidunique", "name"}),
	AddLink = Table.AddColumn(ExpandWorkflow, "Link", each "https://make.powerautomate.com/environments/" & EnvironmentID & "/flows/" & [workflowidunique] & "/runs/" & [Flowrun]),
	AddEnvironmentName = Table.AddColumn(AddLink, "Environment", each EnvironmentName),
	ParseErrormessage = Table.TransformColumns(AddEnvironmentName,{{"errormessage", Json.Document}}),
	ExpandErrormessage = Table.ExpandRecordColumn(ParseErrormessage, "errormessage", {"message"}, {"message"})
in
	ExpandErrormessage

The first 3 steps of the query require user input. The EnvironmentName is used in the AddEnvironmentName step to show the friendly name of the environment. The EnvironmentID part is a little more interesting: it is used to compose a link to the failed flow run, which is helpful so we can quickly navigate to the failed flow run from this report.

If you view the failed flow run from Power Automate, it will redirect you to a URL with all the information in it. The URL for the example failed flow run is shown below.

https://make.powerautomate.com/environments/c660f971-ea48-4359-96ee-41c4f7f24831/solutions/~preferred/flows/f5413d4c-761c-7d41-74fe-e6a3768bfc10/runs/08584770252668929067875377492CU244

You can see it contains the base URL, the environment, the solution, the flow, and the run. Because I wanted to make it as generic as possible, I was struggling a bit with the solution part, as this can be different for different flows within this environment. I found out that you can just leave this out of the URL and then it still works. Sweet! The following URL also works.

https://make.powerautomate.com/environments/c660f971-ea48-4359-96ee-41c4f7f24831/flows/f5413d4c-761c-7d41-74fe-e6a3768bfc10/runs/08584770252668929067875377492CU244

That means we can compose the URL with 3 parameters. The first, the environment ID, is something we did not retrieve from the OData feed, which is why I specify it manually. You might be able to fetch it via OData, but it is always the same, so performance-wise it is best to keep it a manual input. I also renamed the name column to Flowrun, because we would otherwise end up with two name columns (because of the workflow name). The composed link is shown in the AddLink step.

"https://make.powerautomate.com/environments/" & EnvironmentID & "/flows/" & [workflowidunique] & "/runs/" & [Flowrun]

The end result is shown below.

This now has all the information we need. We can now copy the query for multiple environments, and append the tables into 1 table to finally create our report.

The steps after the OData step are the same for each environment. We learned about functions earlier, so we can use that knowledge to keep those steps manageable in one place.

Let's create a function called TransformFailedFlowRuns. Just like we said earlier, we have 3 input parameters. We need to feed it the result of the OData step, so we make ODataInput the first input parameter; EnvironmentName and EnvironmentID are the second and third. You can simply copy over all the steps we added after the OData step. Your function will look like the M function shown below.

(ODataInput, EnvironmentName, EnvironmentID)=>
let
	RenameFlowrun = Table.RenameColumns(ODataInput,{{"name", "Flowrun"}}),
	ExpandWorkflow = Table.ExpandRecordColumn(RenameFlowrun, "Workflow", {"workflowidunique", "name"}, {"workflowidunique", "name"}),
	AddLink = Table.AddColumn(ExpandWorkflow, "Link", each "https://make.powerautomate.com/environments/" & EnvironmentID & "/flows/" & [workflowidunique] & "/runs/" & [Flowrun]),
	AddEnvironmentName = Table.AddColumn(AddLink, "Environment", each EnvironmentName),
	ParseErrormessage = Table.TransformColumns(AddEnvironmentName,{{"errormessage", Json.Document}}),
	ExpandErrormessage = Table.ExpandRecordColumn(ParseErrormessage, "errormessage", {"message"}, {"message"})
in
	ExpandErrormessage

If we now go back to the environment function, we can adjust it to the following.

let
	ODataSource = OData.Feed(
		"https://orge3e5e136.crm4.dynamics.com/api/data/v9.2/flowruns?$select=name,errormessage,status,endtime&$expand=Workflow($select=workflowidunique,name)&$filter=(status ne 'Succeeded' and endtime ge @endtimeInput)",
		null,
		[
			Implementation="2.0",
			Query = [
				#"@endtimeInput"= DateTime.ToText(DateTimeZone.RemoveZone(Date.AddDays(Date.StartOfDay(DateTimeZone.FixedUtcNow()), -3)), [Format="yyyy-MM-dd"])
			]
		]
	),
	TransformedData = TransformFailedFlowRuns(ODataSource, "Miguel Verweij's Environment", "c660f971-ea48-4359-96ee-41c4f7f24831")
in
	TransformedData

We now basically have two steps. We get the information from Dataverse using OData with dynamic Query parameters, and then transform that data in a concise way using our own function.

You can now duplicate the environment query, adjust the URL, the environment name, the environment ID, and the query name itself, and you fetch the failed flow runs from another environment. This is pretty scalable and easy to manage, as the transformation is maintained in our function. If you have created multiple environment queries, you can append the queries as new to get an overview table. I named this query FailedFlowRuns.
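
Under the hood, the appended query is simply a Table.Combine over the environment queries. A sketch with hypothetical query names:

let
	// "Append Queries as New" combines the per-environment queries into one overview table
	FailedFlowRuns = Table.Combine({DevelopmentFailedFlowRuns, ProductionFailedFlowRuns})
in
	FailedFlowRuns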

If we now save and close the Power Query editor, we are able to publish the report. I know it is still empty, but the main goal for now is to see that we can actually use the scheduled refresh functionality.

Setting up Power BI service

We now move back over to the Power BI service. You should not get annoying errors anymore.

You will see that you need to enter credentials for the 3 OData sources. The connections you create in Power BI Desktop are not ported over to the Power BI service, but you can sign in with the account you want to connect with. In my scenario this is the service account. Make sure to select the Skip test connection checkbox.

Once you have fixed all OData connections, you can set up your scheduled refresh. This is as simple as selecting your time zone and the time you want the dataset to refresh.

Create a Simple report

I have to say right away that I didn't spend too much time making a fancy report. I only need to see the failed flow runs per day. If you create something stunning, please let me know 😉.

First I add a multi-row card to show how many failed flows there are per day, to give an overview. I add the Flowrun column from the FailedFlowRuns table and the Day level of the endtime date hierarchy. Because it is a date hierarchy, it directly groups per day. I make sure Flowrun is set to Count (to show how many failed flow runs there are per day) and I rename it.

The visual will look as follows. Make sure you sort it descending and sort by the Day field. This might not be super fancy when a new month starts (31st on top, then 2, 1), but that works for me.

Then I want to add a table to see which items there are. I want to show the environment, flow name, status, message, endtime, and the link to the failed flow run. You need to adjust the endtime so it doesn't use the date hierarchy. It will look something like the image below.

We want the URL to be clickable. To do that, we need to go to Table view (second button on the left navigation bar), select the link column, and set the Data category to Web URL.

Now when you return to the report, you see it as a hyperlink. To get a small link icon, you can go to format your visual (make sure to select the table visual), search for URL icon and toggle the values on. It is now a bit easier to use.

You can now easily see all the failed flow runs, filter by day, and directly navigate to the failed flow run to see what is going on. Just a new publish action and it is available online. You can add the report to your Teams channel if you want to.

Additional Information

Microsoft also has the Automation Center in preview. This is a great starting point to improve your monitoring as well. It gives a more high-level overview, whereas I think that being able to query the exact data you are interested in adds value, also as a developer (rather than an admin), to keep your solutions up and running.

Key Takeaways

👉🏻 Elastic Tables require OData to fetch the data

👉🏻 Base URL is required for Scheduled Refreshes

👉🏻 Custom Functions can keep things manageable
