Version: 2.4.0

Observe Workflow


Before you begin#

You must schedule a workflow before you can observe it. To learn more, see the documentation on scheduling workflows.


After scheduling a workflow, you can track the status of the workflow run from the Runs tab in Litmus Workflows. A run currently displays one of the following statuses:

  • Failed
  • Running
  • Completed
Workflow Runs Table showing a Running Workflow

You can analyze a workflow run using two methods: the workflow run graph and the analytics available from the Runs tab.

Visualize the workflow run graph#

After scheduling a workflow, you can click on the Show the workflow option or click on the workflow name to see the real-time graph of the workflow.

Graph of Podtato Head workflow

The graph shows useful information such as:

  • Phase of individual nodes.
  • Total time taken for the nodes to execute.
  • Structure of the experiments (Serial or Parallel experiments).

You can also visualize non-chaos workflows. The logs of individual nodes are available for these workflows as well.

Graph of a non-chaos workflow

View logs of individual nodes#

You can click on a node to get the logs of that particular step. If the revert-chaos step is disabled, the complete logs are available, including the runner pod logs and the chaos logs.

Podtato Head workflow with Logs

View chaos results#

Once the experiment completes, the Chaos Results are also available alongside the logs. The Chaos Results are directly fetched from the ChaosResult CRD.

Podtato Head workflow with chaos logs and chaos result of generic/pod-delete experiment

Resilience Score Calculation#

A Resilience Score measures how resilient your workflow run is, taking into account all of its chaos experiments and their individual results. The calculation uses the weight assigned to each experiment (in the range 1-10); the weights are relative to each other.

Once a weight has been assigned to an experiment, we take the Probe Success Percentage reported for that experiment after the chaos run and multiply it by the weight. The product is the total resilience result for that experiment.

Total Resilience for one single experiment = (Weight Given to that experiment * Probe Success Percentage)

If an experiment doesn't have a probe, the Probe Success Percentage is derived from the experiment verdict: 100 if the experiment passed, 0 otherwise.

The Final Resilience Score is calculated by dividing the total test result by the sum of all the weights of all the experiments combined in a single workflow.

For example, if we consider two experiments in a workflow, here is what the calculation would look like.

Assume the Probe Success Percentage is 100 for both experiments:

Experiment   Weight   Probe Success Percentage   Total Test Result
exp1         3        100                        (3 * 100) = 300
exp2         9        100                        (9 * 100) = 900

Weight Sum = 3 + 9 = 12
Total Test Result = 300 + 900 = 1200

Resilience Score = Total Test Result / Weight Sum
                 = 1200 / 12
                 = 100%
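
The calculation described in this section can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Litmus's actual implementation: the `Experiment` class and the `experiment_result` and `resilience_score` functions are hypothetical names introduced here, and the sketch assumes that a missing probe is represented by `None`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Experiment:
    """Hypothetical container for the inputs of the calculation."""
    name: str
    weight: int                                        # relative weight, 1-10
    probe_success_percentage: Optional[float] = None   # None: no probe defined
    passed: bool = False                               # verdict, used only without a probe


def experiment_result(exp: Experiment) -> float:
    """Weight * Probe Success Percentage for a single experiment."""
    if exp.probe_success_percentage is None:
        # Without a probe, the percentage falls back to the verdict: 100 or 0.
        percentage = 100.0 if exp.passed else 0.0
    else:
        percentage = exp.probe_success_percentage
    return exp.weight * percentage


def resilience_score(experiments: List[Experiment]) -> float:
    """Total test result divided by the sum of all weights."""
    weight_sum = sum(exp.weight for exp in experiments)
    if weight_sum == 0:
        raise ValueError("at least one experiment with a non-zero weight is required")
    total_test_result = sum(experiment_result(exp) for exp in experiments)
    return total_test_result / weight_sum


if __name__ == "__main__":
    # Same inputs as the example above: weights 3 and 9, both at 100.
    runs = [
        Experiment("exp1", weight=3, probe_success_percentage=100),
        Experiment("exp2", weight=9, probe_success_percentage=100),
    ]
    print(resilience_score(runs))  # 1200 / 12 = 100.0
```

Under the same assumptions, an exp1 with a Probe Success Percentage of 50 and a probe-less exp2 that passed would give (3 * 50 + 9 * 100) / 12 = 87.5, which shows how a heavier weight pulls the score toward that experiment's result.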

Analytics from the runs tab#

Once the workflow run completes, you can click the Show the analytics option in the Runs tab of Litmus Workflows. This opens a Workflow Dashboard, which can also be accessed from the Analytics section and is explained further in the Analytics documentation. These analytics can be crucial for analyzing Cron Workflows.
