Version: 2.0.0

ChaosEngine


The ChaosEngine CR is the main user-facing chaos custom resource. It is namespace-scoped and holds the information that governs how chaos experiments are executed. It connects an application instance with one or more chaos experiments and lets users specify run-level details (overriding experiment defaults, providing new environment variables and volumes, choosing whether to delete or retain experiment pods, etc.). This CR is also updated/patched with the status of the chaos experiments, making it the single source of truth with respect to the chaos.

Prerequisites

To understand the concepts of ChaosEngine better, make sure you are familiar with the Chaos Experiment Custom Resources.

ChaosEngine

State Specification

This section describes the fields in the ChaosEngine spec and the possible values that can be set against the same.

Field: .spec.engineState
Description: Flag to control the state of the chaosengine
Type: Mandatory
Range: active, stop
Default: active
Notes: The engineState in the spec is a user-defined flag to trigger chaos. Setting it to active ensures successful execution of chaos. Patching it with stop aborts ongoing experiments. It has a corresponding flag in the chaosengine status field, called engineStatus, which is updated by the controller based on the actual state of the ChaosEngine.
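
As a minimal sketch of the abort flow described in the notes, the snippet below builds a JSON merge-patch body that sets .spec.engineState. The helper name engine_state_patch is hypothetical and not part of Litmus; only the field name and the two allowed values come from the entry above.

```python
import json

# Allowed values, taken from the Range row of the engineState entry.
VALID_ENGINE_STATES = ("active", "stop")


def engine_state_patch(state):
    """Return a JSON merge-patch body that sets .spec.engineState.

    Hypothetical helper for illustration; it only builds the patch text
    and does not talk to a cluster.
    """
    if state not in VALID_ENGINE_STATES:
        raise ValueError(
            "engineState must be one of %s, got %r" % (VALID_ENGINE_STATES, state)
        )
    return json.dumps({"spec": {"engineState": state}})


# Patch body that would abort ongoing experiments.
abort_patch = engine_state_patch("stop")
print(abort_patch)
```

A body like this could, for example, be passed to kubectl patch with --type merge against the ChaosEngine resource; the engine name and namespace would be whatever your own setup uses.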

Application Specification

Field: .spec.appinfo.appns
Description: Flag to specify namespace of application under test
Type: Optional
Range: user-defined (type: string)
Default: n/a
Notes: The appns in the spec specifies the namespace of the application under test (AUT). Usually provided as a quoted string. It is optional for infra chaos.

Field: .spec.appinfo.applabel
Description: Flag to specify unique label of application under test
Type: Optional
Range: user-defined (type: string) (pattern: "label_key=label_value")
Default: n/a
Notes: The applabel in the spec specifies a unique label of the AUT. Usually provided as a quoted string of pattern key=value. Note that if multiple applications share the same label within a given namespace, the AUT is filtered based on the presence of the chaos annotation litmuschaos.io/chaos: "true". If, however, the annotationCheck is disabled, then a random application (pod) sharing the specified label is selected for chaos. It is optional for infra chaos.

Field: .spec.appinfo.appkind
Description: Flag to specify resource kind of application under test
Type: Optional
Range: deployment, statefulset, daemonset, deploymentconfig, rollout
Default: n/a (depends on app type)
Notes: The appkind in the spec specifies the Kubernetes resource type of the app deployment. The Litmus ChaosOperator supports chaos on deployments, statefulsets and daemonsets. For some experiments, the application health check routines depend on the resource type. It is optional for infra chaos.

Field: .spec.auxiliaryAppInfo
Description: Flag to specify one or more app namespace-label pairs whose health is also monitored as part of the chaos experiment, in addition to the primary application specified in .spec.appinfo. NOTE: If the auxiliary applications are deployed in namespaces other than that of the AUT, ensure that the chaosServiceAccount is bound to a cluster role and has adequate permissions to list pods in other namespaces.
Type: Optional
Range: user-defined (type: string) (pattern: "namespace:label_key=label_value")
Default: n/a
Notes: The auxiliaryAppInfo in the spec specifies a (comma-separated) list of namespace-label pairs for downstream (dependent) apps of the primary app specified in .spec.appinfo in case of pod-level chaos experiments. In case of infra-level chaos experiments, this flag specifies those apps that may be directly impacted by chaos and upon which health checks are necessary.

Note: Irrespective of the nature of the chaos experiment, i.e., pod-level (single-app impact/smaller blast radius) or infra-level (multi-app impact/larger blast radius), the .spec.appinfo is a must-fill, so that the experiment points to at least one primary app whose health is measured as an indicator of the resiliency / success of the chaos experiment.
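
The two string patterns used in this section, "label_key=label_value" for applabel and a comma-separated list of "namespace:label_key=label_value" for auxiliaryAppInfo, can be checked before a ChaosEngine is applied. The parser below is only a sketch: the function names and the sample namespaces and labels are hypothetical, and it does not reproduce whatever validation the operator itself performs.

```python
from typing import List, Tuple


def parse_label(label: str) -> Tuple[str, str]:
    """Split a "label_key=label_value" string into (key, value)."""
    key, sep, value = label.strip().partition("=")
    if not sep or not key or not value:
        raise ValueError("expected label_key=label_value, got %r" % label)
    return key, value


def parse_auxiliary_app_info(raw: str) -> List[Tuple[str, str, str]]:
    """Parse a comma-separated list of "namespace:label_key=label_value".

    Returns a list of (namespace, label_key, label_value) tuples.
    """
    pairs = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray or trailing commas
        namespace, sep, label = item.partition(":")
        if not sep or not namespace:
            raise ValueError(
                "expected namespace:label_key=label_value, got %r" % item
            )
        key, value = parse_label(label)
        pairs.append((namespace, key, value))
    return pairs


# Illustrative values only; use the namespaces and labels of your own apps.
for entry in parse_auxiliary_app_info("default:app=nginx, monitoring:app=prometheus"):
    print(entry)
```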

RBAC Specification

Field: .spec.chaosServiceAccount
Description: Flag to specify serviceaccount used for chaos experiment
Type: Mandatory
Range: user-defined (type: string)
Default: n/a
Notes: The chaosServiceAccount in the spec specifies the name of the serviceaccount mapped to a role/clusterRole with enough permissions to execute the desired chaos experiment. The minimum permissions needed for any given experiment are provided in the .spec.definition.permissions field of the respective chaosexperiment CR.

Runtime Specification

Field: .spec.annotationCheck
Description: Flag to control annotationChecks on applications as prerequisites for chaos
Type: Optional
Range: true, false
Default: true
Notes: The annotationCheck in the spec controls whether or not the operator checks for the annotation "litmuschaos.io/chaos" to be set against the application under test (AUT). Setting it to true ensures the check is performed, with chaos being skipped if the app is not annotated, while setting it to false suppresses this check and proceeds with chaos injection.

Field: .spec.terminationGracePeriodSeconds
Description: Flag to control terminationGracePeriodSeconds for the chaos pods (abort case)
Type: Optional
Range: integer value
Default: 30
Notes: The terminationGracePeriodSeconds in the spec controls the terminationGracePeriodSeconds for the chaos resources in the abort case. Chaos pods contain steps that revert the chaos upon abort, and they continuously watch for termination signals. Set terminationGracePeriodSeconds so that the chaos pods have enough time to complete the revert before they are terminated.

Field: .spec.jobCleanupPolicy
Description: Flag to control cleanup of chaos experiment job post execution of chaos
Type: Optional
Range: delete, retain
Default: delete
Notes: The jobCleanupPolicy controls whether or not the experiment pods are removed once execution completes. Set to retain for debug purposes (in the absence of standard logging mechanisms).
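
To illustrate how the defaults and ranges listed above combine with user-supplied values, here is a small sketch. It is not operator code: the function name resolve_runtime_spec is hypothetical, and representing annotationCheck as the strings "true"/"false" is an assumption made for this example.

```python
# Defaults and allowed values copied from the Runtime Specification entries.
# Assumption: annotationCheck is modelled as a string here.
RUNTIME_DEFAULTS = {
    "annotationCheck": "true",
    "terminationGracePeriodSeconds": 30,
    "jobCleanupPolicy": "delete",
}
ALLOWED_VALUES = {
    "annotationCheck": ("true", "false"),
    "jobCleanupPolicy": ("delete", "retain"),
}


def resolve_runtime_spec(user_spec):
    """Overlay user-supplied runtime fields on the documented defaults."""
    unknown = set(user_spec) - set(RUNTIME_DEFAULTS)
    if unknown:
        raise ValueError("not a runtime field: %s" % sorted(unknown))

    resolved = dict(RUNTIME_DEFAULTS)
    resolved.update(user_spec)

    for field, allowed in ALLOWED_VALUES.items():
        if resolved[field] not in allowed:
            raise ValueError("%s must be one of %s" % (field, allowed))

    grace = resolved["terminationGracePeriodSeconds"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(grace, bool) or not isinstance(grace, int) or grace < 0:
        raise ValueError("terminationGracePeriodSeconds must be a non-negative integer")
    return resolved


# Keep experiment pods around for debugging; other fields fall back to defaults.
print(resolve_runtime_spec({"jobCleanupPolicy": "retain"}))
```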

Component Specification

Field: .spec.components.runner.image
Description: Flag to specify image of ChaosRunner pod
Type: Optional
Range: user-defined (type: string)
Default: n/a (refer Notes)
Notes: The .components.runner.image allows developers to specify their own debug runner images. Defaults for the runner image can be enforced via the operator env CHAOS_RUNNER_IMAGE.

Field: .spec.components.runner.imagePullPolicy
Description: Flag to specify imagePullPolicy for the ChaosRunner
Type: Optional
Range: Always, IfNotPresent
Default: IfNotPresent
Notes: The .components.runner.imagePullPolicy allows developers to specify the pull policy for chaos-runner. Set to Always during debug/test.

Field: .spec.components.runner.imagePullSecrets
Description: Flag to specify imagePullSecrets for the ChaosRunner
Type: Optional
Range: user-defined (type: []corev1.LocalObjectReference)
Default: n/a
Notes: The .components.runner.imagePullSecrets allows developers to specify the imagePullSecret name for ChaosRunner.

Field: .spec.components.runner.runnerAnnotations
Description: Annotations to be set on the runner pod that is created
Type: Optional
Range: user-defined (type: map[string]string)
Default: n/a
Notes: The .components.runner.runnerAnnotations allows developers to specify custom annotations for the runner pod.

Field: .spec.components.runner.args
Description: Specify the args for the ChaosRunner Pod
Type: Optional
Range: user-defined (type: []string)
Default: n/a
Notes: The .components.runner.args allows developers to specify their own debug runner args.

Field: .spec.components.runner.command
Description: Specify the commands for the ChaosRunner Pod
Type: Optional
Range: user-defined (type: []string)
Default: n/a
Notes: The .components.runner.command allows developers to specify their own debug runner commands.

Field: .spec.components.runner.configMaps
Description: Configmaps passed to the chaos runner pod
Type: Optional
Range: user-defined (type: {name: string, mountPath: string})
Default: n/a
Notes: The .spec.components.runner.configMaps provides a means to insert config information into the runner pod.

Field: .spec.components.runner.secrets
Description: Kubernetes secrets passed to the chaos runner pod
Type: Optional
Range: user-defined (type: {name: string, mountPath: string})
Default: n/a
Notes: The .spec.components.runner.secrets provides a means to push secrets (typically project ids, access credentials, etc.) into the chaos runner pod. These are especially useful in case of platform-level/infra-level chaos experiments.

Field: .spec.components.runner.nodeSelector
Description: Node selectors for the runner pod
Type: Optional
Range: Labels in the form of label key=value
Default: n/a
Notes: The .spec.components.runner.nodeSelector contains labels of the node on which the runner pod should be scheduled. Typically used in case of infra/node level chaos.

Field: .spec.components.runner.resources
Description: Specify the resource requirements for the ChaosRunner pod
Type: Optional
Range: user-defined (type: corev1.ResourceRequirements)
Default: n/a
Notes: The .spec.components.runner.resources contains the resource requirements for the ChaosRunner Pod, where resource requests and limits for the pod can be provided.

Field: .spec.components.runner.tolerations
Description: Tolerations for the runner pod
Type: Optional
Range: user-defined (type: []corev1.Toleration)
Default: n/a
Notes: The .spec.components.runner.tolerations provides tolerations for the runner pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos.
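
The runner fields above can be pictured as one nested structure under .spec.components. The sketch below assembles such a structure as a Python dictionary and prints it as JSON. The helper name and every concrete value (image name, secret name, node label, taint key, resource quantities) are hypothetical placeholders; only the field names and their types follow the entries above.

```python
import json


def runner_components(image, pull_secret, node_labels, taint_key):
    """Build the .spec.components block for a debug runner pod.

    Hypothetical helper; all argument values are supplied by the caller.
    """
    return {
        "runner": {
            "image": image,
            # "Always" is the value suggested above for debug/test runs.
            "imagePullPolicy": "Always",
            # []corev1.LocalObjectReference: a list of {name: ...} objects.
            "imagePullSecrets": [{"name": pull_secret}],
            # map[string]string of labels on the target node.
            "nodeSelector": dict(node_labels),
            # corev1.ResourceRequirements: requests and limits.
            "resources": {
                "requests": {"cpu": "100m", "memory": "64Mi"},
                "limits": {"cpu": "250m", "memory": "128Mi"},
            },
            # []corev1.Toleration so the pod can land on a tainted node.
            "tolerations": [
                {"key": taint_key, "operator": "Exists", "effect": "NoSchedule"}
            ],
        }
    }


components = runner_components(
    image="example.registry.local/chaos-runner:debug",  # placeholder image
    pull_secret="regcred",                              # placeholder secret name
    node_labels={"kubernetes.io/hostname": "node-1"},   # placeholder node label
    taint_key="chaos-only",                             # placeholder taint key
)
print(json.dumps({"spec": {"components": components}}, indent=2))
```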

Experiment Specification

Field: .spec.experiments[].name
Description: Name of the chaos experiment CR
Type: Mandatory
Range: user-defined (type: string)
Default: n/a
Notes: The experiment[].name specifies the chaos experiment to be executed by the ChaosOperator.

Field: .spec.experiments[].spec.components.env
Description: Environment variables passed to the chaos experiment
Type: Optional
Range: user-defined (type: {name: string, value: string})
Default: n/a
Notes: The experiment[].spec.components.env specifies the array of tunables passed to the experiment pods. Though the field is optional from a chaosengine definition viewpoint, it is almost always necessary to provide experiment tunables via this definition. Some of the env variables override the defaults in the experiment CR, while others are mandatory additions that fill in placeholders/empty values in the experiment CR. For a list of "mandatory" & "optional" env for an experiment, refer to the respective experiment documentation.

Field: .spec.experiments[].spec.components.configMaps
Description: Configmaps passed to the chaos experiment
Type: Optional
Range: user-defined (type: {name: string, mountPath: string})
Default: n/a
Notes: The experiment[].spec.components.configMaps provides a means to insert config information into the experiment. The configmaps definition is validated for correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.

Field: .spec.experiments[].spec.components.secrets
Description: Kubernetes secrets passed to the chaos experiment
Type: Optional
Range: user-defined (type: {name: string, mountPath: string})
Default: n/a
Notes: The experiment[].spec.components.secrets provides a means to push secrets (typically project ids, access credentials, etc.) into the experiment pods. These are especially useful in case of platform-level/infra-level chaos experiments. The secrets definition is validated for correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.

Field: .spec.experiments[].spec.components.experimentImage
Description: Override the image of the chaos experiment
Type: Optional
Range: string
Default: n/a
Notes: The experiment[].spec.components.experimentImage overrides the experiment image for the chaosexperiment.

Field: .spec.experiments[].spec.components.experimentImagePullSecrets
Description: Flag to specify imagePullSecrets for the ChaosExperiment
Type: Optional
Range: user-defined (type: []corev1.LocalObjectReference)
Default: n/a
Notes: The experiment[].spec.components.experimentImagePullSecrets allows developers to specify the imagePullSecret name for ChaosExperiment.

Field: .spec.experiments[].spec.components.nodeSelector
Description: Provide the node selector for the experiment pod
Type: Optional
Range: Labels in the form of label key=value
Default: n/a
Notes: The experiment[].spec.components.nodeSelector contains labels of the node on which the experiment pod should be scheduled. Typically used in case of infra/node level chaos.

Field: .spec.experiments[].spec.components.statusCheckTimeouts
Description: Provides the timeout and retry values for the status checks. Defaults to 180s & 90 retries (2s per retry)
Type: Optional
Range: It contains values in the form delay: int, timeout: int
Default: delay: 2s and timeout: 180s
Notes: The experiment[].spec.components.statusCheckTimeouts overrides the status timeouts inside chaosexperiments. It contains timeout & delay in seconds.

Field: .spec.experiments[].spec.components.resources
Description: Specify the resource requirements for the ChaosExperiment pod
Type: Optional
Range: user-defined (type: corev1.ResourceRequirements)
Default: n/a
Notes: The experiment[].spec.components.resources contains the resource requirements for the ChaosExperiment Pod, where resource requests and limits for the pod can be provided.

Field: .spec.experiments[].spec.components.experimentAnnotations
Description: Annotations to be set on the experiment pod that is created
Type: Optional
Range: user-defined (type: label key=value)
Default: n/a
Notes: The experiment[].spec.components.experimentAnnotations allows developers to specify custom annotations for the experiment pod.

Field: .spec.experiments[].spec.components.tolerations
Description: Tolerations for the experiment pod
Type: Optional
Range: user-defined (type: []corev1.Toleration)
Default: n/a
Notes: The experiment[].spec.components.tolerations provides tolerations for the experiment pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos.

Field: .spec.experiments[].spec.probe
Description: Declarative way to define the chaos hypothesis
Type: Optional
Range: user-defined
Default: n/a
Notes: The .probe allows developers to specify the chaos hypothesis. It supports four types: cmdProbe, k8sProbe, httpProbe, promProbe. For more details refer
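
The relationship between the statusCheckTimeouts values (a 180s timeout with a 2s delay gives 90 retries) and the shape of an experiments[] entry can be sketched as follows. The helper names are hypothetical, and the experiment name and env variable used in the example are illustrative; consult the respective experiment documentation for the tunables it actually accepts.

```python
def status_check_retries(timeout_s, delay_s):
    """Number of status-check attempts that fit in the timeout window."""
    if delay_s <= 0 or timeout_s < 0:
        raise ValueError("delay must be positive and timeout non-negative")
    return timeout_s // delay_s


def experiment_entry(name, env, delay_s=2, timeout_s=180):
    """Build one .spec.experiments[] item with env overrides.

    env is a plain dict; it is converted to the {name, value} list form
    described for experiments[].spec.components.env, with values as strings.
    """
    return {
        "name": name,
        "spec": {
            "components": {
                "env": [{"name": k, "value": str(v)} for k, v in env.items()],
                "statusCheckTimeouts": {"delay": delay_s, "timeout": timeout_s},
            }
        },
    }


# Documented defaults: 180s timeout, 2s delay.
print(status_check_retries(180, 2))

# Illustrative entry; name and tunable are examples only.
entry = experiment_entry("pod-delete", {"TOTAL_CHAOS_DURATION": 30})
print(entry["spec"]["components"]["env"])
```

With the documented defaults the first call yields 90, matching the "90 retries (2s per retry)" figure in the field description.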

Summary

The ChaosEngine CR is the user-facing CR that binds the application instance to the ChaosExperiment. It defines the run policies and also holds the status of your experiment. This CR lets you customize the experiment according to your needs, since it can override some of the default characteristics/tunables in your experiment CR.

This CR is also updated/patched with status of the chaos experiments, making it the single source of truth with respect to the chaos.
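
As a closing sketch, the binding between an application and an experiment can be expressed as a minimal manifest containing the fields marked Mandatory in this document (engineState, chaosServiceAccount, experiments[].name) plus the appinfo block. The apiVersion string, the helper names, and all example values are assumptions for illustration and do not appear in the text above.

```python
import json


def minimal_chaos_engine(name, namespace, app_label, app_kind,
                         service_account, experiment):
    """Assemble a minimal ChaosEngine manifest as a dictionary."""
    return {
        # Assumption: API group/version commonly used for Litmus CRs.
        "apiVersion": "litmuschaos.io/v1alpha1",
        "kind": "ChaosEngine",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "engineState": "active",
            "appinfo": {
                "appns": namespace,
                "applabel": app_label,
                "appkind": app_kind,
            },
            "chaosServiceAccount": service_account,
            "experiments": [{"name": experiment}],
        },
    }


def missing_mandatory_fields(engine):
    """List the mandatory spec fields that are absent or empty."""
    spec = engine.get("spec", {})
    missing = [f for f in ("engineState", "chaosServiceAccount") if not spec.get(f)]
    experiments = spec.get("experiments") or []
    if not experiments or any(not e.get("name") for e in experiments):
        missing.append("experiments[].name")
    return missing


# Placeholder names throughout; substitute those of your own cluster.
engine = minimal_chaos_engine(
    name="nginx-chaos",
    namespace="default",
    app_label="app=nginx",
    app_kind="deployment",
    service_account="pod-delete-sa",
    experiment="pod-delete",
)
assert missing_mandatory_fields(engine) == []
print(json.dumps(engine, indent=2))
```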
