- Identify weaknesses before they affect customers
- Validate auto-scaling, failover, and recovery processes
- Ensure SLAs are met under adverse conditions
AWS FIS supports both simple and complex scenarios—from terminating individual EC2 instances to simulating an Availability Zone outage.
Key Benefits
| Benefit | Description |
|---|---|
| Fully Managed | No need to provision infrastructure for fault injection |
| Native AWS Integration | Works with IAM, CloudWatch Alarms, AWS X-Ray, EventBridge, and more |
| Prebuilt & Customizable | Use built-in templates or define your own experiments |
| Multi-Environment Support | Inject faults in EC2, ECS, EKS, RDS, Lambda, and more |
AWS FIS Architecture
AWS FIS integrates seamlessly with your AWS environment:
- CloudWatch Alarms: Trigger experiments or remediation workflows when thresholds are crossed.
- AWS X-Ray: Correlate faults with distributed traces to pinpoint failures.
- EventBridge: Automate experiment scheduling and notifications.
You can leverage AWS FIS to simulate:
- EC2 instance terminations and CPU/network stress
- ECS task and EKS pod failures
- RDS instance failovers
- Availability Zone outages
- Network latency and packet loss between resources
Managing Experiments
AWS FIS experiments are defined as JSON documents. You can manage them through:
| Interface | Command / Action |
|---|---|
| AWS Management Console | Create, configure, and run experiments via the web UI |
| AWS CLI | `aws fis create-experiment-template` |
| AWS CloudFormation | Use the `AWS::FIS::ExperimentTemplate` resource |
Always run experiments in a staging or non-production environment first. Fault injection can cause service interruptions!
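To make the "JSON document" idea concrete, here is a minimal sketch of an experiment template built in Python and serialized to JSON. It terminates one EC2 instance selected by tag and stops if a CloudWatch alarm fires. The function name, the target and action labels, the tag values, and both ARNs are placeholders assumed for illustration. Check the field names against the current AWS FIS documentation before using them.

```python
import json


def build_experiment_template(role_arn, alarm_arn,
                              tag_key="Environment", tag_value="staging"):
    """Return a dict shaped like an AWS FIS experiment template.

    All names and ARNs passed in here are hypothetical placeholders.
    """
    return {
        "description": "Terminate one tagged EC2 instance in staging",
        # Targets: which resources the experiment may touch.
        "targets": {
            "stagingInstances": {
                "resourceType": "aws:ec2:instance",
                "resourceTags": {tag_key: tag_value},
                "selectionMode": "COUNT(1)",  # pick a single matching instance
            }
        },
        # Actions: the fault to inject, wired to the target defined above.
        "actions": {
            "terminateInstance": {
                "actionId": "aws:ec2:terminate-instances",
                "targets": {"Instances": "stagingInstances"},
            }
        },
        # Stop condition: halt the experiment if this alarm goes into ALARM.
        "stopConditions": [
            {"source": "aws:cloudwatch:alarm", "value": alarm_arn}
        ],
        "roleArn": role_arn,
        "tags": {"Name": "staging-terminate-one-instance"},
    }


if __name__ == "__main__":
    template = build_experiment_template(
        role_arn="arn:aws:iam::111122223333:role/ExampleFisRole",
        alarm_arn=("arn:aws:cloudwatch:us-east-1:111122223333:"
                   "alarm:ExampleHighErrorRate"),
    )
    print(json.dumps(template, indent=2))
```

One way to use the output is to save it to a file such as `template.json` (a hypothetical name) and pass it to the CLI with `aws fis create-experiment-template --cli-input-json file://template.json`. Restricting the target by an environment tag is one simple way to keep a first run confined to staging.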
Security & Permissions
Leverage AWS Identity and Access Management (IAM) to grant granular permissions:
- `fis:CreateExperimentTemplate`
- `fis:StartExperiment`
- `fis:StopExperiment`
- `cloudwatch:DescribeAlarms`
- `ec2:TerminateInstances`
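The permissions listed above could be collected into a single IAM policy document. The sketch below only assembles and prints the JSON. The helper name `build_policy` and the wildcard `Resource` are assumptions made for illustration. A real setup would likely scope resources to specific ARNs and may split permissions between the operator and the role that AWS FIS assumes.

```python
import json

# Actions taken from the permissions list in this section.
FIS_ACTIONS = [
    "fis:CreateExperimentTemplate",
    "fis:StartExperiment",
    "fis:StopExperiment",
    "cloudwatch:DescribeAlarms",
    "ec2:TerminateInstances",
]


def build_policy(actions, resource="*"):
    """Return an IAM policy document allowing the given actions.

    resource="*" is a placeholder; narrow it to specific ARNs in practice.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(FIS_ACTIONS), indent=2))
```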