Components of the FIS Experiment
| Component | Description |
|---|---|
| Given | We have an application running on EC2 instances spread across multiple Availability Zones, all managed by an Auto Scaling Group. |
| Hypothesis | If we terminate one EC2 instance, the Auto Scaling Group will launch a new instance, and the application will continue serving traffic without interruption. |
We use AWS Fault Injection Service (FIS, formerly AWS Fault Injection Simulator) to safely inject failures and test application resilience. Make sure your IAM role has the required permissions to execute FIS experiments.
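
As a rough illustration of the permissions involved, the sketch below builds two IAM policy documents as plain Python dictionaries. It assumes the role in question is the experiment role that FIS itself assumes; the function names are hypothetical, and the permission set is a plausible minimum for terminating instances with a CloudWatch stop condition, not an authoritative list. Check it against the current FIS documentation before use.

```python
import json


def fis_trust_policy() -> dict:
    """Trust policy allowing the FIS service to assume the experiment role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "fis.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def fis_terminate_permissions() -> dict:
    """Assumed minimal permissions for terminating EC2 instances (illustrative)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets FIS terminate the selected instance.
                "Effect": "Allow",
                "Action": "ec2:TerminateInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
            },
            {
                # Lets FIS resolve targets and evaluate alarm-based stop conditions.
                "Effect": "Allow",
                "Action": ["ec2:DescribeInstances", "cloudwatch:DescribeAlarms"],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(fis_trust_policy(), indent=2))
    print(json.dumps(fis_terminate_permissions(), indent=2))
```

In practice you would likely narrow the `Resource` entries to the staging account and region rather than using wildcards.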
Experiment Steps
- Create FIS Experiment Template
  Define the target resources (the ASG) and select the `aws:ec2:terminate-instances` action.
- Specify Targets and Actions
  - Target: EC2 instances belonging to your ASG
  - Action: Terminate one randomly selected instance
- Set Stop Conditions
  Monitor CloudWatch alarms (e.g., high error rates or latency). If any alarm triggers, FIS will automatically stop the experiment.
- Run and Observe
  Execute the FIS experiment and watch the ASG replace the terminated instance.
- Validate Outcome
  Confirm that the new EC2 instance passes health checks and that no user-facing errors occur.
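
The steps above can be sketched with boto3 roughly as follows. This is a minimal sketch, not the definitive setup: the function names, the target name `asgInstances`, and the action name `terminateOne` are hypothetical, and it assumes the instances carry the `aws:autoscaling:groupName` tag that Auto Scaling applies to instances it launches. The template builder is pure Python; only `run_experiment` talks to AWS.

```python
import time
import uuid


def build_terminate_template(asg_name: str, role_arn: str, alarm_arn: str) -> dict:
    """Build keyword arguments for the FIS create_experiment_template call."""
    return {
        "clientToken": str(uuid.uuid4()),
        "description": f"Terminate one instance in ASG {asg_name}",
        "roleArn": role_arn,
        "targets": {
            "asgInstances": {  # hypothetical target name
                "resourceType": "aws:ec2:instance",
                # Assumption: select instances by the tag Auto Scaling adds.
                "resourceTags": {"aws:autoscaling:groupName": asg_name},
                "filters": [{"path": "State.Name", "values": ["running"]}],
                # COUNT(1) picks one matching instance at random.
                "selectionMode": "COUNT(1)",
            }
        },
        "actions": {
            "terminateOne": {  # hypothetical action name
                "actionId": "aws:ec2:terminate-instances",
                "targets": {"Instances": "asgInstances"},
            }
        },
        # FIS stops the experiment if this alarm goes into ALARM.
        "stopConditions": [{"source": "aws:cloudwatch:alarm", "value": alarm_arn}],
    }


def run_experiment(template: dict, poll_seconds: int = 10) -> str:
    """Create the template, start the experiment, and return its final status."""
    import boto3  # imported lazily so the builder works without AWS access

    fis = boto3.client("fis")
    created = fis.create_experiment_template(**template)
    template_id = created["experimentTemplate"]["id"]
    started = fis.start_experiment(
        clientToken=str(uuid.uuid4()), experimentTemplateId=template_id
    )
    experiment_id = started["experiment"]["id"]
    in_progress = {"pending", "initiating", "running", "stopping"}
    while True:
        status = fis.get_experiment(id=experiment_id)["experiment"]["state"]["status"]
        if status not in in_progress:
            return status
        time.sleep(poll_seconds)
```

Keeping the template as a plain dictionary makes it easy to review or unit test before anything is terminated; `run_experiment` should only be pointed at a staging account.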
Always run chaos experiments in a staging or non-production environment first. Verify that your CloudWatch alarms and Auto Scaling health checks are correctly configured to avoid unintended downtime.
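
One way to make that verification repeatable is a small pre-flight check, sketched below. The function names are hypothetical. The functions operate on the response dictionaries returned by the boto3 calls `describe_alarms` (CloudWatch) and `describe_auto_scaling_groups` (Auto Scaling). The rule that at least two instances must be healthy is an assumption chosen so that terminating one still leaves capacity; adjust it to your own requirements.

```python
def alarms_ready(describe_alarms_response: dict) -> bool:
    """True if at least one metric alarm exists and all of them are in OK state."""
    alarms = describe_alarms_response.get("MetricAlarms", [])
    return bool(alarms) and all(a.get("StateValue") == "OK" for a in alarms)


def asg_ready(describe_asg_response: dict, min_healthy: int = 2) -> bool:
    """True if the first ASG in the response has enough healthy, in-service instances."""
    groups = describe_asg_response.get("AutoScalingGroups", [])
    if not groups:
        return False
    group = groups[0]
    healthy = [
        i
        for i in group.get("Instances", [])
        if i.get("HealthStatus") == "Healthy" and i.get("LifecycleState") == "InService"
    ]
    # Assumption: require the desired capacity to be met and at least
    # min_healthy instances, so losing one does not take the service down.
    return len(healthy) >= max(min_healthy, group.get("DesiredCapacity", 0))
```

The same `asg_ready` check can be run again after the experiment to confirm that the replacement instance has come into service.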