How to Plan Your Experiment Part 2

6. Create Your Hypothesis
7. Design the Experiment
8. Run the Experiment
9. Conduct a Post-Mortem
Links and References

In Part 1, you defined objectives, selected workloads, and established a performance baseline. This part covers the remaining steps (hypothesis creation, experiment design, execution, and analysis) so you can run your game day or AWS Fault Injection Service (FIS) experiment with confidence.

6. Create Your Hypothesis

A well-defined hypothesis clarifies what you expect to happen when a fault is injected. To formulate it:

Identify the affected components: Pinpoint the services, instances, or containers targeted by your fault injection.
Describe the expected behavior: Determine how your application should respond under fault conditions.
Define success metrics: Choose key indicators, such as latency, error rate, and throughput, to validate resilience.

A precise hypothesis narrows your experiment’s scope and sets clear success criteria.
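
As a rough illustration, the three parts of a hypothesis (affected components, expected behavior, success metrics) could be captured in a small Python structure. This is only a sketch: the class name, metric names, and threshold values below are hypothetical and should be replaced with figures from your own baseline.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """What the fault hits, what should happen, and how the outcome is judged."""
    affected_components: list[str]
    expected_behavior: str
    # Upper bounds for metrics where lower is better (latency, error rate).
    max_thresholds: dict[str, float] = field(default_factory=dict)
    # Lower bounds for metrics where higher is better (throughput).
    min_thresholds: dict[str, float] = field(default_factory=dict)

    def violations(self, observed: dict[str, float]) -> list[str]:
        """Return the metrics that are missing or fall outside their bounds."""
        failed = []
        for name, limit in self.max_thresholds.items():
            if name not in observed or observed[name] > limit:
                failed.append(name)
        for name, limit in self.min_thresholds.items():
            if name not in observed or observed[name] < limit:
                failed.append(name)
        return failed

    def holds(self, observed: dict[str, float]) -> bool:
        """True when every success metric stays within its bound."""
        return not self.violations(observed)


# Hypothetical example values; take real thresholds from your baseline.
hypothesis = Hypothesis(
    affected_components=["checkout-service EC2 instances"],
    expected_behavior="Traffic shifts to healthy instances with no user-facing errors.",
    max_thresholds={"p99_latency_ms": 500.0, "error_rate_pct": 1.0},
    min_thresholds={"throughput_rps": 200.0},
)
```

Calling `hypothesis.violations(observed)` with the metrics collected during the experiment lists which success criteria were missed, which gives the post-mortem a concrete starting point.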

7. Design the Experiment

Use AWS FIS to control scope, duration, and safety checks. Configure the following:

| Configuration | Description | Example |
| --- | --- | --- |
| Target Resources | Use resource tags to limit fault injection to specific AWS resources. | Tag EC2 instances with `env=staging`. |
| Duration | Specify how long the fault remains active before automatic rollback. | `PT5M` (5 minutes) |
| Stop Conditions | Define CloudWatch alarms that abort the experiment when their thresholds are breached. | CPU > 80% for 2 minutes |

These settings help you limit blast radius and maintain control throughout your test.
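
The settings above map onto the fields of an FIS experiment template. The sketch below only builds the template as a Python dictionary; it does not call AWS. The target and action names, the description, and the ARNs in the usage note are hypothetical placeholders, and the `aws:ec2:stop-instances` action is chosen purely as an example fault.

```python
def build_experiment_template(role_arn: str, alarm_arn: str) -> dict:
    """Build a template dict for stopping one staging EC2 instance for 5 minutes."""
    return {
        "description": "Stop one staging EC2 instance for 5 minutes",
        # IAM role that FIS assumes to run the experiment.
        "roleArn": role_arn,
        "targets": {
            # Target Resources: select instances by tag, limited to one instance.
            "staging-instances": {
                "resourceType": "aws:ec2:instance",
                "resourceTags": {"env": "staging"},
                "selectionMode": "COUNT(1)",
            }
        },
        "actions": {
            "stop-instances": {
                "actionId": "aws:ec2:stop-instances",
                # Duration: ISO 8601 value; instances restart after 5 minutes.
                "parameters": {"startInstancesAfterDuration": "PT5M"},
                "targets": {"Instances": "staging-instances"},
            }
        },
        # Stop Conditions: a CloudWatch alarm that halts the experiment.
        "stopConditions": [
            {"source": "aws:cloudwatch:alarm", "value": alarm_arn}
        ],
        "tags": {"env": "staging"},
    }
```

Assuming `boto3` is installed and credentials are configured, the resulting dictionary could be passed to `boto3.client("fis").create_experiment_template(**template)`. The alarm itself (for example, CPU above 80% for 2 minutes) has to exist in CloudWatch beforehand, and its ARN is what goes into `alarm_arn`.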

8. Run the Experiment

Start in lower environments: Validate your hypothesis in development or staging before touching production.

Note: Always begin in a non-production account or VPC to avoid unintended impact.

Validate resilience: Monitor your application as the fault is injected. Check dashboards and alerts to confirm that behavior aligns with your hypothesis.
Promote to production: Once the hypothesis is confirmed, rerun the experiment against production workloads with the same configuration.
Mark success: A successful run demonstrates that your architecture can withstand the injected fault without violating SLAs.
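
The promotion order described above (lower environments first, production only after the hypothesis holds) can be sketched as a small gating loop. The function name `run_staged` and the `run_experiment` callback are hypothetical; in practice the callback would start the experiment in the given environment and report whether the hypothesis held.

```python
from typing import Callable


def run_staged(
    environments: list[str],
    run_experiment: Callable[[str], bool],
) -> dict[str, bool]:
    """Run the experiment in each environment in order, stopping at the first failure.

    Environments after a failed one are never touched, so production is
    only reached when every lower environment has passed.
    """
    results: dict[str, bool] = {}
    for env in environments:
        passed = run_experiment(env)
        results[env] = passed
        if not passed:
            break  # Do not promote a failing experiment any further.
    return results
```

For example, `run_staged(["staging", "production"], runner)` returns a result for `production` only when the staging run passed.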

9. Conduct a Post-Mortem

A structured post-mortem transforms insights into improvements:

| Step | Action |
| --- | --- |
| Analyze Impact | Review logs, metrics, traces, and user experience during the experiment. |
| Blameless Review | Host a session focused on learning, not finger-pointing. |
| Document Findings | Update runbooks, architecture diagrams, and automation scripts based on lessons learned. |
| CI/CD Integration | Automate FIS experiments in your CI/CD pipeline to continuously validate resilience. |
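
For the CI/CD Integration step, one possible approach is a pipeline stage that turns the final FIS experiment status into a process exit code. The sketch below is an assumption about how such a gate might look, not a prescribed integration: the function name and the exit-code convention are hypothetical, while the status strings are those FIS reports for an experiment's state.

```python
# Status values FIS reports for an experiment's state.
PASSING_STATES = {"completed"}
IN_PROGRESS_STATES = {"pending", "initiating", "running", "stopping"}


def exit_code_for(status: str) -> int:
    """Map an experiment status to a CI exit code.

    0 = experiment completed, 2 = still in progress (poll again),
    1 = anything else ("stopped", "failed", "cancelled", or unknown),
    which should fail the pipeline stage.
    """
    if status in PASSING_STATES:
        return 0
    if status in IN_PROGRESS_STATES:
        return 2
    return 1
```

Assuming `boto3` is available, the status could be read with `boto3.client("fis").get_experiment(id=experiment_id)["experiment"]["state"]["status"]` and the stage ended with `sys.exit(exit_code_for(status))`. Treating `stopped` as a failure is a deliberate choice here, since a tripped stop condition usually means the hypothesis did not hold.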

Maintain a blameless culture in your post-mortems to encourage transparent learning and innovation.

The image outlines an eight-step process for planning an experiment, including defining objectives, choosing workloads, and conducting a postmortem. It also highlights the importance of analyzing impact and addressing issues.

Links and References

How to Plan Your Experiment Part 1

Establishing Steady State Metrics Using CloudWatch RUM and X-Ray
