Performance Optimization Concepts with Various AWS Services

Welcome, Solutions Architects. This final installment of the “Designing for Performance” series delves into tuning performance across AWS services. In this lesson, we explore autoscaling, offloading, and serverless architectures to enhance performance as defined by the AWS Well-Architected Framework.

Autoscaling in AWS

Autoscaling automatically adjusts resource capacity based on demand. It is most commonly associated with EC2, where Auto Scaling groups add or remove instances based on load, but AWS extends this capability to many other services. For example, DynamoDB adjusts capacity behind the scenes, with changes reflected in performance metrics rather than visible infrastructure modifications.

The image is a diagram comparing autoscaling and manual scaling in AWS, showing an architecture with users, an application load balancer, and an auto-scaling group across multiple availability zones. It also includes text explaining autoscaling benefits.
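To make the idea of adjusting capacity based on demand concrete, here is a minimal sketch of a proportional scaling rule. It is a simplification for illustration, not the algorithm AWS uses internally; the function name, the CPU metric, and the 60% target are all assumptions.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Proportional rule (illustrative only): size the group so the average
    metric lands near the target, then clamp to the group's bounds."""
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    proposed = math.ceil(current * metric / target)
    return max(min_size, min(max_size, proposed))


# Four instances at 90% average CPU with a 60% target: scale out.
print(desired_capacity(4, 90.0, 60.0, min_size=2, max_size=10))
# Four instances at 15% average CPU: scale in, but never below the minimum.
print(desired_capacity(4, 15.0, 60.0, min_size=2, max_size=10))
```

The minimum and maximum bounds matter as much as the rule itself: they cap cost during a spike and keep a baseline of capacity during quiet periods.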

Keep in mind that numerous AWS compute services, including EC2, ECS, Lambda, SageMaker, and Spot Fleet, support autoscaling. Additionally, managed data services like Aurora, DynamoDB, ElastiCache, Keyspaces, Neptune, and Amazon MSK (managed Kafka) support automatic scaling. When an exam question compares manual scaling with autoscaling, prefer the option that uses the automated scaling AWS supports for that service.

The image is a table comparing AWS services in terms of console, CLI, SDK, and CloudFormation access, highlighting their support for autoscaling and manual scaling. It includes a note about AWS applications supporting automated scaling and provisioning.
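For services other than EC2, scaling is usually configured through Application Auto Scaling. The sketch below only builds the request parameters for a DynamoDB table's read capacity and does not call AWS. The table name, capacity bounds, policy name, and 70% target are hypothetical, and the approach applies to provisioned-capacity tables.

```python
def dynamodb_read_scaling_requests(table_name: str, min_rcu: int, max_rcu: int,
                                   target_utilization: float = 70.0):
    """Build the two Application Auto Scaling requests for a table's read capacity."""
    common = {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table_name}",
        "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
    }
    # Step 1: declare what may scale and within which bounds.
    register = {**common, "MinCapacity": min_rcu, "MaxCapacity": max_rcu}
    # Step 2: attach a target tracking policy to that scalable target.
    policy = {
        **common,
        "PolicyName": f"{table_name}-read-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_utilization,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
            },
        },
    }
    return register, policy


register, policy = dynamodb_read_scaling_requests("orders", min_rcu=5, max_rcu=100)
print(register["ResourceId"], policy["PolicyType"])

# With boto3 installed and credentials configured, these would be sent as:
#   client = boto3.client("application-autoscaling")
#   client.register_scalable_target(**register)
#   client.put_scaling_policy(**policy)
```

The same two-step shape (register a scalable target, then attach a policy) carries over to other services by changing the namespace, resource ID, and scalable dimension.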

Offloading and Load Distribution

Offloading shifts tasks to auxiliary systems, reducing the processing load on a primary server. A common approach pairs offloading with load balancers. Instead of routing all traffic to a single web server, Elastic Load Balancing (ELB) distributes incoming requests. In some setups, an NGINX proxy layer routes traffic before it reaches the servers, and services like S3 handle static content so that web servers can focus on dynamic content.

The image is a diagram illustrating a network architecture for performance optimization, featuring EC2 instances, NGINX proxies, and Elastic Load Balancing for distributing loads across different brands.
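The routing decision described above can be sketched in a few lines. This is an illustration of the idea rather than how ELB or NGINX is implemented: static paths go to an object store, and everything else is spread across web servers in round-robin order. The path prefixes, hostnames, and bucket URL are hypothetical.

```python
from itertools import cycle

# Assumed URL layout: anything under these prefixes is static content.
STATIC_PREFIXES = ("/static/", "/images/", "/css/", "/js/")


class Router:
    """Send static paths to an object store and spread the rest over backends."""

    def __init__(self, backends, static_origin):
        self._backends = cycle(backends)  # simple round-robin rotation
        self._static_origin = static_origin

    def route(self, path: str) -> str:
        if path.startswith(STATIC_PREFIXES):
            return self._static_origin + path  # offloaded, never reaches a web server
        return next(self._backends) + path


router = Router(
    backends=["http://web-1.internal", "http://web-2.internal"],  # hypothetical hosts
    static_origin="https://example-bucket.s3.amazonaws.com",      # hypothetical bucket
)
for path in ["/static/app.css", "/cart", "/images/logo.png", "/checkout"]:
    print(path, "->", router.route(path))
```

Of the four requests in the example, only two reach a web server, and those two land on different servers.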

Offloading is equally critical in database architectures. Traditional designs often rely on a primary instance for both reads and writes. Modern architectures frequently use read replicas to absorb the bulk of read traffic, which commonly follows an 80/20 read-to-write ratio. For instance, in an Aurora setup, the primary instance handles writes while one or more replicas serve read requests.

The image illustrates the architecture of an Amazon Aurora DB Cluster, showing a primary instance and multiple Aurora replicas across different availability zones, with data copies for offloading and load distribution.
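On the application side, read offloading comes down to choosing an endpoint per statement. The sketch below assumes a cluster that exposes one writer endpoint and one reader endpoint; both hostnames are placeholders, and the statement classification is deliberately naive.

```python
from collections import Counter

WRITER = "writer.cluster.example.internal"  # hypothetical writer endpoint
READER = "reader.cluster.example.internal"  # hypothetical reader endpoint


def endpoint_for(sql: str) -> str:
    """Send plain SELECTs to the reader endpoint and everything else to the writer."""
    statement = sql.lstrip().lower()
    # SELECT ... FOR UPDATE takes locks, so it must run on the writer.
    if statement.startswith("select") and "for update" not in statement:
        return READER
    return WRITER


# An 80/20 read-to-write workload: eight reads, two writes.
workload = ["SELECT * FROM orders WHERE id = 1"] * 8 + [
    "INSERT INTO orders (id) VALUES (2)",
    "UPDATE orders SET status = 'paid' WHERE id = 2",
]
counts = Counter(endpoint_for(sql) for sql in workload)
print("reader:", counts[READER], "writer:", counts[WRITER])
```

Because replicas can lag slightly behind the writer, reads that must see a just-committed write are usually sent to the writer as well.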

Similarly, Amazon ElastiCache supports cross-region replication through its Global Datastore feature. In a typical configuration, a primary node manages writes (and some reads), while standby or replica nodes handle additional read traffic. If the primary fails, a replica is promoted and the endpoint used for writes is updated, typically without impacting reads.

The image illustrates a design for performance involving offloading and load distribution using Amazon ElastiCache. It shows a primary (active) region with an active master and replicas, and a secondary (passive) region with a passive master and replicas, connected via an Amazon cross-region link.

Amazon DocumentDB follows a similar pattern by supporting instance replicas that help redirect read traffic, just like Aurora.

The image illustrates a design for performance focusing on offloading and load distribution in Amazon DocumentDB, showing a primary instance and replica instances across different availability zones with distributed storage volumes.

Beyond databases, offloading also applies to APIs and content delivery, where caches absorb repeated requests before they reach the backend. For example:

API Gateway: Offers caching to improve response times for read-only endpoints.
CloudFront: Distributes static content across hundreds of edge locations worldwide, decreasing the load on the origin servers.

The image is a diagram illustrating the concept of offloading and load distribution using Amazon API Gateway, showing how mobile apps, websites, and services connect through the gateway to various AWS services and endpoints. It emphasizes the role of caching beyond databases.
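The effect of such a cache on the origin can be illustrated with a small time-to-live (TTL) cache. This is a generic sketch, not the API Gateway or CloudFront implementation; the class name, the 300-second TTL, and the stand-in origin function are assumptions.

```python
import time


class TTLCache:
    """Serve repeated reads from memory until the entry's TTL expires."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch      # function that calls the origin
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Missing or expired: go to the origin and store the fresh value.
        self.misses += 1
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value


origin_calls = []


def origin(path):
    """Stand-in for a backend call; records each request it receives."""
    origin_calls.append(path)
    return f"response for {path}"


cache = TTLCache(origin, ttl_seconds=300)
for _ in range(3):
    cache.get("/products")
print("hits:", cache.hits, "misses:", cache.misses, "origin calls:", len(origin_calls))
```

Three identical requests produce one origin call. The trade-off is staleness: a longer TTL offloads more traffic but delays the point at which clients see updated data.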

Offloading improves performance by distributing workloads efficiently across various systems, ensuring that primary servers remain unburdened.

Embracing Serverless Architectures

Transitioning from a server-based model to a serverless architecture can improve performance by automating scaling at a granular level, often per request or per unit of capacity. AWS serverless services, such as Lambda, Fargate, Aurora Serverless, Redshift Serverless, and EMR Serverless, reduce the overhead associated with server provisioning and autoscaling. For variable or unpredictable workloads, serverless options typically offer better cost efficiency, good performance, and reduced operational complexity compared to traditional server-based setups.

The image compares server-based and serverless architectures, illustrating AWS components and availability zones, with a note encouraging the use of serverless for better scalability.

Serverless architectures deliver benefits including automatic scaling, minimized management effort, and improved resource utilization.
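A minimal Lambda function shows how little infrastructure code the serverless model leaves to the developer. The sketch assumes the function sits behind an API Gateway proxy integration, which is where the event and response shapes come from. The `name` parameter and the greeting are purely illustrative.

```python
import json


def handler(event, context):
    """Entry point with the (event, context) signature Lambda's Python runtime calls.

    There is no server, port, or process management here; the platform
    runs as many concurrent copies of this function as traffic requires.
    """
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}"}),
    }


# Local invocation with a hand-built event, for illustration only.
print(handler({"queryStringParameters": {"name": "architect"}}, None))
```

Because the handler is a plain function, it can be exercised locally as shown before being deployed.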

Summary

This lesson covered three essential AWS performance improvement concepts:

Autoscaling: AWS services such as EC2, DynamoDB, Aurora, DocumentDB, Amazon MSK, and Lambda support autoscaling, which keeps capacity matched to demand without manual intervention.
Offloading and Load Distribution: By delegating tasks like read operations to replicas and caches, offloading reduces the load on primary systems across both application and database layers.
Serverless Architectures: Employing serverless solutions like AWS Lambda and serverless database services minimizes operational overhead while enhancing scalability and performance.

When preparing for the AWS exam or addressing real-world scenarios, remember to prioritize solutions that incorporate autoscaling, offloading, and serverless architectures, as AWS favors these approaches for enhanced performance and operational efficiency. I’m Michael Forrester. Feel free to join the discussions in the forums or connect on Slack or Discord. I look forward to engaging with you in the next lesson.
