Resilience Architecture — AWS SysOps Administrator (SOA-C03)
Failover Routing and Data Consistency Are Different Layers
Route 53 health checks and failover routing can redirect traffic to a secondary Region quickly; how quickly depends on the health-check interval, the failure threshold, and the record TTL. Candidates often select this as a complete resilience answer. But if Aurora Global Database replication lag is nonzero at the moment of failover, the secondary Region may serve stale reads, and the secondary cluster must be promoted before it can accept writes. Traffic routing and data consistency are two different layers: the exam tests whether you've addressed both, not just the DNS path.
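One way to keep the two layers apart in a runbook is to evaluate them as separate questions. The sketch below is illustrative only: `failover_decision`, its parameters, and the RPO threshold are hypothetical names, and it assumes the lag value is read from a metric such as CloudWatch's `AuroraGlobalDBReplicationLag` (reported in milliseconds).

```python
from dataclasses import dataclass


@dataclass
class FailoverDecision:
    route_traffic: bool     # DNS layer: is the secondary endpoint healthy?
    promote_database: bool  # data layer: is the replica within the RPO?
    reason: str


def failover_decision(secondary_healthy: bool,
                      replication_lag_ms: float,
                      rpo_ms: float) -> FailoverDecision:
    """Evaluate the routing layer and the data layer independently (hypothetical helper)."""
    if not secondary_healthy:
        return FailoverDecision(False, False, "secondary endpoint failed its health check")
    if replication_lag_ms > rpo_ms:
        # Traffic could move, but promoting now would lose more data than the RPO allows.
        return FailoverDecision(True, False, "replication lag exceeds the RPO")
    return FailoverDecision(True, True, "secondary is healthy and within the RPO")


# Healthy DNS path, but the replica is 5 s behind a 1 s RPO.
decision = failover_decision(secondary_healthy=True, replication_lag_ms=5000, rpo_ms=1000)
print(decision)
```

The point of returning two flags is that a healthy DNS path says nothing about whether the data layer is ready for promotion.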
What This Pattern Tests
The exam gives availability requirements and tests whether you design the matching resilience tier. Multi-AZ deployments (RDS Multi-AZ, ECS tasks spread across AZs, ALB cross-zone load balancing) protect against the failure of a single AZ, which is sufficient for 99.9% to 99.99% SLAs. Multi-Region with Route 53 failover protects against Regional failures and is needed for 99.999% SLAs. Cell-based architecture with shuffle sharding limits blast radius, so that a failure triggered by one customer affects only a small subset of the others. The trap is designing Multi-Region for a 99.9% SLA (over-provisioning) or single-AZ for a 99.99% SLA (under-provisioning). Aurora Global Database replicates across Regions with lag that is typically under one second, but it is only needed when the SLA demands Regional failover.
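Shuffle sharding can be sketched in a few lines. This is a minimal illustration, not a production design: the function name `shuffle_shard`, the worker names, and the shard size of 2 are all assumptions made for the example. Each customer is deterministically mapped to a small pseudo-random subset of workers, so a customer that poisons its own shard only fully overlaps with customers that drew exactly the same subset.

```python
import hashlib
import random


def shuffle_shard(customer_id: str, workers: list[str], shard_size: int) -> list[str]:
    """Pick a stable pseudo-random subset of workers for one customer (hypothetical helper)."""
    # Derive a seed from the customer ID so the assignment is the same on every call.
    seed = int.from_bytes(hashlib.sha256(customer_id.encode()).digest()[:8], "big")
    return sorted(random.Random(seed).sample(workers, shard_size))


workers = [f"worker-{i}" for i in range(8)]
shard_a = shuffle_shard("customer-a", workers, 2)
shard_b = shuffle_shard("customer-b", workers, 2)

# Workers both customers depend on; an empty set means neither can affect the other.
shared = set(shard_a) & set(shard_b)
print(shard_a, shard_b, shared)
```

With 8 workers and a shard size of 2 there are 28 possible shards, so two customers picked at random rarely share both workers.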
Decision Axis
SLA target maps to resilience tier. 99.9% = Multi-AZ. 99.99% = Multi-AZ with auto-scaling. 99.999% = Multi-Region active-active.
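The mapping is easier to remember alongside the downtime each target allows per year. The sketch below works through that arithmetic and encodes the three tiers exactly as listed above; the function names are hypothetical, and the fallback for targets below 99.9% is an assumption, since this guide does not cover them.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def resilience_tier(availability_pct: float) -> str:
    """Map an SLA target to the tier named in this guide (hypothetical helper)."""
    if availability_pct >= 99.999:
        return "Multi-Region active-active"
    if availability_pct >= 99.99:
        return "Multi-AZ with auto-scaling"
    # Assumption: targets below 99.9% are not covered here and fall back to Multi-AZ.
    return "Multi-AZ"


for target in (99.9, 99.99, 99.999):
    budget = downtime_budget_minutes(target)
    print(f"{target}% -> {budget:.2f} min/year -> {resilience_tier(target)}")
```

The budgets come out to roughly 8.76 hours per year at 99.9%, about 52.6 minutes at 99.99%, and about 5.3 minutes at 99.999%, which is why only the last tier leaves no room for a manual Regional recovery.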
Decision Rules
Whether the event-delivery mechanism provides visibility-timeout-based automatic retry on consumer failure (SQS with a DLQ) or only push delivery with a limited number of retries, after which the event is dropped unless a DLQ is configured (SNS invoking Lambda directly).
Whether to buffer S3 events in an SQS queue with visibility-timeout retry semantics and a Dead-Letter Queue, or to route them through SNS push delivery, given that the dominant constraint is guaranteed at-least-once processing even when the Lambda consumer crashes mid-execution.
Idempotency must be enforced at the storage-write layer, using a DynamoDB conditional expression tied to a client-provided idempotency key, not at the queue layer using SQS FIFO deduplication. FIFO deduplication suppresses duplicate sends only within its fixed five-minute deduplication interval; it does not prevent a message that was already received from being redelivered. It therefore cannot guard against a write that reaches DynamoDB just before a Lambda timeout, after which the same message is redelivered minutes later.
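The redelivery case can be sketched without touching AWS. The example below uses an in-memory stand-in for a table, so `FakeTable`, `put_if_absent`, and `handle` are hypothetical names; with boto3 the equivalent guard would be a `put_item` call whose `ConditionExpression` is `attribute_not_exists(...)` on the key attribute, which fails with `ConditionalCheckFailedException` when the item already exists.

```python
class ConditionalCheckFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""


class FakeTable:
    """In-memory stand-in for a DynamoDB table keyed by idempotency key."""

    def __init__(self):
        self.items = {}

    def put_if_absent(self, key, item):
        # Mirrors a put guarded by attribute_not_exists on the key attribute.
        if key in self.items:
            raise ConditionalCheckFailed(key)
        self.items[key] = item


def handle(message, table):
    """Process one queue message; safe to call again with the same message."""
    try:
        table.put_if_absent(message["idempotency_key"], message["body"])
        return "written"
    except ConditionalCheckFailed:
        # The earlier attempt already wrote this item, so acknowledge and move on.
        return "duplicate-skipped"


table = FakeTable()
msg = {"idempotency_key": "order-123", "body": {"total": 42}}

first = handle(msg, table)   # write lands, then the consumer times out before deleting the message
second = handle(msg, table)  # visibility timeout expires and the same message is redelivered
print(first, second, len(table.items))
```

Because the guard lives at the write itself, it holds no matter how long after the first attempt the redelivery arrives.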
Domain Coverage
Difficulty Breakdown