Operational Complexity Underestimation — AWS Solutions Architect (SAA-C03)
The answer is correct but operationally expensive. The exam prefers managed services over self-managed when both meet functional requirements.
Flexible architectures carry hidden management weight
Look for wording like "small team," "minimal operational overhead," or "no custom code." These are signals that the exam is filtering for managed services and low-touch patterns. Candidates select distributed or self-managed options because they look powerful or flexible. The exam uses operational burden as a constraint, not a footnote — every added component, whether a NAT Gateway, a self-managed broker, or a custom orchestration layer, adds a failure mode the scenario explicitly refuses to carry.
The Scenario
A team of 3 developers needs to run a containerized application with auto-scaling. You recommend Kubernetes on EC2 with kops for cluster management. The correct answer is ECS on Fargate. The scenario said "small team" and "minimize operational burden." Self-managed Kubernetes requires managing the control plane (etcd backups, API server upgrades, certificate rotation), node group updates, CNI plugin configuration, and ingress controller maintenance. ECS on Fargate eliminates all of that — AWS manages compute, scaling, and patching. The trade-off is less customization, but the scenario never asked for Kubernetes-specific features like custom operators or CRDs.
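To make the managed-service choice concrete, here is a minimal sketch of an ECS Fargate task definition, written as the Python dict you would pass to boto3's `register_task_definition`. The family name, image URI, sizes, and port are illustrative assumptions, not details from the scenario:

```python
# Illustrative ECS Fargate task definition (names, image, and sizes are
# invented for the sketch). With Fargate there are no AMIs to patch, no
# nodes to upgrade, and no control plane to operate.
task_definition = {
    "family": "web-app",                      # hypothetical family name
    "requiresCompatibilities": ["FARGATE"],   # Fargate-only task
    "networkMode": "awsvpc",                  # required for Fargate tasks
    "cpu": "512",                             # 0.5 vCPU, in Fargate CPU units
    "memory": "1024",                         # 1 GiB; must pair validly with cpu
    "containerDefinitions": [
        {
            "name": "web",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/web-app:latest",
            "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
            "essential": True,
        }
    ],
}
```

Note what is absent: no AMI ID, no instance type, no node group. That absence is exactly the operational burden the scenario asked to shed.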
How to Spot It
- "Minimize operational overhead," "small team," "reduce management burden" — these phrases are signals to choose the most managed option. ECS Fargate over EKS self-managed nodes. Aurora over self-managed PostgreSQL on EC2. Lambda over always-on containers for event-driven workloads.
- EKS managed node groups reduce operational burden compared to self-managed nodes, but you still manage node AMI updates, pod scaling, and cluster upgrades. EKS with Fargate eliminates node management entirely but loses DaemonSet support and some storage options. The exam tests these operational trade-offs at each level.
- Self-managed options (EC2, EKS self-managed, self-hosted databases, self-managed Kafka) are only correct when the scenario explicitly requires a capability that managed services cannot provide — custom kernel modules, specific OS versions, or unsupported database engines.
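The heuristic in the bullets above can be sketched as a tiny decision function. This is a study aid, not an AWS API; the flag names are illustrative assumptions:

```python
def pick_compute(scenario_flags: set) -> str:
    """Toy encoding of the exam heuristic: default to the most managed
    option, and only fall back to self-managed infrastructure when the
    scenario names a capability managed services cannot provide."""
    # Capabilities that genuinely force self-managed EC2 (illustrative list).
    self_managed_only = {"custom_kernel_module", "specific_os_version",
                         "unsupported_db_engine"}
    if scenario_flags & self_managed_only:
        return "self-managed on EC2"
    if "event_driven" in scenario_flags:
        return "Lambda"
    if "containerized" in scenario_flags:
        return "ECS on Fargate"
    return "most managed service that fits"
```

The point of the sketch is the ordering: the self-managed branch fires only on an explicit capability requirement, never on "looks more flexible."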
Decision Rules
Configure the Auto Scaling group health check type as ELB rather than EC2, so that application-layer failures detected by the load balancer trigger automatic instance replacement instead of leaving degraded instances in service.
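This rule comes down to one parameter on the Auto Scaling group. The dict below mirrors the keyword arguments boto3's `create_auto_scaling_group` accepts; the group name, sizes, and target group ARN are made up:

```python
# Hypothetical Auto Scaling group settings (names and ARN are placeholders).
# HealthCheckType "ELB" makes the ASG replace instances that fail the load
# balancer's application-layer health check, not just EC2 status checks.
asg_params = {
    "AutoScalingGroupName": "web-asg",
    "MinSize": 2,
    "MaxSize": 6,
    "TargetGroupARNs": [
        "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web/abc123"
    ],
    "HealthCheckType": "ELB",        # default is "EC2" (status checks only)
    "HealthCheckGracePeriod": 300,   # seconds to let the app boot before checks count
}
```

With the default "EC2" type, an instance whose process has hung but whose hardware is healthy stays in service indefinitely; "ELB" closes that gap.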
When data volume is large, transformation logic is custom and iterative, and a hard SLA window is specified, prefer a cluster platform such as EMR that exposes per-executor Spark configuration over a serverless ETL service such as AWS Glue that abstracts resource allocation.
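As a sketch of what "per-executor Spark configuration" means in practice, these are spark-submit style settings you can tune per job on a cluster platform but not on a service that allocates resources for you. All values are illustrative:

```python
# Illustrative per-executor Spark tuning knobs (values invented for the
# sketch). A cluster platform lets you pin these per job; a fully
# serverless ETL service decides resource allocation on your behalf.
spark_conf = {
    "spark.executor.instances": "20",
    "spark.executor.cores": "4",
    "spark.executor.memory": "8g",
    "spark.sql.shuffle.partitions": "320",        # sized to instances * cores * 4
    "spark.dynamicAllocation.enabled": "false",   # fixed sizing for a hard SLA window
}
```

Disabling dynamic allocation in exchange for predictable completion time is exactly the kind of control a hard SLA window can justify, and it is only available when the platform exposes these knobs.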
Grant the EC2 fleet an IAM instance profile so that STS issues automatically rotated temporary credentials; embedding long-term IAM user access keys in the launch template trades away key-rotation manageability for a superficially simpler initial setup.
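A sketch of the right side of that trade, as launch-template data (the AMI ID and profile name are made-up examples). With an instance profile attached, code on the instance obtains short-lived STS credentials automatically, and no key material is stored anywhere:

```python
# Hypothetical launch template data: attach an instance profile instead of
# baking IAM user access keys into user data or config files.
launch_template_data = {
    "ImageId": "ami-0abc1234567890def",                    # placeholder AMI
    "InstanceType": "m6i.large",
    "IamInstanceProfile": {"Name": "app-fleet-profile"},   # invented profile name
    # Deliberately absent: AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY anywhere.
}
```

The operational payoff is that rotation becomes AWS's problem: the SDKs refresh temporary credentials transparently, so there is no key-rotation runbook at all.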
Choose EC2-based compute over Lambda when the batch job execution window exceeds Lambda's 15-minute hard timeout, because selecting Lambda forces job decomposition and orchestration retry logic that adds more operational complexity than right-sizing EC2 Spot instances with Auto Scaling.
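The 15-minute boundary is a hard service constant worth encoding. A toy check, assuming you can estimate the job's runtime:

```python
LAMBDA_MAX_MINUTES = 15  # hard limit on a single Lambda invocation

def batch_compute_choice(estimated_runtime_minutes: float) -> str:
    """Toy rule: a job that cannot finish inside one Lambda invocation
    should run on EC2 (e.g. Spot with Auto Scaling) rather than being
    decomposed into chained invocations with retry orchestration."""
    if estimated_runtime_minutes < LAMBDA_MAX_MINUTES:
        return "Lambda"
    return "EC2 Spot with Auto Scaling"
```

The rule deliberately ignores cost: once the timeout is exceeded, forcing Lambda means adding decomposition and orchestration machinery, which is the exact operational complexity the pattern warns against.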
At sustained thousands of requests per second running 24 hours a day, compare Lambda's per-invocation plus GB-second billing against Fargate's per-vCPU and memory-hour billing. Lambda's cost model inverts when concurrency is perpetually high: with no idle window, Fargate's hourly pricing typically yields the lower total monthly compute cost, and it avoids the concurrency, timeout, and provisioned-concurrency management burden that high-throughput Lambda requires.
For workloads with low average invocation frequency and extended idle windows, Lambda's per-invocation model beats Fargate's vCPU/memory-hour model, because a Fargate task accrues cost while idle unless scale-to-zero is engineered on top of it, operational complexity the scenario does not justify.
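The cost inversion in the rules above can be checked with back-of-envelope arithmetic. The prices below are rounded us-east-1 list figures, and the workload shape (1,000 req/s, 100 ms at 1 GB, absorbed by eight 1 vCPU / 2 GB Fargate tasks) is an assumption made for the sketch:

```python
# Back-of-envelope Lambda vs Fargate monthly cost at sustained load.
# All prices and workload numbers are illustrative assumptions.
SECONDS_PER_MONTH = 30 * 24 * 3600   # 2,592,000
HOURS_PER_MONTH = 30 * 24            # 720

# Sustained workload: 1,000 req/s, 100 ms per request, 1 GB memory.
req_per_s, duration_s, mem_gb = 1_000, 0.1, 1.0

# Lambda: ~$0.20 per 1M requests + ~$0.0000166667 per GB-second.
requests = req_per_s * SECONDS_PER_MONTH
gb_seconds = requests * duration_s * mem_gb
lambda_cost = requests / 1e6 * 0.20 + gb_seconds * 0.0000166667

# Fargate: ~$0.04048 per vCPU-hour + ~$0.004445 per GB-hour,
# assuming eight 1 vCPU / 2 GB tasks absorb the same load.
tasks, vcpu, task_mem_gb = 8, 1, 2
fargate_cost = tasks * HOURS_PER_MONTH * (vcpu * 0.04048 + task_mem_gb * 0.004445)

print(f"Lambda:  ~${lambda_cost:,.0f}/month")
print(f"Fargate: ~${fargate_cost:,.0f}/month")
```

Under these assumptions Lambda lands in the thousands of dollars per month while Fargate stays in the hundreds, an order-of-magnitude gap. Flip the workload to a few invocations per hour and the comparison reverses, which is exactly the idle-window rule above.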