AWS · SAP-C02

Operational Complexity Underestimation — AWS Solutions Architect Pro (SAP-C02)

The answer is correct but operationally expensive. The exam prefers managed services over self-managed when both meet functional requirements.

Small Ops Team Wording Rules Out Custom Orchestration

Custom EC2-based orchestration with hand-rolled retry logic and Lambda fan-out is technically valid. The trap is ignoring the operational constraint embedded in the scenario setup. Phrases like "small platform team," "no dedicated DevOps engineers," or "reduce operational burden" reframe the decision entirely. An architecture that requires ongoing maintenance, custom state tracking, and bespoke failure handling fails those constraints regardless of its technical capability. When the scenario explicitly limits the team available to run and support the system, the answer replaces custom coordination with AWS Step Functions or EventBridge Pipes.

39% of exam questions affected (77 of 200)

The Scenario

A team of 3 developers needs to run a containerized application with auto-scaling. You recommend Kubernetes on EC2 with kops for cluster management. The correct answer is ECS on Fargate. The scenario said "small team" and "minimize operational burden." Self-managed Kubernetes requires managing the control plane (etcd backups, API server upgrades, certificate rotation), node group updates, CNI plugin configuration, and ingress controller maintenance. ECS on Fargate eliminates all of that — AWS manages compute, scaling, and patching. The trade-off is less customization, but the scenario never asked for Kubernetes-specific features like custom operators or CRDs.
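
To make the managed end of that trade-off concrete, here is a minimal sketch of a Fargate task definition in the shape boto3's `register_task_definition` accepts. The family name, image URI, and account ID are placeholders; the CPU/memory pairing is one of Fargate's valid combinations.

```python
# Illustrative ECS-on-Fargate task definition (the dict shape boto3's
# ecs.register_task_definition takes). Names and the image URI are made up.
fargate_task = {
    "family": "web-app",                      # placeholder service name
    "requiresCompatibilities": ["FARGATE"],   # no EC2 instances to manage
    "networkMode": "awsvpc",                  # required network mode on Fargate
    "cpu": "512",                             # Fargate sizes CPU/memory per task,
    "memory": "1024",                         # not per cluster node
    "containerDefinitions": [{
        "name": "web",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/web-app:latest",
        "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
    }],
}

# Everything the self-managed path adds on top of this (etcd backups,
# control-plane upgrades, node AMI patching, CNI configuration) has no
# equivalent here: AWS owns both the control plane and the compute.
```

Note that the entire operational surface the scenario penalizes lives outside this file; the task definition is all three developers have to maintain.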

How to Spot It

  • "Minimize operational overhead," "small team," "reduce management burden" — these phrases are signals to choose the most managed option. ECS Fargate over EKS self-managed nodes. Aurora over self-managed PostgreSQL on EC2. Lambda over always-on containers for event-driven workloads.
  • EKS managed node groups reduce operational burden compared to self-managed nodes, but you still manage node AMI updates, pod scaling, and cluster upgrades. EKS with Fargate eliminates node management entirely but loses DaemonSet support and some storage options. The exam tests these operational trade-offs at each level.
  • Self-managed options (EC2, EKS self-managed, self-hosted databases, self-managed Kafka) are only correct when the scenario explicitly requires a capability that managed services cannot provide — custom kernel modules, specific OS versions, or unsupported database engines.
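
The signal phrases above can be collapsed into a lookup table. The mapping below is a study aid that restates the bullets, not an official AWS taxonomy; the constraint strings are invented keys.

```python
# Study-aid mapping from scenario wording to the "most managed" default.
# Pairings restate the bullets above; none of this is official AWS guidance.
MANAGED_DEFAULTS = {
    "containers, small team": "ECS on Fargate (over self-managed EKS nodes)",
    "relational database, reduce management burden": "Aurora (over PostgreSQL on EC2)",
    "event-driven, minimize operational overhead": "Lambda (over always-on containers)",
}

def pick_default(constraint: str) -> str:
    """Return the managed-first default for a scenario constraint phrase."""
    return MANAGED_DEFAULTS[constraint]
```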

Decision Rules

Whether AWS Control Tower Account Factory with AFT customization hooks satisfies every stated constraint (encryption guardrail, centralized logging, baseline VPC, scalability) and should be chosen over a custom-built vending pipeline — selecting the managed service eliminates ongoing orchestration maintenance while preserving full guardrail and baseline fidelity.

AWS Control Tower · AWS Organizations · AWS Service Catalog

Whether managed cross-Region DR services (AWS Elastic Disaster Recovery for EC2, Aurora Global Database for the database tier, Route 53 health-check failover) satisfy the stated RPO/RTO targets with sustainable operational overhead. The custom alternative (scheduled cross-Region AMI copies, Aurora snapshot replication, and Lambda/Systems Manager failover orchestration) appears equivalently capable but hides AMI-copy latency variability, snapshot-restore duration, and ongoing runbook maintenance that collectively make RTO compliance unreliable.

AWS Elastic Disaster Recovery · Amazon Aurora · Amazon Route 53
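
The hidden-latency argument becomes concrete with rough arithmetic. Every duration below is an illustrative assumption for the comparison, not a measured AWS figure; restore time in particular varies with database size.

```python
# Rough RTO budget: custom snapshot/AMI failover vs managed services.
# All minute values are illustrative assumptions, not AWS measurements.
custom_failover_steps = {
    "detect failure and page on-call": 10,
    "restore Aurora from cross-Region snapshot": 25,  # varies with DB size
    "launch EC2 from copied AMIs": 10,
    "run failover runbook and DNS cutover": 10,
}
managed_failover_steps = {
    "Route 53 health-check detection": 3,
    "Aurora Global Database promotion": 2,   # typically around a minute, padded
    "Elastic Disaster Recovery launch": 15,
}

custom_rto = sum(custom_failover_steps.values())    # 55 min
managed_rto = sum(managed_failover_steps.values())  # 20 min

# Against a 30-minute RTO target, only the managed path fits, and the
# custom path's largest term is also its most variable one.
assert custom_rto > 30 and managed_rto <= 30
```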

Choose the caching tier—purpose-built database-integrated accelerator (DAX) paired with edge caching (CloudFront) versus a general-purpose in-memory cluster (ElastiCache) inserted between the application and DynamoDB—that satisfies the latency SLA without shifting cache invalidation complexity onto the application team.

Amazon DynamoDB · Amazon ElastiCache · Amazon CloudFront

Whether to satisfy continuous PCI-DSS resource-configuration compliance evidence using managed evaluation controls (Config conformance packs delegated via Organizations + Security Hub PCI-DSS standard + Audit Manager automated evidence mapping) versus building a custom compliance pipeline on top of CloudTrail log aggregation and ad-hoc query/Lambda orchestration.

AWS Config · AWS Security Hub · AWS Audit Manager

Whether a customer-managed KMS CMK in the standard KMS software keystore or a CloudHSM-backed KMS custom key store is the minimum compliant configuration when the mandate explicitly requires FIPS 140-2 Level 3 HSM custody of key material.

AWS Key Management Service (AWS KMS) · AWS CloudHSM · Amazon S3
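
The distinction shows up directly in the key-creation parameters. The dicts below sketch the shape of boto3's `kms.create_key` keyword arguments; the custom key store ID is a placeholder.

```python
# Two customer-managed KMS key configurations (shape of kms.create_key
# kwargs). The CustomKeyStoreId is a placeholder, not a real store.
standard_kms_key = {
    "KeySpec": "SYMMETRIC_DEFAULT",
    "Origin": "AWS_KMS",              # key material in KMS's own HSM fleet
}
cloudhsm_backed_key = {
    "KeySpec": "SYMMETRIC_DEFAULT",
    "Origin": "AWS_CLOUDHSM",         # key material in your CloudHSM cluster
    "CustomKeyStoreId": "cks-1234567890abcdef0",  # placeholder ID
}

# When the mandate names FIPS 140-2 Level 3 HSM custody explicitly, the
# Origin/CustomKeyStoreId pair is what makes the configuration compliant;
# everything else about the two keys is identical to callers.
```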

Whether to scale read capacity horizontally by adding Aurora read replicas behind the native reader endpoint combined with a managed ElastiCache tier, versus scaling the Aurora writer instance vertically or deploying self-managed caching and connection-pooling infrastructure that shifts operational and code-change burden onto the application layer.

Amazon Aurora · Amazon ElastiCache
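
The "shifts burden onto the application layer" cost is exactly the cache-aside code below, which the application team must own and keep correct when a general-purpose cache sits in front of the database. A plain dict stands in for ElastiCache and a stub function for a query against Aurora's reader endpoint.

```python
# Cache-aside read path the application owns with a general-purpose
# cache. A dict stands in for ElastiCache; fetch_from_reader stands in
# for a SQL query against Aurora's reader endpoint.
cache: dict[str, str] = {}

def fetch_from_reader(key: str) -> str:
    return f"row-for-{key}"          # stand-in for the real query

def get(key: str) -> str:
    if key in cache:                 # hit: no database round trip
        return cache[key]
    value = fetch_from_reader(key)   # miss: a read replica does the work
    cache[key] = value               # the application owns population...
    return value

def update(key: str, new_value: str) -> None:
    # ...and invalidation: write new_value through the writer endpoint
    # (omitted here), then evict. Forget this eviction and every reader
    # sees stale data until the TTL expires.
    cache.pop(key, None)
```

A purpose-built accelerator like DAX removes this code entirely: it is a read-through/write-through cache that speaks the DynamoDB API, so the application sees a single client and no invalidation logic.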

Heterogeneous engine migration (Oracle → Aurora PostgreSQL) requires SCT to automate schema and stored-procedure conversion before DMS performs ongoing replication. Omitting SCT in favor of manual or custom conversion scripts satisfies the data-movement requirement, but it adds unquantified operational complexity that stretches the effective cutover window beyond the stated constraint and violates an explicit compliance directive.

AWS Database Migration Service (AWS DMS) · AWS Schema Conversion Tool (AWS SCT) · AWS Snow Family

Whether to centralize multi-VPC-to-on-premises connectivity through a Transit Gateway attached to a Direct Connect gateway—providing transitive routing and a single control plane—or to provision individual Direct Connect private VIFs per VPC (even when automated), which creates linear management burden and does not support transitive routing at scale.

AWS Direct Connect · AWS Transit Gateway · Amazon Virtual Private Cloud (Amazon VPC)
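
The "linear management burden" and "no transitive routing" claims can be quantified for the VPC-to-VPC half of the problem: without a transitive hub, any-to-any connectivity needs a full mesh of peerings, which grows quadratically.

```python
# Links to create and maintain for any-to-any connectivity among n VPCs.
def full_mesh_peerings(n: int) -> int:
    # One VPC peering per pair; peering is not transitive, so every
    # pair that must communicate needs its own link and route entries.
    return n * (n - 1) // 2

def tgw_attachments(n: int) -> int:
    # One Transit Gateway attachment per VPC; the TGW route table
    # provides transitive reachability from a single control plane.
    return n

# At 20 VPCs: 190 peerings to maintain versus 20 attachments.
assert full_mesh_peerings(20) == 190
assert tgw_attachments(20) == 20
```

The same shape applies on the hybrid side: per-VPC private VIFs multiply with the estate, while a Transit Gateway behind one Direct Connect gateway keeps the on-premises association constant.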

Choose between AWS Control Tower's managed landing zone with Account Factory versus a manually assembled AWS Organizations plus SCP plus AWS Config conformance pack stack, where the deciding constraint is minimizing ongoing operational overhead for guardrail maintenance, drift detection, and per-account baselining as the account estate grows over 18 months.

AWS Control Tower · AWS Organizations · AWS IAM Identity Center

Attach a deny-all-non-approved-regions SCP at the OU level in AWS Organizations rather than deploying per-account Config rules, IAM permission boundaries, or enabling Control Tower solely to gain guardrail syntax—because only an OU-scoped SCP provides a preventive ceiling that is centrally authored once, inherited automatically by all present and future member accounts, and cannot be overridden by account-level IAM.

AWS Organizations · AWS Config · AWS Control Tower
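
The OU-scoped preventive ceiling looks like the policy below, a condensed version of the region-restriction pattern AWS documents for SCPs. The approved Regions and the exempted global services are illustrative; a real policy would be tuned to the organization's service usage.

```python
import json

# Region-restriction SCP sketch: deny every action outside the approved
# Regions, exempting global services whose API calls resolve to
# us-east-1. Approved Regions and the NotAction list are illustrative.
deny_unapproved_regions = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyAllOutsideApprovedRegions",
        "Effect": "Deny",
        "NotAction": [            # global-service exemptions (example set)
            "iam:*",
            "organizations:*",
            "route53:*",
            "support:*",
        ],
        "Resource": "*",
        "Condition": {
            "StringNotEquals": {
                "aws:RequestedRegion": ["eu-central-1", "eu-west-1"]
            }
        },
    }],
}

policy_json = json.dumps(deny_unapproved_regions)
```

Attached once at the OU, this denies non-approved Regions for every present and future member account; no account-level IAM policy can grant past it, which is the property the per-account Config-rule alternative lacks.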

Whether the fully managed observability stack (Amazon Managed Service for Prometheus + CloudWatch composite alarms with anomaly detection + AWS X-Ray) satisfies the sub-five-minute MTTD target at lower and sustainable operational overhead than a self-managed Prometheus deployment that a two-engineer team cannot maintain without introducing MTTD regression risk.

Amazon CloudWatch · Amazon Managed Service for Prometheus · AWS X-Ray

Whether the MTTD-under-five-minutes and low-operational-overhead constraints are jointly satisfied by AWS-managed observability services (CloudWatch Container Insights, Amazon Managed Service for Prometheus, AWS X-Ray) or by a self-managed Prometheus stack on EKS that meets the MTTD target but violates the operational ceiling by requiring scrape config ownership, retention tuning, and HA management.

Amazon CloudWatch · Amazon Managed Service for Prometheus · AWS X-Ray

Whether stateful, approval-gated orchestration (Step Functions waitForTaskToken + SSM Automation) is required instead of a lightweight event-notification pipeline (EventBridge → SNS → Lambda) or a single-service automation tool (SSM Automation alone) when the scenario explicitly mandates approval gates, automatic rollback on failure, and a durable consolidated audit trail.

AWS Systems Manager · Amazon EventBridge · AWS Step Functions
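
The approval gate the scenario mandates is the `.waitForTaskToken` integration pattern; the skeleton below sketches it in Amazon States Language (expressed as a Python dict). The function name and runbook document names are placeholders, and a real definition would add timeouts and richer error routing.

```python
# Amazon States Language skeleton for an approval-gated deployment with
# rollback. Function and runbook names are placeholders.
approval_state_machine = {
    "StartAt": "RequestApproval",
    "States": {
        "RequestApproval": {
            "Type": "Task",
            # waitForTaskToken pauses the execution until someone calls
            # SendTaskSuccess/SendTaskFailure with the token below.
            "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
            "Parameters": {
                "FunctionName": "notify-approver",            # placeholder
                "Payload": {"taskToken.$": "$$.Task.Token"},  # handed to approver
            },
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "Rollback"}],
            "Next": "RunAutomation",
        },
        "RunAutomation": {
            "Type": "Task",
            "Resource": "arn:aws:states:::aws-sdk:ssm:startAutomationExecution",
            "Parameters": {"DocumentName": "Deploy-Runbook"},  # placeholder
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "Rollback"}],
            "End": True,
        },
        "Rollback": {
            "Type": "Task",
            "Resource": "arn:aws:states:::aws-sdk:ssm:startAutomationExecution",
            "Parameters": {"DocumentName": "Rollback-Runbook"},  # placeholder
            "End": True,
        },
    },
}
```

The execution history of this state machine is itself the durable, consolidated audit trail the scenario asks for; the EventBridge → SNS → Lambda pipeline has no equivalent single record of approval, execution, and rollback.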

Whether the stated RTO < 30 min and RPO < 5 min targets can be fully satisfied by Multi-AZ redundancy combined with managed auto-scaling within a single Region, or whether cross-Region warm-standby architecture is necessary — with the 'minimal operational overhead' constraint serving as the decisive filter that eliminates cross-Region options before their resilience merits are evaluated.

Elastic Load Balancing (ELB) · Amazon EC2 Auto Scaling · Amazon Aurora

When a workload is stateless, short-duration, and bursty with long daily idle windows, Lambda plus API Gateway satisfies both minimize-operational-overhead and pay-per-use-efficiency simultaneously; a container-based alternative such as Fargate with ECS imposes task sizing, minimum running-task count, and cluster configuration overhead that directly violates the dominant constraint even though it eliminates EC2 instance management.

AWS Lambda · Amazon API Gateway · Amazon DynamoDB
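
The pay-per-use argument is arithmetic. The traffic profile below is an assumption chosen to match the scenario (bursty, long idle windows), and the rates are illustrative approximations of published list prices, not a quote.

```python
# Monthly cost sketch for a bursty workload: 2M requests/month, 200 ms
# average duration, 512 MB memory, idle most of the day. All rates are
# illustrative approximations, not current pricing.
requests = 2_000_000
gb_seconds = requests * 0.2 * 0.5          # 200 ms each at 0.5 GB

lambda_cost = (requests / 1e6) * 0.20 + gb_seconds * 0.0000166667

# Fargate must keep at least one task running to absorb the bursts:
hours = 730                                 # hours in a month
fargate_cost = hours * (0.25 * 0.04048 + 0.5 * 0.004445)  # 0.25 vCPU, 0.5 GB

# Lambda bills only while code executes, so the long idle windows cost
# nothing; the gap closes as traffic flattens toward steady-state.
```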

Whether to apply Savings Plans commitment to the existing m5.2xlarge fleet immediately (capturing the SP discount but preserving the over-provisioned instance type), or to use AWS Compute Optimizer utilization recommendations to downsize the instance type first and then apply Savings Plans to the right-sized fleet, capturing both the sizing saving and the pricing-model discount.

AWS Compute Optimizer · Amazon EC2 Auto Scaling · Savings Plans
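
The ordering argument reduces to simple arithmetic. The fleet spend, right-sizing factor, and discount rate below are assumptions for illustration.

```python
# Why sequence matters: commit Savings Plans first and the commitment is
# sized to over-provisioned spend; right-size first and the same discount
# applies to a smaller base. All numbers are illustrative assumptions.
on_demand_monthly = 10_000.0   # assumed m5.2xlarge fleet spend
rightsize_factor = 0.5         # assume Compute Optimizer recommends m5.xlarge
sp_discount = 0.30             # assumed Compute Savings Plan rate

sp_only = on_demand_monthly * (1 - sp_discount)
rightsize_then_sp = on_demand_monthly * rightsize_factor * (1 - sp_discount)

# Beyond the smaller bill, committing first locks the 1- or 3-year
# commitment at the larger spend; downsizing afterwards strands part of it.
```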

Domain Coverage

Design Solutions for Organizational Complexity · Design for New Solutions · Continuous Improvement for Existing Solutions · Accelerate Workload Migration and Modernization

Difficulty Breakdown

Hard: 63 · Expert: 9 · Medium: 5
