AWS · DOP-C02

Multi-Service Tradeoff — AWS DevOps Engineer (DOP-C02)

54% of exam questions (108 of 200)

Container Orchestration Choice Is a Constraint Problem

ECS and EKS both run containers. Lambda removes server management entirely. SQS decouples the work. The exam doesn't ask which service is more capable; it asks which combination satisfies the specific mix of operational control, scaling latency, and team expertise the scenario defines. A workload requiring Kubernetes-native tooling and existing Helm charts points to EKS. A workload requiring minimal cluster management with AWS-native integration points to ECS. Identify the dominant constraint before matching services.

What This Pattern Tests

The exam gives you a decoupling requirement and tests whether you pick the right messaging service. SQS is point-to-point, with at-least-once delivery (Standard) or exactly-once processing (FIFO, up to 300 msg/s per API action without batching and 3,000 msg/s with batching, before high-throughput mode). SNS is pub/sub fan-out to multiple subscribers. EventBridge is content-based routing with schema registry and 35+ AWS service sources. The trap is choosing SQS for fan-out (use SNS) or SNS for ordered processing (use SQS FIFO). DynamoDB vs. Aurora vs. ElastiCache follows the same pattern: key-value at any scale vs. relational joins vs. microsecond reads from memory.

Decision Axis

Communication pattern (point-to-point vs. fan-out vs. content routing) and data access pattern (key-value vs. relational vs. cache) determine the service.
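The decision axis above can be sketched as a small lookup. This is a minimal illustration only: the `MessagingNeed` type, its field names, and the precedence order are assumptions made for the example, not an official selection algorithm, and it deliberately ignores combined cases such as ordered fan-out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MessagingNeed:
    """Hypothetical summary of a scenario's communication requirements."""
    multiple_subscribers: bool  # the same message must reach several consumers
    content_routing: bool       # routing depends on fields inside the event
    strict_ordering: bool       # a single consumer must see messages in order


def choose_messaging(need: MessagingNeed) -> str:
    """Map a communication pattern to a service, following the rules in the text."""
    if need.content_routing:
        return "EventBridge"    # content-based routing
    if need.multiple_subscribers:
        return "SNS"            # pub/sub fan-out
    if need.strict_ordering:
        return "SQS FIFO"       # ordered point-to-point
    return "SQS Standard"       # at-least-once point-to-point
```

For example, `choose_messaging(MessagingNeed(False, False, True))` selects SQS FIFO, which is the answer the "SNS for ordered processing" trap is steering you away from.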

Associated Traps

Decision Rules

Whether to architect per-service independent CodePipeline pipelines using CodeDeploy blue/green traffic shifting (with automatic rollback on CloudWatch alarm) or to consolidate services into a single shared pipeline using in-place deployment — when the dominant constraint is bounded, service-isolated rollback rather than pipeline setup simplicity.

AWS CodePipeline · AWS CodeDeploy · AWS Secrets Manager
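As a rough sketch of what "blue/green with automatic rollback on CloudWatch alarm" looks like in configuration, the helper below builds the relevant settings for one service's deployment group. The key names follow the CodeDeploy `CreateDeploymentGroup` request shape as used by boto3; the helper name and the alarm name are hypothetical, and a real call would also need an application name, group name, service role, and load balancer details.

```python
def blue_green_group_settings(alarm_name: str) -> dict:
    """Build the rollback-related settings for one service's deployment group.

    Intended to be merged into the keyword arguments of a boto3
    codedeploy create_deployment_group call (not made here).
    """
    return {
        "deploymentStyle": {
            "deploymentType": "BLUE_GREEN",
            "deploymentOption": "WITH_TRAFFIC_CONTROL",
        },
        # Stop the deployment when this CloudWatch alarm fires.
        "alarmConfiguration": {"enabled": True, "alarms": [{"name": alarm_name}]},
        # Roll back automatically on failure or on an alarm-triggered stop.
        "autoRollbackConfiguration": {
            "enabled": True,
            "events": ["DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_ALARM"],
        },
    }


# One settings object per service keeps rollback isolated to that service.
settings = {
    svc: blue_green_group_settings(f"{svc}-5xx-alarm")  # hypothetical alarm names
    for svc in ("orders", "payments")
}
```

Building one deployment group per service is what bounds the rollback: an alarm on `orders` never reverts `payments`.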

Whether adding a native CodeBuild test action inside a CodePipeline stage satisfies the fail-fast and feedback-loop-speed constraints with less operational overhead than introducing a Step Functions state machine to orchestrate the same test execution.

AWS CodePipeline · AWS CodeBuild · AWS Step Functions

Whether to consolidate both artifact types into versioned S3 buckets or route each type to its purpose-built repository service — CodeArtifact for Maven packages and ECR for container images — to satisfy artifact-immutability and automated scanning with the lowest operational overhead.

AWS CodeArtifact · Amazon Elastic Container Registry (Amazon ECR) · Amazon S3
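For the container-image half of this rule, immutability and automated scanning are repository settings rather than custom tooling. The sketch below builds keyword arguments in the shape of the ECR `CreateRepository` request as used by boto3; the helper name and repository name are hypothetical, and the call itself is not made.

```python
def immutable_scanned_repo(name: str) -> dict:
    """Keyword arguments for a boto3 ecr create_repository call (sketch only)."""
    return {
        "repositoryName": name,
        # Reject pushes that would overwrite an existing tag.
        "imageTagMutability": "IMMUTABLE",
        # Run a vulnerability scan each time an image is pushed.
        "imageScanningConfiguration": {"scanOnPush": True},
    }


repo_args = immutable_scanned_repo("example/service-api")  # hypothetical name
```

Achieving the same two properties on a plain S3 bucket would mean wiring versioning, object lock or bucket policies, and an external scanner, which is the operational overhead the rule is weighing.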

Whether to use per-service CodeDeploy blue/green deployments with independent rollback groups — which atomically restore the prior known-good target group (satisfying both rollback initiation speed and clean rollback-destination state) — versus a shared rolling deployment that meets rollback initiation time but restores services to a partially-applied mixed state rather than a clean prior artifact version.

AWS CodePipeline · AWS CodeDeploy · AWS Secrets Manager

Whether native CodeBuild batch builds (or parallel CodeBuild actions in one pipeline stage) with CloudWatch test reports satisfy the parallel-execution, fail-fast, and dashboard requirements at lower operational cost than a custom Step Functions + Lambda orchestration layer.

AWS CodeBuild · AWS CodePipeline · Amazon CloudWatch

Whether to route each artifact type to its purpose-built repository service (CodeArtifact for Maven, ECR for Docker images) or consolidate both types into a single general-purpose store and layer custom tooling for dependency resolution, metadata indexing, and vulnerability scanning on top.

AWS CodeArtifact · Amazon Elastic Container Registry (Amazon ECR) · AWS CodeBuild

Whether to use per-service CodePipelines with CodeDeploy blue/green traffic shifting or a single shared CodePipeline with CodeDeploy rolling updates — determined by whether the rollback mechanism provides a deterministic sub-2-minute recovery time regardless of fleet size.

AWS CodePipeline · AWS CodeDeploy · AWS CodeBuild
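The "regardless of fleet size" condition in this rule is easiest to see with arithmetic. The model below is a deliberately simplified sketch: it assumes a rolling rollback redeploys the previous revision batch by batch, while a blue/green rollback only shifts traffic back to the original environment. All numbers are hypothetical and chosen purely for illustration.

```python
import math


def rolling_rollback_seconds(fleet_size: int, batch_size: int,
                             seconds_per_batch: int) -> int:
    """Rollback time grows with the number of batches that must be redeployed."""
    batches = math.ceil(fleet_size / batch_size)
    return batches * seconds_per_batch


def blue_green_rollback_seconds(traffic_shift_seconds: int) -> int:
    """Rollback is a traffic switch, so fleet size does not appear at all."""
    return traffic_shift_seconds


# Hypothetical inputs: batches of 10 instances, 90 s per batch, 30 s traffic shift.
small = rolling_rollback_seconds(40, 10, 90)    # 4 batches  -> 360 s
large = rolling_rollback_seconds(200, 10, 90)   # 20 batches -> 1800 s
switch = blue_green_rollback_seconds(30)        # 30 s for either fleet
```

Under these assumptions the rolling path misses a 120-second target even for the small fleet, and its recovery time keeps growing with fleet size, whereas the blue/green figure stays constant.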

Whether native CodeBuild test actions inside a CodePipeline stage satisfy the integration-test gating requirement through built-in exit-code-driven stage failure, or whether external Step Functions orchestration is necessary to meet the stated constraint.

AWS CodeBuild · AWS CodePipeline · AWS Step Functions

Whether to replicate artifacts using CodeArtifact cross-region replication plus ECR cross-region replication — preserving dependency-resolution graphs, package metadata, and image-scan results alongside binaries — or to consolidate onto S3 CRR, which replicates object bytes quickly enough to satisfy the RTO but cannot replicate the metadata layer required to satisfy the 5-minute RPO for full build reproducibility.

AWS CodeArtifact · Amazon Elastic Container Registry (Amazon ECR) · Amazon S3

Whether to publish approved infrastructure patterns as Service Catalog portfolios with launch constraints (governed self-service with scoped IAM, built-in drift tracking) or share CloudFormation modules via a private registry (composition-time reusability that still requires broad IAM for each team and leaves drift detection as an independently wired operational burden per account).

AWS Service Catalog · AWS CloudFormation · AWS Config

Whether native SSM Automation runbooks distributed via AWS Organizations (zero custom code, managed propagation) outweigh a Step Functions + Lambda custom orchestration pipeline (richer logic, but per-account deployment and code maintenance) when the dominant constraint is operational-complexity-budget for a small team.

AWS Systems Manager · AWS Step Functions · AWS Lambda

Whether CloudFormation StackSets alone satisfies both a self-service provisioning RTO and a continuous drift-detection RPO, or whether two separate services — Service Catalog portfolios for RTO and AWS Config continuous evaluation for RPO — are required because StackSets drift detection is on-demand and provisioning is admin-gated.

AWS Service Catalog · AWS CloudFormation · AWS Config

Whether native SSM Automation runbooks propagated via State Manager Organizations-level associations satisfy the compliance enforcement requirement with lower total operational cost than a custom Step Functions + Lambda orchestration deployed into each member account.

AWS Systems Manager · AWS Step Functions · AWS Lambda

Whether to enforce infrastructure parameter guardrails at authoring time (CDK constructs or CloudFormation modules, which harden defaults at synthesis or template-creation time) or at provisioning time (Service Catalog portfolios with launch constraints, which block parameter overrides at the moment a developer initiates a stack deployment) — the scenario's 'no parameter override during provisioning' constraint disqualifies authoring-time approaches because a developer with template access can still override values at stack-create time.

AWS Service Catalog · AWS CloudFormation · AWS Cloud Development Kit (AWS CDK)

Whether detection latency (the 2-minute non-compliance visibility window, analogous to RPO) requires AWS Config change-triggered evaluation, independent of whether the chosen remediation path (analogous to RTO) can meet the 10-minute SLA.

AWS Config · Amazon EventBridge · AWS Systems Manager

Whether centralized push-model lifecycle management (StackSets with SERVICE_MANAGED permissions and scheduled drift detection) or pull-model governed provisioning (Service Catalog portfolio with launch constraints) satisfies the simultaneous requirements of template propagation to existing deployments and bounded out-of-band drift detection across all member accounts.

AWS CloudFormation · AWS Service Catalog · AWS Config
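The push model in this rule hinges on the service-managed permission model. A minimal sketch of the relevant request fields follows, using the CloudFormation `CreateStackSet` shape as exposed by boto3; the helper name, stack set name, and template URL are placeholders, and the scheduling of drift detection is outside this sketch.

```python
def service_managed_stack_set(name: str, template_url: str) -> dict:
    """Keyword arguments for a boto3 cloudformation create_stack_set call (sketch)."""
    return {
        "StackSetName": name,
        "TemplateURL": template_url,
        # Let CloudFormation manage the cross-account roles through Organizations.
        "PermissionModel": "SERVICE_MANAGED",
        # Deploy automatically to accounts that join the targeted OUs.
        "AutoDeployment": {
            "Enabled": True,
            "RetainStacksOnAccountRemoval": False,
        },
    }


stack_set_args = service_managed_stack_set(
    "baseline-guardrails",                                  # hypothetical name
    "https://example-bucket.s3.amazonaws.com/baseline.yaml",  # placeholder URL
)
```

Updating the stack set template then propagates to existing stack instances, which is the "template propagation to existing deployments" requirement that a pull-model portfolio does not satisfy on its own.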

Whether to enforce patch compliance at scale using native SSM Automation runbooks propagated as State Manager associations via Organizations — requiring no per-account code deployment — or to build a Step Functions + Lambda orchestration that can achieve the same remediation but incurs per-account deployment, IAM role provisioning, and code maintenance overhead the two-person team cannot sustainably absorb.

AWS Systems Manager · AWS Step Functions · AWS Lambda

Choose the compute abstraction level — EC2 Auto Scaling behind a load balancer versus Lambda fronted by API Gateway — that simultaneously satisfies the bursty-traffic scaling-latency-budget (near-instant scale-out) and the team's operational-complexity ceiling (no AMI lifecycle, no cluster management).

AWS Lambda · Amazon API Gateway · Amazon EC2 Auto Scaling

Whether to retain EC2 Auto Scaling, which provides steady-state Multi-AZ HA but cannot deliver sub-60-second new-capacity provisioning during burst events, or migrate to Lambda fronted by API Gateway, which provides sub-second concurrent scaling that satisfies both the scale-out RTO and the zero-request-loss RPO in burst scenarios.

AWS Lambda · Amazon API Gateway · Amazon EC2 Auto Scaling

Whether to route continuous log-derived metrics through a push-based pipeline (CloudWatch Logs metric filters emitting CloudWatch custom metrics, or subscription filters fanning out to Kinesis Data Firehose) versus relying on pull-based ad-hoc queries (CloudWatch Logs Insights) scheduled externally to approximate near-real-time dashboard freshness.

Amazon CloudWatch Logs · Amazon CloudWatch · Amazon Kinesis Data Firehose

Whether to derive dashboard metrics through a continuous in-stream mechanism (CloudWatch Logs metric filter emitting a custom CloudWatch metric) or through a periodic batch query mechanism (CloudWatch Logs Insights scheduled query), given a hard sub-60-second dashboard freshness requirement and a lowest-operational-overhead constraint.

Amazon CloudWatch Logs · Amazon CloudWatch · Amazon Kinesis Data Firehose
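The continuous in-stream option amounts to a single metric filter definition. The sketch below builds arguments in the shape of the CloudWatch Logs `PutMetricFilter` request as used by boto3; the helper name, log group, filter pattern, and metric names are all hypothetical.

```python
def error_count_metric_filter(log_group: str) -> dict:
    """Keyword arguments for a boto3 logs put_metric_filter call (sketch only)."""
    return {
        "logGroupName": log_group,
        "filterName": "error-count",
        # Match any log event containing the term ERROR.
        "filterPattern": "ERROR",
        "metricTransformations": [
            {
                "metricName": "ErrorCount",
                "metricNamespace": "Example/App",
                "metricValue": "1",    # add 1 per matching log event
                "defaultValue": 0.0,   # publish 0 when nothing matches
            }
        ],
    }


filter_args = error_count_metric_filter("/example/app")  # hypothetical log group
```

Once the filter exists, the metric is emitted as log events arrive, so the dashboard needs no scheduler or query runner to stay fresh.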

Whether EventBridge content-based rules with direct Lambda targets satisfy the delivery-guarantee and sub-second latency requirement at lower operational cost than an SNS fan-out with SQS queue buffers feeding Lambda consumers.

Amazon EventBridge · AWS Health · AWS Lambda

Whether to use AWS Config continuous evaluation paired with a managed SSM Automation document via Config auto-remediation — satisfying both the detection-frequency constraint (no polling gap, RPO-equivalent) and the automated end-to-end remediation SLA (deterministic, code-free execution, RTO-equivalent) — versus a solution that captures compliance state reliably but routes remediation through a notification or custom-code path that cannot guarantee the time-bound automated fix.

AWS Config · AWS Systems Manager · Amazon EventBridge

Whether to initiate rollback at the application-deployment layer via CodeDeploy automatic rollback on a CloudWatch alarm, or at the infrastructure-stack layer via a CloudFormation stack rollback, when the failure manifests as application-layer ECS task health rather than infrastructure configuration drift.

AWS CodeDeploy · Amazon CloudWatch · Amazon Elastic Container Service (Amazon ECS)

Whether EventBridge's native retry policy plus a Lambda dead-letter queue satisfies the zero-event-loss delivery guarantee within the 90-second latency ceiling, or whether inserting SQS between EventBridge and Lambda is required to prevent event loss—recognising that the SQS polling model trades deterministic push latency for buffered pull latency, potentially violating the 90-second processing window (RTO analog) while adding queue-management overhead the scenario does not justify.

Amazon EventBridge · Amazon Simple Queue Service (Amazon SQS) · AWS Lambda
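A sketch of the retry-plus-dead-letter option, assuming the dead-letter queue is attached at the EventBridge target (a Lambda function's own dead-letter queue for asynchronous invocations is a separate setting). The field names follow the EventBridge `PutTargets` request shape as used by boto3; the helper name, target ID, ARNs, and retry count are placeholders.

```python
def lambda_target(function_arn: str, dlq_arn: str) -> dict:
    """One entry for the Targets list of a boto3 events put_targets call (sketch)."""
    return {
        "Id": "process-event",  # hypothetical target ID
        "Arn": function_arn,
        "RetryPolicy": {
            "MaximumRetryAttempts": 3,
            # Stop retrying once the event is older than the 90-second ceiling.
            "MaximumEventAgeInSeconds": 90,
        },
        # Events that exhaust retries or age out are sent here instead of being dropped.
        "DeadLetterConfig": {"Arn": dlq_arn},
    }


target = lambda_target(
    "arn:aws:lambda:us-east-1:111122223333:function:example",  # placeholder ARN
    "arn:aws:sqs:us-east-1:111122223333:example-dlq",          # placeholder ARN
)
```

With this shape the delivery guarantee is handled on the push path, so no intermediate queue or poller sits between the rule and the function.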

Choose between AWS Config auto-remediation invoking a managed SSM Automation document (declarative, no custom code, purpose-built MTTR path) versus routing Config compliance-change events through EventBridge to a custom Lambda function (imperative, adds cold-start latency, custom IAM surface, and bespoke failure-handling code), where the stated MTTR constraint plus the zero-custom-code requirement together disqualify the Lambda path.

AWS Config · AWS Systems Manager · Amazon EventBridge
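The declarative path can be sketched as one remediation configuration. The structure follows the AWS Config `PutRemediationConfigurations` request shape as used by boto3; the managed document `AWS-DisableS3BucketPublicReadWrite` is used only as an example, and the helper name, rule name, role ARN, and retry values are assumptions for illustration.

```python
def auto_remediation(rule_name: str, role_arn: str) -> dict:
    """One entry for RemediationConfigurations in a boto3 config
    put_remediation_configurations call (sketch only)."""
    return {
        "ConfigRuleName": rule_name,
        "TargetType": "SSM_DOCUMENT",
        # Example managed Automation document; no custom code involved.
        "TargetId": "AWS-DisableS3BucketPublicReadWrite",
        "Automatic": True,               # remediate without a manual trigger
        "MaximumAutomaticAttempts": 3,
        "RetryAttemptSeconds": 60,
        "Parameters": {
            "AutomationAssumeRole": {"StaticValue": {"Values": [role_arn]}},
            # Pass the ID of the non-compliant resource into the document.
            "S3BucketName": {"ResourceValue": {"Value": "RESOURCE_ID"}},
        },
    }


remediation = auto_remediation(
    "s3-bucket-public-read-prohibited",                   # example rule name
    "arn:aws:iam::111122223333:role/example-remediation",  # placeholder ARN
)
```

Everything here is configuration, which is why this path meets a zero-custom-code requirement that an EventBridge-to-Lambda route cannot.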

Match the rollback tool to the exact failure layer — application deployment (CodeDeploy) versus infrastructure stack (CloudFormation) — given a hard five-minute RTO that infrastructure-layer rollback cannot satisfy for running ECS task sets.

AWS CodeDeploy · Amazon Elastic Container Service (Amazon ECS) · AWS CloudFormation

Domain Coverage

SDLC Automation · Configuration Management and IaC · Resilient Cloud Solutions · Monitoring and Logging · Incident and Event Response

Difficulty Breakdown

Medium: 36 · Hard: 52 · Expert: 20