Near-Right Architecture — AWS DevOps Engineer (DOP-C02)
Two options were architecturally valid — you picked the one that violates a constraint buried in the scenario. Read constraints before evaluating answers.
Technically Correct, Professionally Wrong
The distractor uses real AWS services assembled in a recognizable pattern. It would pass a whiteboard review. The constraint it misses isn't obvious — it's usually cost at scale, cross-account blast radius, or operational ownership. DOP-C02 scenarios are written so that the wrong answer works in an unconstrained environment. Identify the governing constraint first; eliminate answers that don't satisfy it even when the rest of the architecture looks sound.
The Scenario
A company needs a real-time analytics dashboard querying petabytes of log data. The question offers Athena over S3 and Redshift Serverless. Both query structured data at scale. But the scenario says "sub-second response times for repeated queries": Athena scans S3 on every query (seconds to minutes), while Redshift Serverless caches results and returns them sub-second on repeats. The constraint is latency on repeated queries, not raw query capability. You picked Athena because it is serverless and cheaper per query, but the access pattern eliminates it.
How to Spot It
- When both answers use real AWS services that address the primary use case, re-read for the performance constraint. "Sub-second," "real-time," "single-digit millisecond" each eliminate different services. Athena is not sub-second. DynamoDB is not for complex joins. Aurora is not for petabyte-scale analytics.
- Look for protocol-level constraints. If the scenario says TCP traffic with client IP preservation, that eliminates CloudFront (HTTP/HTTPS only) and points to Global Accelerator + NLB. If it says HTTP with caching, that eliminates Global Accelerator.
- If you find yourself thinking "both could work," the exam is testing constraint reading. Check for: latency target, protocol, data volume, ordering requirement, or compliance region restriction.
Decision Rules
Which ECS deployment controller — native rolling update or CodeDeploy blue/green — satisfies a hard sub-minute rollback SLA, given that both strategies can achieve zero downtime during the happy path.
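A minimal sketch of the blue/green side of that choice, assuming an existing cluster, task definition, target group, and VPC resources (every name and ARN below is a placeholder). The deployment controller is fixed at service creation; with CODE_DEPLOY, rollback means shifting the listener back to the previous target group, which is what makes a sub-minute SLA realistic.

```python
import boto3

ecs = boto3.client("ecs")

# CODE_DEPLOY hands deployments (and rollbacks) to CodeDeploy, which swaps
# traffic between two target groups instead of cycling tasks in place.
ecs.create_service(
    cluster="orders-cluster",              # placeholder names throughout
    serviceName="orders-api",
    taskDefinition="orders-api:1",
    desiredCount=4,
    launchType="FARGATE",
    deploymentController={"type": "CODE_DEPLOY"},
    loadBalancers=[{
        "targetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/orders-blue/0123456789abcdef",
        "containerName": "orders-api",
        "containerPort": 8080,
    }],
    networkConfiguration={"awsvpcConfiguration": {
        "subnets": ["subnet-0123456789abcdef0"],
        "securityGroups": ["sg-0123456789abcdef0"],
    }},
)
```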
Whether to configure the CodeDeploy Lambda deployment with a gradual traffic-shift preference (canary or linear) that bounds blast radius and enables meaningful pre-promotion alarm-triggered rollback, or accept an all-at-once shift that maximizes speed but exposes 100% of traffic before the alarm evaluation window can protect customers.
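A sketch of the gradual option, assuming a CodeDeploy application already created with the Lambda compute platform (application, role, and alarm names are placeholders). The built-in canary configuration holds 10% of traffic for five minutes, which is the window in which the alarm can trigger rollback before full promotion.

```python
import boto3

codedeploy = boto3.client("codedeploy")

# Canary 10%/5min: shift 10% of traffic to the new version, hold while the
# alarm evaluates, then promote. All-at-once would expose 100% immediately.
codedeploy.create_deployment_group(
    applicationName="checkout-lambda-app",
    deploymentGroupName="checkout-canary",
    serviceRoleArn="arn:aws:iam::123456789012:role/CodeDeployServiceRole",
    deploymentConfigName="CodeDeployDefault.LambdaCanary10Percent5Minutes",
    deploymentStyle={
        "deploymentType": "BLUE_GREEN",
        "deploymentOption": "WITH_TRAFFIC_CONTROL",
    },
    alarmConfiguration={
        "enabled": True,
        "alarms": [{"name": "checkout-5xx-rate"}],   # pre-existing CloudWatch alarm
    },
    autoRollbackConfiguration={
        "enabled": True,
        "events": ["DEPLOYMENT_STOP_ON_ALARM"],
    },
)
```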
Select organizational-level preventive enforcement (SCP deny via AWS Organizations, enforced automatically at account enrollment by Control Tower) over account-level detective controls (Config rules with auto-remediation) when the requirement is to block a disallowed API call before it executes rather than detect and correct it afterward.
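A sketch of the preventive control, run from the management account or a delegated administrator (the denied action, policy name, and OU ID are placeholders). Because the SCP is evaluated before any IAM policy in the member account, the disallowed call never executes.

```python
import json
import boto3

org = boto3.client("organizations")

# Example guardrail: deny a specific API call org-wide (placeholder action).
scp = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": ["cloudtrail:StopLogging"],
        "Resource": "*",
    }],
}

policy = org.create_policy(
    Name="deny-cloudtrail-stop",
    Description="Block disabling CloudTrail in all member accounts",
    Type="SERVICE_CONTROL_POLICY",
    Content=json.dumps(scp),
)

# Attaching at the OU level covers existing accounts and any account Control
# Tower later enrolls into that OU.
org.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="ou-abcd-12345678",
)
```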
Whether to architect per-service independent CodePipeline pipelines using CodeDeploy blue/green traffic shifting (with automatic rollback on CloudWatch alarm) or to consolidate services into a single shared pipeline using in-place deployment — when the dominant constraint is bounded, service-isolated rollback rather than pipeline setup simplicity.
Whether adding a native CodeBuild test action inside a CodePipeline stage satisfies the fail-fast and feedback-loop-speed constraints with less operational overhead than introducing a Step Functions state machine to orchestrate the same test execution.
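A sketch of the native option, shown as the stage fragment you would splice into an existing codepipeline.create_pipeline or update_pipeline call (project and artifact names are placeholders). A failing build fails the stage, so the pipeline stops immediately with no extra orchestration layer.

```python
# Stage fragment for the pipeline's "stages" list. When the CodeBuild run
# fails, the stage fails and the pipeline halts -- fail-fast with no
# Step Functions state machine to build or maintain.
test_stage = {
    "name": "Test",
    "actions": [{
        "name": "UnitAndIntegrationTests",
        "actionTypeId": {
            "category": "Test",
            "owner": "AWS",
            "provider": "CodeBuild",
            "version": "1",
        },
        "configuration": {"ProjectName": "orders-api-tests"},
        "inputArtifacts": [{"name": "SourceOutput"}],
        "runOrder": 1,
    }],
}
```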
Whether to consolidate both artifact types into versioned S3 buckets or route each type to its purpose-built repository service — CodeArtifact for Maven packages and ECR for container images — to satisfy artifact-immutability and automated scanning with the lowest operational overhead.
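A sketch of the purpose-built route (domain and repository names are placeholders). Immutability and scanning are repository settings here, not something you have to reconstruct with S3 versioning, bucket policies, and custom scanners.

```python
import boto3

ecr = boto3.client("ecr")
codeartifact = boto3.client("codeartifact")

# Container images: tag immutability blocks overwriting a released tag, and
# scan-on-push runs vulnerability scanning automatically on every push.
ecr.create_repository(
    repositoryName="orders-api",
    imageTagMutability="IMMUTABLE",
    imageScanningConfiguration={"scanOnPush": True},
)

# Maven packages: published package versions are immutable, and the
# repository can proxy public registries through upstream connections.
codeartifact.create_domain(domain="example-corp")
codeartifact.create_repository(
    domain="example-corp",
    repository="maven-releases",
    description="Internal Maven release artifacts",
)
```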
Whether native SSM Automation runbooks distributed via AWS Organizations (zero custom code, managed propagation) outweigh a Step Functions + Lambda custom orchestration pipeline (richer logic, but per-account deployment and code maintenance) when the dominant constraint is operational-complexity-budget for a small team.
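A sketch of how the native route fans out, assuming a runbook already shared across the organization and the standard multi-account execution role in place (document name, account IDs, and Regions are placeholders). One API call targets every account and Region, with nothing to deploy per account.

```python
import boto3

ssm = boto3.client("ssm")

# Multi-account, multi-Region Automation: SSM assumes the execution role in
# each target account, so there is no per-account Lambda or Step Functions
# stack to maintain.
ssm.start_automation_execution(
    DocumentName="Example-RestartStuckAgents",   # shared Automation runbook (placeholder)
    TargetLocations=[{
        "Accounts": ["111111111111", "222222222222"],
        "Regions": ["us-east-1", "eu-west-1"],
        "ExecutionRoleName": "AWS-SystemsManager-AutomationExecutionRole",
        "TargetLocationMaxConcurrency": "10%",
        "TargetLocationMaxErrors": "1",
    }],
)
```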
Whether centralized push-model lifecycle management (StackSets with SERVICE_MANAGED permissions and scheduled drift detection) or pull-model governed provisioning (Service Catalog portfolio with launch constraints) satisfies the simultaneous requirements of template propagation to existing deployments and bounded out-of-band drift detection across all member accounts.
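A sketch of the push model, run from the management or delegated-admin account (stack set name and template URL are placeholders). Auto-deployment handles template propagation to new accounts; a scheduled drift-detection call bounds how long out-of-band changes go unnoticed.

```python
import boto3

cfn = boto3.client("cloudformation")

# SERVICE_MANAGED permissions deploy through AWS Organizations without
# per-account execution roles; auto-deployment pushes the template to
# accounts as they join the targeted OUs.
cfn.create_stack_set(
    StackSetName="baseline-logging",
    TemplateURL="https://example-bucket.s3.amazonaws.com/baseline-logging.yaml",
    PermissionModel="SERVICE_MANAGED",
    AutoDeployment={"Enabled": True, "RetainStacksOnAccountRemoval": False},
    Capabilities=["CAPABILITY_NAMED_IAM"],
)

# Run this on a schedule (EventBridge Scheduler, for example) so drift in any
# member account surfaces within a bounded window.
operation = cfn.detect_stack_set_drift(
    StackSetName="baseline-logging",
    OperationPreferences={"MaxConcurrentPercentage": 25},
)
print(operation["OperationId"])
```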
Choose the compute abstraction level — EC2 Auto Scaling behind a load balancer versus Lambda fronted by API Gateway — that simultaneously satisfies the bursty-traffic scaling-latency-budget (near-instant scale-out) and the team's operational-complexity ceiling (no AMI lifecycle, no cluster management).
Whether to route continuous log-derived metrics through a push-based pipeline (CloudWatch Logs metric filters emitting CloudWatch custom metrics, or subscription filters fanning out to Kinesis Data Firehose) versus relying on pull-based ad-hoc queries (CloudWatch Logs Insights) scheduled externally to approximate near-real-time dashboard freshness.
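A sketch of the push model (log group, pattern, and metric names are placeholders). Every matching event increments the metric at ingestion time, so the dashboard stays current without an external scheduler running Insights queries.

```python
import boto3

logs = boto3.client("logs")

# Metric filter: matching log events emit a CloudWatch custom metric as they
# are ingested -- a continuous push rather than a periodic pull.
logs.put_metric_filter(
    logGroupName="/ecs/orders-api",
    filterName="backend-5xx-count",
    filterPattern="[ip, user, ts, request, status=5*, size]",
    metricTransformations=[{
        "metricName": "Backend5xxCount",
        "metricNamespace": "OrdersApi",
        "metricValue": "1",
        "defaultValue": 0.0,
    }],
)
```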
Whether EventBridge content-based rules with direct Lambda targets satisfy the delivery-guarantee and sub-second latency requirement at lower operational cost than an SNS fan-out with SQS queue buffers feeding Lambda consumers.
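A sketch of the EventBridge-only path (rule and function names are placeholders; the function also needs a resource-based permission for events.amazonaws.com, omitted here). The content filtering lives in the rule, so only matching events invoke the consumer and there is no topic or queue to operate.

```python
import json
import boto3

events = boto3.client("events")

# Content-based rule: only OrderPlaced events with total > 1000 match.
events.put_rule(
    Name="high-value-orders",
    EventPattern=json.dumps({
        "source": ["orders.service"],
        "detail-type": ["OrderPlaced"],
        "detail": {"total": [{"numeric": [">", 1000]}]},
    }),
    State="ENABLED",
)

# Direct Lambda target; a DeadLetterConfig can be added per target if failed
# invocations must be retained.
events.put_targets(
    Rule="high-value-orders",
    Targets=[{
        "Id": "notify-lambda",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:notify-ops",
    }],
)
```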
Whether to initiate rollback at the application-deployment layer via CodeDeploy automatic rollback on a CloudWatch alarm, or at the infrastructure-stack layer via a CloudFormation stack rollback, when the failure manifests as application-layer ECS task health rather than infrastructure configuration drift.
Whether the identity governance architecture must include all three layers—SCP deny guardrails, IAM Identity Center permission sets for grant scoping, and Secrets Manager automated rotation—or whether a near-right design using IAM Identity Center plus per-account IAM roles with manual or Lambda-triggered rotation satisfies the stated constraints.
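A sketch of the rotation layer the near-right design drops, assuming a rotation function already exists (secret ID and ARN are placeholders). Secrets Manager owns the schedule, secret versioning, and retries, rather than a hand-rolled scheduler invoking a Lambda.

```python
import boto3

secrets = boto3.client("secretsmanager")

# Managed rotation: Secrets Manager invokes the rotation function on schedule,
# stages new secret versions, and retries failures -- no custom cron to run.
secrets.rotate_secret(
    SecretId="prod/orders/db-credentials",
    RotationLambdaARN="arn:aws:lambda:us-east-1:123456789012:function:rotate-rds-secret",
    RotationRules={"AutomaticallyAfterDays": 30},
)
```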