RPO vs. RTO Confusion — AWS DevOps Engineer (DOP-C02)
You confused recovery point objective (data loss tolerance) with recovery time objective (downtime tolerance). Different requirements, different architectures.
The Recovery Target You're Actually Being Tested On
A scenario gives you an RTO of 15 minutes and an RPO of 1 hour. You spot Multi-AZ immediately — automatic failover, sub-minute cutover — and that instinct is correct for RTO. But Multi-AZ's synchronous replication addresses infrastructure failure, not data age: corruption or a logical error replicates instantly, so it is not by itself a recovery-point guarantee. The exam is asking whether you can distinguish the time-to-restore constraint from the data-loss-tolerance constraint. Choosing the strategy that satisfies one while ignoring the other is the designed failure.
The Scenario
A financial application requires RPO of 1 hour and RTO of 15 minutes. You design a Pilot Light strategy with Aurora read replicas in the DR region using asynchronous replication. Pilot Light infrastructure can spin up in 15 minutes (meets RTO). But asynchronous replication to a read replica can lag by several hours during peak loads — if the primary fails during a replication lag spike, you lose more than 1 hour of transactions (violates RPO). The correct answer is Warm Standby with Aurora Global Database, which replicates with typical lag under 1 second and provides a pre-scaled environment for fast failover. You satisfied RTO but forgot to verify RPO independently.
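You can check the RPO side of this design empirically: an Aurora Global Database secondary cluster reports the AuroraGlobalDBReplicationLag metric (in milliseconds) to CloudWatch. A minimal boto3 sketch, assuming a hypothetical secondary cluster named dr-secondary-cluster in us-west-2:

```python
import boto3
from datetime import datetime, timedelta, timezone

# Hypothetical identifiers; AuroraGlobalDBReplicationLag is reported in
# milliseconds by the secondary cluster of an Aurora Global Database.
cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/RDS",
    MetricName="AuroraGlobalDBReplicationLag",
    Dimensions=[{"Name": "DBClusterIdentifier", "Value": "dr-secondary-cluster"}],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
    Period=300,
    Statistics=["Maximum"],
)

# A 1-hour RPO is only safe if the worst observed lag stays far below it.
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], f'{point["Maximum"]:.0f} ms')
```

Running the same check against a plain cross-region read replica's ReplicaLag metric is what exposes the Pilot Light design: that lag is unbounded under load.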
How to Spot It
- RPO drives your replication strategy: RPO of 0 requires synchronous replication (Multi-AZ RDS, Aurora Multi-AZ). RPO of 1 hour can use asynchronous replication if the lag is bounded under 1 hour (Aurora Global Database typical lag < 1 second). RPO of 24 hours can use daily snapshots.
- RTO drives your failover infrastructure: RTO of minutes requires Warm Standby or Multi-Site Active-Active with pre-provisioned compute. RTO of hours allows Pilot Light (minimal infrastructure, scaled up on failover). RTO of days allows Backup and Restore from S3/snapshots.
- When a question gives both RPO and RTO, evaluate each independently against every answer option. An answer that meets RTO but fails RPO is wrong. The exam specifically designs options that satisfy one but not the other; the sketch after this list runs the two checks separately.
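To make the independent-evaluation habit concrete, here is a toy helper (thresholds are the rough tiers from the bullets above, not official AWS cutoffs) that scores each constraint on its own:

```python
# Illustrative only: evaluate RPO and RTO independently, the way the exam
# expects. An option must pass BOTH checks; passing one proves nothing.

def replication_for(rpo_minutes: float) -> str:
    """Map a data-loss tolerance to a replication tier."""
    if rpo_minutes == 0:
        return "synchronous replication (Multi-AZ RDS, Aurora Multi-AZ)"
    if rpo_minutes <= 60:
        return "asynchronous replication with bounded lag (Aurora Global Database)"
    return "periodic snapshots (Backup and Restore)"

def failover_for(rto_minutes: float) -> str:
    """Map a downtime tolerance to a failover-infrastructure tier."""
    if rto_minutes <= 60:
        return "Warm Standby or Multi-Site Active-Active"
    if rto_minutes <= 24 * 60:
        return "Pilot Light"
    return "Backup and Restore from S3/snapshots"

rpo, rto = 60, 15  # the scenario above: RPO 1 hour, RTO 15 minutes
print(f"RPO {rpo} min needs: {replication_for(rpo)}")
print(f"RTO {rto} min needs: {failover_for(rto)}")
```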
Decision Rules
When both zero-downtime and a sub-60-second rollback RTO are explicit requirements for an ECS workload, blue/green deployment (retaining the original task set for instant ALB listener re-route) must be chosen over rolling update (which requires launching new tasks to roll back, making rollback latency minutes rather than seconds).
Whether to use the ECS rolling update deployment type or CodeDeploy ECS blue/green — the 60-second rollback RTO is the deciding constraint because rolling update has no independent stable target group to atomically reroute to, while blue/green can flip the ALB listener rule back in seconds.
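The mechanism behind that speed: during a blue/green deployment the original (blue) task set keeps running behind its own target group, so rollback is one listener update rather than a task relaunch. A hedged sketch with placeholder ARNs; CodeDeploy performs this flip for you, the call just shows why it completes in seconds:

```python
import boto3

# Placeholder ARNs for illustration only.
BLUE_TG = "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/blue/abc123"
LISTENER = "arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/prod/xyz789"

elbv2 = boto3.client("elbv2")

# Rollback = point the production listener back at the still-warm blue
# target group. No new tasks are launched, so latency is seconds.
elbv2.modify_listener(
    ListenerArn=LISTENER,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": BLUE_TG}],
)
```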
Whether a 90-second rollback RTO can be satisfied by CodeDeploy in-place rolling deployment on EC2 or whether blue/green deployment with ALB traffic rerouting to the original Auto Scaling group is required to meet both the zero-downtime and sub-90-second rollback constraints simultaneously.
Whether to use per-service CodeDeploy blue/green deployments with independent rollback groups — which atomically restore the prior known-good target group (satisfying both rollback initiation speed and clean rollback-destination state) — versus a shared rolling deployment that meets rollback initiation time but restores services to a partially-applied mixed state rather than a clean prior artifact version.
Whether to use per-service CodePipelines with CodeDeploy blue/green traffic shifting or a single shared CodePipeline with CodeDeploy rolling updates — determined by whether the rollback mechanism provides a deterministic sub-2-minute recovery time regardless of fleet size.
Whether to replicate artifacts using CodeArtifact cross-region replication plus ECR cross-region replication — preserving dependency-resolution graphs, package metadata, and image-scan results alongside binaries — or to consolidate onto S3 CRR, which replicates object bytes quickly enough to satisfy the RTO but cannot replicate the metadata layer required to satisfy the 5-minute RPO for full build reproducibility.
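The ECR half of that answer is a one-time, registry-level configuration. A sketch via boto3 (account ID and regions are placeholders):

```python
import boto3

ecr = boto3.client("ecr", region_name="us-east-1")

# Replicate every image pushed in us-east-1 to the DR region. Image bytes
# and tags replicate; note this does not cover CodeArtifact package
# metadata, which is the gap the S3 CRR option falls into as well.
ecr.put_replication_configuration(
    replicationConfiguration={
        "rules": [
            {"destinations": [{"region": "us-west-2", "registryId": "123456789012"}]}
        ]
    }
)
```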
Whether CloudFormation StackSets alone satisfies both a self-service provisioning RTO and a continuous drift-detection RPO, or whether two separate services — Service Catalog portfolios for RTO and AWS Config continuous evaluation for RPO — are required because StackSets drift detection is on-demand and provisioning is admin-gated.
Whether to enforce infrastructure parameter guardrails at authoring time (CDK constructs or CloudFormation modules, which harden defaults at synthesis or template-creation time) or at provisioning time (Service Catalog portfolios with template constraints, which block parameter overrides at the moment a developer initiates a stack deployment) — the scenario's 'no parameter override during provisioning' constraint disqualifies authoring-time approaches because a developer with template access can still override values at stack-create time.
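What the provisioning-time guardrail looks like: a Service Catalog template constraint whose CloudFormation rule restricts what a provisioning user can enter at launch. A sketch with hypothetical portfolio/product IDs, parameter name, and allowed values:

```python
import boto3
import json

sc = boto3.client("servicecatalog")

# The rule is evaluated when a developer provisions the product, so even a
# user who can see the template cannot supply an out-of-policy value.
rules = {
    "Rules": {
        "InstanceTypeRule": {
            "Assertions": [
                {
                    "Assert": {
                        "Fn::Contains": [["t3.micro", "t3.small"], {"Ref": "InstanceType"}]
                    },
                    "AssertDescription": "Instance type must be t3.micro or t3.small",
                }
            ]
        }
    }
}

sc.create_constraint(
    PortfolioId="port-abc123",
    ProductId="prod-abc123",
    Type="TEMPLATE",
    Parameters=json.dumps(rules),
)
```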
Whether detection latency (the 2-minute non-compliance visibility window, analogous to RPO) requires AWS Config change-triggered evaluation, independent of whether the chosen remediation path (analogous to RTO) can meet the 10-minute SLA.
Whether to retain EC2 Auto Scaling, which provides steady-state Multi-AZ HA but cannot deliver sub-60-second new-capacity provisioning during burst events, or migrate to Lambda fronted by API Gateway, which provides sub-second concurrent scaling that satisfies both the scale-out RTO and the zero-request-loss RPO in burst scenarios.
Whether to derive dashboard metrics through a continuous in-stream mechanism (CloudWatch Logs metric filter emitting a custom CloudWatch metric) or through a periodic batch query mechanism (CloudWatch Logs Insights scheduled query), given a hard sub-60-second dashboard freshness requirement and a lowest-operational-overhead constraint.
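The in-stream option in code: a metric filter emits the custom metric as log events arrive, so there is no query schedule to run or maintain. Log group, filter pattern, and metric names below are hypothetical:

```python
import boto3

logs = boto3.client("logs")

# Each matching log event increments the metric immediately, keeping a
# dashboard widget fresh within seconds rather than on a query interval.
logs.put_metric_filter(
    logGroupName="/app/orders",
    filterName="OrderFailures",
    filterPattern='{ $.level = "ERROR" && $.event = "order_failed" }',
    metricTransformations=[
        {
            "metricName": "OrderFailures",
            "metricNamespace": "App/Orders",
            "metricValue": "1",
            "defaultValue": 0,
        }
    ],
)
```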
Whether to use AWS Config continuous evaluation paired with a managed SSM Automation document via Config auto-remediation — satisfying both the detection-frequency constraint (no polling gap, RPO-equivalent) and the automated end-to-end remediation SLA (deterministic, code-free execution, RTO-equivalent) — versus a solution that captures compliance state reliably but routes remediation through a notification or custom-code path that cannot guarantee the time-bound automated fix.
Whether EventBridge's native retry policy plus a Lambda dead-letter queue satisfies the zero-event-loss delivery guarantee within the 90-second latency ceiling, or whether inserting SQS between EventBridge and Lambda is required to prevent event loss — recognizing that the SQS polling model trades deterministic push latency for buffered pull latency, potentially violating the 90-second processing window (RTO analog) while adding queue-management overhead the scenario does not justify.
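A sketch of the no-SQS answer, with placeholder names and ARNs: EventBridge's per-target retry policy keeps delivery push-based, and a dead-letter queue on the Lambda function catches anything that exhausts async retries, so nothing is silently dropped:

```python
import boto3

events = boto3.client("events")
awslambda = boto3.client("lambda")

# Bound retry age to the stated latency ceiling; delivery stays push-based.
events.put_targets(
    Rule="order-events",
    Targets=[
        {
            "Id": "process-order",
            "Arn": "arn:aws:lambda:us-east-1:123456789012:function:process-order",
            "RetryPolicy": {
                "MaximumRetryAttempts": 5,
                "MaximumEventAgeInSeconds": 90,
            },
        }
    ],
)

# Events that still fail land in a DLQ instead of being lost.
awslambda.update_function_configuration(
    FunctionName="process-order",
    DeadLetterConfig={
        "TargetArn": "arn:aws:sqs:us-east-1:123456789012:process-order-dlq"
    },
)
```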
Choose between AWS Config auto-remediation invoking a managed SSM Automation document (declarative, no custom code, purpose-built MTTR path) versus routing Config compliance-change events through EventBridge to a custom Lambda function (imperative, adds cold-start latency, custom IAM surface, and bespoke failure-handling code), where the stated MTTR constraint plus the zero-custom-code requirement together disqualify the Lambda path.
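The declarative path in one call, assuming the managed restricted-ssh Config rule and the AWS-managed automation document (rule name, role ARN, and attempt counts are placeholders to adapt):

```python
import boto3

config = boto3.client("config")

# Change-triggered rule evaluation covers the detection window; automatic
# remediation through a managed SSM document covers the MTTR with no
# custom code, cold starts, or bespoke failure handling.
config.put_remediation_configurations(
    RemediationConfigurations=[
        {
            "ConfigRuleName": "restricted-ssh",
            "TargetType": "SSM_DOCUMENT",
            "TargetId": "AWS-DisablePublicAccessForSecurityGroup",
            "Automatic": True,
            "MaximumAutomaticAttempts": 3,
            "RetryAttemptSeconds": 60,
            "Parameters": {
                "GroupId": {"ResourceValue": {"Value": "RESOURCE_ID"}},
                "AutomationAssumeRole": {
                    "StaticValue": {
                        "Values": ["arn:aws:iam::123456789012:role/config-remediation"]
                    }
                },
            },
        }
    ]
)
```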
Match the rollback tool to the exact failure layer — application deployment (CodeDeploy) versus infrastructure stack (CloudFormation) — given a hard five-minute RTO that infrastructure-layer rollback cannot satisfy for running ECS task sets.
Whether the stated RPO of 30 seconds can be met by RDS cross-region read replica replication lag or requires Aurora Global Database's physical storage replication, which provides RPO under one second regardless of write throughput.
Given explicit RPO (15 min) and RTO (30 min) constraints against a large Aurora dataset with a cost-efficiency preference, determine which DR tier — backup-restore or continuous replication with automated promotion — satisfies both targets simultaneously, and recognize that snapshot recency addresses RPO but does not bound restore duration for the RTO.