Observability Blind Spot — AWS SysOps Administrator (SOA-C03)

You missed a monitoring or logging requirement. The exam tests whether you know what to observe, not just what to build.

Visibility Isn't the Same as Root Cause

Requirement: diagnose latency spikes across a distributed microservices application. Competing tools: CloudWatch metrics, CloudWatch Logs, X-Ray. The deciding constraint is trace depth. CloudWatch tells you that latency increased; X-Ray shows you which service in the call chain introduced it and why. The exam credits the tool that closes the gap between symptom and origin — not the one that first surfaces the anomaly.

10% of exam questions affected (19 of 200)

The Scenario

A serverless application built on API Gateway, Lambda, and DynamoDB returns slow responses. CloudWatch shows Lambda duration averaging 5 seconds with no errors. You recommend CloudWatch alarms on duration metrics. The correct answer is to enable X-Ray tracing to identify which downstream call is slow. X-Ray traces the full request: API Gateway routing (200 ms), Lambda cold start (800 ms), DynamoDB query (3,200 ms), response serialization (800 ms), which together account for the 5 seconds. The bottleneck is a DynamoDB Scan operation that should be a Query. CloudWatch metrics tell you Lambda is slow; X-Ray tells you why. The scenario asked you to "diagnose latency issues", which requires request-level tracing, not service-level metrics.
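The arithmetic behind that conclusion can be sketched in a few lines. This is only an illustration using the segment timings from the scenario above; the dictionary layout and the `find_bottleneck` helper are hypothetical and are not an X-Ray API.

```python
# Segment timings copied from the scenario above, in milliseconds.
segments_ms = {
    "API Gateway routing": 200,
    "Lambda cold start": 800,
    "DynamoDB query": 3200,
    "Response serialization": 800,
}


def find_bottleneck(segments):
    """Return (name, share) for the segment with the largest share of total time."""
    total = sum(segments.values())
    name = max(segments, key=segments.get)
    return name, segments[name] / total


total_ms = sum(segments_ms.values())
name, share = find_bottleneck(segments_ms)
print(f"total={total_ms} ms, bottleneck={name} ({share:.0%})")
# prints: total=5000 ms, bottleneck=DynamoDB query (64%)
```

A duration metric only ever reports the 5,000 ms total; the per-segment breakdown is what a trace adds.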

How to Spot It

  • CloudWatch Metrics shows aggregated health (error rates, duration, throttles). CloudWatch Logs shows individual event details (error messages, stack traces). X-Ray shows request flow across services (where time is spent, which service is the bottleneck). Match the diagnostic tool to the question: "Is it broken?" = Metrics. "What happened?" = Logs. "Where is the bottleneck?" = X-Ray.
  • CloudWatch Container Insights for ECS/EKS provides CPU, memory, and network metrics per container. But if the scenario asks why inter-service calls are slow, Container Insights shows resource utilization, not request-level latency. You need X-Ray or CloudWatch Application Signals for request tracing across services.
  • Distributed architectures (a Lambda function writing to SQS, a second Lambda function consuming the queue and calling DynamoDB) create blind spots at every service boundary. Per-service metrics cannot show that the bottleneck is in the SQS consumer rather than the producer. The exam tests whether you recognize when distributed tracing is required.
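When the answer is distributed tracing, turning it on for a Lambda function is a one-field configuration change. The sketch below assumes boto3 and a hypothetical function name `orders-handler`. It builds the arguments for Lambda's UpdateFunctionConfiguration call separately so they can be inspected without AWS access, and it assumes the function's execution role already has permission to write to X-Ray.

```python
def tracing_update_kwargs(function_name):
    """Build UpdateFunctionConfiguration arguments that switch on active tracing."""
    return {
        "FunctionName": function_name,
        "TracingConfig": {"Mode": "Active"},
    }


def enable_active_tracing(function_name):
    """Apply the change. Needs boto3 and AWS credentials; not called in this sketch."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    client = boto3.client("lambda")
    return client.update_function_configuration(**tracing_update_kwargs(function_name))


kwargs = tracing_update_kwargs("orders-handler")  # hypothetical function name
print(kwargs)
```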

Decision Rules

Select the continuous-compliance evaluation layer (AWS Config managed rule with SSM Automation auto-remediation) over an event-log layer (AWS CloudTrail with EventBridge alerting) when the requirement is ongoing resource-state detection plus automated enforcement rather than point-in-time API capture.

AWS Config · AWS CloudTrail · AWS Systems Manager
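As a rough sketch of what the continuous-compliance layer looks like in practice, the following builds the arguments for AWS Config's PutConfigRule and PutRemediationConfiguration calls. The specific managed rule (S3 bucket encryption), the SSM Automation document, the rule name, and the role ARN are illustrative assumptions; the rule above does not name a particular resource type.

```python
RULE_NAME = "s3-bucket-sse-enabled"  # hypothetical rule name


def config_rule_kwargs():
    """Arguments for PutConfigRule using an AWS managed rule (assumed example)."""
    return {
        "ConfigRule": {
            "ConfigRuleName": RULE_NAME,
            "Source": {
                "Owner": "AWS",
                "SourceIdentifier": "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED",
            },
        }
    }


def remediation_kwargs(automation_role_arn):
    """Arguments for PutRemediationConfiguration with automatic remediation."""
    return {
        "RemediationConfigurations": [
            {
                "ConfigRuleName": RULE_NAME,
                "TargetType": "SSM_DOCUMENT",
                "TargetId": "AWS-EnableS3BucketEncryption",  # assumed document
                "Automatic": True,
                "MaximumAutomaticAttempts": 3,
                "RetryAttemptSeconds": 60,
                "Parameters": {
                    "AutomationAssumeRole": {
                        "StaticValue": {"Values": [automation_role_arn]}
                    },
                    # RESOURCE_ID is replaced by the noncompliant resource's ID.
                    "BucketName": {"ResourceValue": {"Value": "RESOURCE_ID"}},
                },
            }
        ]
    }


def apply(automation_role_arn):
    """Create the rule and its remediation. Needs boto3 and credentials; not called here."""
    import boto3

    config = boto3.client("config")
    config.put_config_rule(**config_rule_kwargs())
    config.put_remediation_configuration(**remediation_kwargs(automation_role_arn))


# Hypothetical role ARN, used only to show the resulting structure.
remediation = remediation_kwargs("arn:aws:iam::111122223333:role/ConfigRemediation")
```

The point of the design is that Config re-evaluates resource state continuously and triggers the fix itself, whereas a CloudTrail plus EventBridge path only reacts to the API call at the moment it happens.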

Select the alarm source (native managed-service metric versus log-derived metric filter) whose detection latency fits within the stated RTO. Native ELB metrics win because they are published to CloudWatch at 60-second intervals with no log ingestion lag, while a Logs metric filter cannot alarm until access log delivery completes.

Amazon CloudWatch · Elastic Load Balancing · Amazon Route 53

Select the alarm source—native Route 53 health check metric vs. log-derived CloudWatch metric filter—that surfaces a failure signal within the 90-second RTO, given that log ingestion pipeline latency alone can consume 60–300 seconds before a filter alarm transitions to ALARM state.

Amazon CloudWatch · Elastic Load Balancing · Amazon Route 53
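The RTO comparison is simple arithmetic, sketched below with the 90-second RTO and the 60–300 second ingestion lag quoted above. The 60-second alarm period, the single evaluation period, and the zero source lag for the native metric are simplifying assumptions for illustration, not figures from the rule.

```python
RTO_S = 90                        # from the rule above
LOG_INGESTION_LAG_S = (60, 300)   # range quoted in the rule above
ALARM_PERIOD_S = 60               # assumption: one 60-second evaluation period


def detection_s(source_lag_s, period_s=ALARM_PERIOD_S, evaluation_periods=1):
    """Time from failure to ALARM: source lag plus the alarm evaluation window."""
    return source_lag_s + period_s * evaluation_periods


candidates = {
    "native metric": detection_s(0),  # assumption: no extra lag at the source
    "log filter (best case)": detection_s(LOG_INGESTION_LAG_S[0]),
    "log filter (worst case)": detection_s(LOG_INGESTION_LAG_S[1]),
}

for label, latency in candidates.items():
    verdict = "fits" if latency <= RTO_S else "misses"
    print(f"{label}: {latency}s {verdict} the {RTO_S}s RTO")
```

Under these assumptions even the best-case log path (120 seconds) overshoots the RTO, so the native metric is the only option that qualifies.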

Select an alarm on a native CloudWatch metric sourced directly from ELB (HealthyHostCount) over AWS Health event notifications. HealthyHostCount surfaces application-layer target failures within one metric emission period; AWS Health covers only AWS-managed infrastructure events and produces no signal when the AWS infrastructure is healthy but the application is not.

Amazon CloudWatch · Elastic Load Balancing · AWS Health
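A minimal sketch of such an alarm, assuming an Application Load Balancer and boto3's CloudWatch PutMetricAlarm call. The alarm name, dimension values, SNS topic ARN, threshold, and period are placeholders chosen for illustration.

```python
def healthy_host_alarm_kwargs(load_balancer, target_group, sns_topic_arn):
    """Arguments for PutMetricAlarm: alarm when no healthy targets remain."""
    return {
        "AlarmName": "alb-no-healthy-targets",  # hypothetical name
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "HealthyHostCount",
        "Dimensions": [
            {"Name": "LoadBalancer", "Value": load_balancer},
            {"Name": "TargetGroup", "Value": target_group},
        ],
        "Statistic": "Minimum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "LessThanThreshold",
        "TreatMissingData": "breaching",  # assumption: treat silence as failure
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(load_balancer, target_group, sns_topic_arn):
    """Create the alarm. Needs boto3 and credentials; not called in this sketch."""
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(
        **healthy_host_alarm_kwargs(load_balancer, target_group, sns_topic_arn)
    )


# Placeholder identifiers, used only to show the resulting structure.
alarm = healthy_host_alarm_kwargs(
    "app/my-alb/0123456789abcdef",
    "targetgroup/my-targets/0123456789abcdef",
    "arn:aws:sns:us-east-1:111122223333:ops-alerts",
)
```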

Select a CloudWatch Logs metric filter on the existing CloudTrail log group, with the pattern { $.userIdentity.type = "Root" } emitting a custom metric consumed by a CloudWatch alarm, over enabling a separate detection service such as GuardDuty or AWS Config, which would add latency and cost beyond what the already-delivered log stream requires.

Amazon CloudWatch Logs · Amazon CloudWatch · AWS CloudTrail
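The metric filter itself is small. The sketch below builds arguments for the CloudWatch Logs PutMetricFilter call and adds a local helper that mimics the pattern on sample events. The filter name, metric name, namespace, and log group name are hypothetical, and `is_root_event` is only a stand-in for how the service evaluates the pattern.

```python
ROOT_PATTERN = '{ $.userIdentity.type = "Root" }'


def root_metric_filter_kwargs(log_group_name):
    """Arguments for PutMetricFilter: count CloudTrail events made by the root user."""
    return {
        "logGroupName": log_group_name,
        "filterName": "root-account-usage",          # hypothetical name
        "filterPattern": ROOT_PATTERN,
        "metricTransformations": [
            {
                "metricName": "RootAccountUsage",    # hypothetical metric
                "metricNamespace": "Security",       # hypothetical namespace
                "metricValue": "1",
            }
        ],
    }


def is_root_event(event):
    """Local approximation of the pattern, for reasoning about sample events."""
    return event.get("userIdentity", {}).get("type") == "Root"


def create_filter(log_group_name):
    """Create the filter. Needs boto3 and credentials; not called in this sketch."""
    import boto3

    logs = boto3.client("logs")
    logs.put_metric_filter(**root_metric_filter_kwargs(log_group_name))


filter_kwargs = root_metric_filter_kwargs("cloudtrail-logs")  # hypothetical log group
sample_events = [
    {"userIdentity": {"type": "Root"}, "eventName": "ConsoleLogin"},
    {"userIdentity": {"type": "IAMUser"}, "eventName": "ConsoleLogin"},
]
matches = [is_root_event(e) for e in sample_events]
print(matches)  # prints: [True, False]
```

A CloudWatch alarm on the resulting metric (for example, Sum >= 1 over one period) completes the path from log event to notification.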

Domain Coverage

Monitoring, Logging, Analysis, Remediation, and Performance Optimization · Security and Compliance

Difficulty Breakdown

Medium: 15 · Hard: 4

Related Patterns