Observability Blind Spot — GCP Professional Cloud Architect (PCA)

You missed a monitoring or logging requirement. The exam tests whether you know what to observe, not just what to build.

Monitoring Is Not Tracing — The Exam Knows the Difference

The scenario describes intermittent latency spikes between two microservices. The instinct is to stand up a Cloud Monitoring dashboard. Monitoring surfaces infrastructure metrics — CPU, memory, request count — but cannot show where time is spent across a multi-hop request chain. Cloud Trace captures end-to-end request timelines and per-span latency breakdowns. Selecting Monitoring produces an aggregated rate with no causal signal; selecting Trace produces the distributed timeline the scenario actually requires.
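The difference can be made concrete with a toy example (toy data and names, not the Cloud Trace API): an aggregated latency metric reports one number per request, while per-span data attributes that time to a specific hop.

```python
from dataclasses import dataclass

@dataclass
class Span:
    name: str        # the hop this span covers
    start_ms: float  # offset from request start
    end_ms: float

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms

# Sequential hops of one traced request through three services.
trace = [
    Span("frontend -> checkout", 0, 30),
    Span("checkout -> payments", 30, 60),
    Span("payments.charge", 60, 460),   # the hidden bottleneck
    Span("response path", 460, 480),
]

# A monitoring metric sees only the aggregate: one 480 ms request.
total_latency_ms = trace[-1].end_ms - trace[0].start_ms

# Trace data ranks hops by time spent, exposing the slow one.
slowest = max(trace, key=lambda s: s.duration_ms)
```

The aggregate (480 ms) carries no causal signal; the per-span view immediately points at `payments.charge`.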

6% of exam questions affected (12 of 200)

The Scenario

The application has slow responses. Per-service monitoring shows all services are within normal parameters — CPU low, memory fine, no errors. You recommended more monitoring dashboards. The correct answer is distributed tracing that correlates a single request across all services. The bottleneck is at a service boundary — a downstream call, a queue delay, or a database query — invisible to per-service metrics.

How to Spot It

  • Match the observability tool to the diagnostic question. Infrastructure metrics answer "is the hardware overloaded?" Logs answer "what error occurred?" Distributed traces answer "where does the request spend its time?"
  • Distributed architectures create observability gaps at every service boundary. If the scenario describes a multi-service application with unexplained latency, the answer is always distributed tracing, not more dashboards.
  • When the scenario says "diagnose," "troubleshoot," or "identify root cause," basic monitoring with thresholds and alarms is insufficient. These words signal the need for request-level correlation and dependency analysis.
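The heuristics above amount to a lookup from diagnostic question (and signal words) to tool. A minimal sketch, with the mapping taken directly from the bullets:

```python
# Question-to-tool mapping from the heuristics above.
TOOL_FOR_QUESTION = {
    "is the hardware overloaded?": "Cloud Monitoring",
    "what error occurred?": "Cloud Logging",
    "where does the request spend its time?": "Cloud Trace",
}

def pick_tool(signal_words: set[str]) -> str:
    """Return the observability tool the scenario wording points to."""
    # "diagnose", "troubleshoot", "root cause" signal request-level
    # correlation, i.e. distributed tracing, not more dashboards.
    if signal_words & {"diagnose", "troubleshoot", "root cause"}:
        return "Cloud Trace"
    return "Cloud Monitoring"
```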

Decision Rules

When inter-service p99 latency spikes with no change in error rate, the correct observability layer is distributed trace correlation (Cloud Trace), not structured log search (Cloud Logging), because only trace span data reconstructs the full call graph and exposes the slowest hop automatically.

Cloud Trace · Cloud Logging · Cloud Monitoring

When a p99 latency regression spans a multi-service call chain with no corresponding error rate change, the architect must select Cloud Trace — which correlates distributed spans across every service boundary and ranks hops by latency contribution — rather than Cloud Logging, which cannot reconstruct the cross-service timing graph without manual trace-header joins and yields no ranked latency breakdown by hop.

Cloud Trace · Cloud Logging · Cloud Monitoring
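The "reconstruct the call graph and rank hops" step can be sketched with toy span records (the tuple layout and service names are illustrative assumptions, not the Cloud Trace data model): correlate spans by parent ID, then rank each hop by its self time — its duration minus time spent in its children.

```python
from collections import defaultdict

# (span_id, parent_id, service, duration_ms) for one trace.
spans = [
    ("a", None, "frontend", 500),
    ("b", "a", "cart", 450),
    ("c", "b", "pricing", 60),
    ("d", "b", "inventory", 370),  # hidden bottleneck
]

# Rebuild the call graph from parent links.
children = defaultdict(list)
for span_id, parent, _, _ in spans:
    if parent is not None:
        children[parent].append(span_id)

durations = {s[0]: s[3] for s in spans}

# Self time = own duration minus children's durations.
self_time = {
    s[0]: s[3] - sum(durations[c] for c in children[s[0]])
    for s in spans
}

# Rank hops by their own latency contribution.
ranked = sorted(spans, key=lambda s: self_time[s[0]], reverse=True)
```

Per-service metrics would show `frontend` and `cart` as slow only because they wait on `inventory`; self-time ranking attributes the latency to the hop that actually spends it.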

When the diagnostic question is "which code path within a single service is consuming excess CPU," Cloud Profiler is required because it provides flame-graph attribution at the function level. Cloud Monitoring can confirm that CPU is elevated but cannot attribute utilization to a specific call site, making it the wrong tool for causal, code-scoped root-cause analysis.

Cloud Profiler · Cloud Monitoring · Error Reporting
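Function-level attribution can be demonstrated locally with Python's stdlib `cProfile` (a deterministic profiler, not Cloud Profiler's low-overhead sampling approach; the function names here are invented for illustration): an aggregate CPU metric would only say "the process is busy," while the profile names the call site.

```python
import cProfile
import pstats

def cheap():
    return sum(range(1_000))

def hot_path():
    # The function-level culprit a flame graph would surface.
    return sum(i * i for i in range(200_000))

def handler():
    cheap()
    hot_path()

prof = cProfile.Profile()
prof.enable()
handler()
prof.disable()

# stats.stats maps (file, line, func) -> (cc, nc, tottime, cumtime, callers).
stats = pstats.Stats(prof)
cumulative = {key[2]: row[3] for key, row in stats.stats.items()}
```

Sorting `cumulative` shows `hot_path` dominating `cheap`, which is exactly the attribution an infrastructure CPU graph cannot provide.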

Domain Coverage

Ensuring Solution and Operations Reliability

Difficulty Breakdown

Medium: 4 · Expert: 8

Related Patterns