Over-Engineering — GCP Professional Cloud Architect (PCA)
You added unnecessary complexity: multi-region when single-region suffices, or a heavyweight service when a simpler one meets the requirements.
GKE Can Run It — That Doesn't Mean It Should
The candidate sees a container image and reaches for GKE. GKE is correct when the scenario specifies multi-container pods, custom node pools, stateful workloads, or operator-managed networking. If the scenario describes HTTP-triggered stateless request processing with no persistent volume and no mention of cluster configuration, GKE adds node management, upgrade windows, and autoscaler tuning with zero corresponding requirement. Cloud Run satisfies the workload and nothing more — which is what the exam rewards.
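The criteria above can be sketched as a rule of thumb. This is an illustrative helper, not a real API: the attribute names and the decision logic are assumptions that mirror the exam heuristic, not official Google guidance.

```python
# Illustrative GKE-vs-Cloud Run chooser based on the criteria above.
# Attribute names are hypothetical; any Kubernetes-specific capability
# justifies GKE, otherwise the simpler service that satisfies the
# requirement wins.
from dataclasses import dataclass

@dataclass
class Workload:
    stateless: bool = True
    http_triggered: bool = True
    needs_sidecars: bool = False            # multi-container pods
    needs_custom_node_pools: bool = False
    needs_persistent_volumes: bool = False  # stateful workloads
    needs_pod_networking: bool = False      # operator-managed networking

def pick_runtime(w: Workload) -> str:
    kubernetes_features = (
        w.needs_sidecars
        or w.needs_custom_node_pools
        or w.needs_persistent_volumes
        or w.needs_pod_networking
    )
    if kubernetes_features:
        return "GKE"
    if w.stateless and w.http_triggered:
        return "Cloud Run"
    return "GKE"  # non-HTTP or stateful: Cloud Run no longer fits

print(pick_runtime(Workload()))                               # Cloud Run
print(pick_runtime(Workload(needs_persistent_volumes=True)))  # GKE
```

On the exam, walking this checklist in order (Kubernetes-specific need first, then simplicity) resolves most GKE-versus-Cloud Run questions.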
The Scenario
The question describes a simple requirement — share files, send notifications, host a static site. You chose an answer with a multi-service architecture designed for scale, resilience, and extensibility. The correct answer uses one or two services. The exam penalizes unnecessary complexity because every additional service adds operational burden, cost, and failure surface area.
How to Spot It
- Match solution complexity to requirement complexity. A single Cloud Function is preferable to a GKE cluster for a job that runs once a day. A Cloud Storage bucket is preferable to a database for storing files that are never queried.
- Count the services in your chosen answer, then count the requirements in the question. If services outnumber requirements by 2x or more, you are over-engineering.
- Words like "small," "occasional," "simple," and "department" are anti-scale signals. The exam uses them to test whether you resist the urge to build for growth that was never mentioned.
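The service-count heuristic in the bullets above can be sketched in a few lines. The 2x ratio is the rule of thumb stated in the text, not an official metric:

```python
# Sketch of the service-count heuristic: flag an answer whose service
# count is at least double the question's requirement count.
def looks_over_engineered(num_services: int, num_requirements: int) -> bool:
    return num_services >= 2 * max(num_requirements, 1)

# "Share files for a small department" = 1 requirement.
print(looks_over_engineered(1, 1))  # False: a Cloud Storage bucket alone
print(looks_over_engineered(4, 1))  # True: a multi-service architecture
```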
Decision Rules
When the document type and extraction goal match a prebuilt Document AI processor, use the prebuilt processor rather than building a custom model via Vertex AI Training — the absence of labeled training data and the existence of a matching prebuilt capability are jointly sufficient to eliminate the custom-training path.
When the training workload fits a single multi-GPU node (well below 1,000 chips) and the only additional requirement is artifact registration, Vertex AI Training's managed custom jobs suffice; AI Hypercomputer's ultra-scale fabric is warranted only at the 1,000+ chip threshold that triggers its value proposition.
A scale-to-zero serverless tier (Cloud Run) achieves lower total monthly cost than a discount-committed fixed VM tier (a CUD on Compute Engine) when average utilization is low because the traffic pattern is bursty and time-bounded.
Apply a 1-year resource-based CUD to the stable production tier and Spot VMs to the interruptible batch tier, rather than consolidating both tiers under GKE Autopilot, which absorbs the per-tier pricing lever without delivering equivalent savings.
When a stateless HTTP workload is idle more than 50% of the time, scale-to-zero serverless eliminates idle compute cost entirely and outperforms any percentage-based discount applied to always-on instances; CUDs and SUDs are only optimal when utilization is continuous enough that committed or sustained cost beats zero-cost idle.
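The break-even behind the idle-time rule can be shown with a worked sketch. All prices here are hypothetical placeholders, not real GCP rates, and the 37% figure is a rough stand-in for a 1-year resource-based CUD discount:

```python
# Break-even sketch: scale-to-zero serverless vs. a committed always-on VM.
# All rates are hypothetical placeholders, not real GCP pricing.
vm_on_demand_monthly = 100.0             # always-on VM, list price
cud_discount = 0.37                      # placeholder 1-year CUD discount
serverless_full_busy_monthly = 120.0     # Cloud Run cost if busy 100% of the month

def monthly_cost_cud() -> float:
    # Committed use is paid regardless of idle time.
    return vm_on_demand_monthly * (1 - cud_discount)

def monthly_cost_serverless(busy_fraction: float) -> float:
    # Scale-to-zero: idle time costs nothing.
    return serverless_full_busy_monthly * busy_fraction

for busy in (0.10, 0.50, 0.90):
    print(f"busy={busy:.0%}: serverless={monthly_cost_serverless(busy):.0f}, "
          f"CUD={monthly_cost_cud():.0f}")
# Under these placeholder rates the curves cross near 52% busy:
# below it serverless wins, above it the commitment is cheaper.
```

The exact crossover shifts with real pricing, but the shape of the argument is the point: a flat discount cannot beat paying nothing for idle time once idle dominates the month.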
When a workload is already containerized, stateless, and HTTP-only, deploying directly to Cloud Run satisfies both the migration timeline and the no-cluster-management constraint; replatforming to GKE via Migrate for Anthos adds Kubernetes-native packaging that no requirement asked for.
GKE Autopilot's node automation is not enough for a stateless scale-to-zero HTTP workload when the team cannot absorb the Kubernetes control-plane operational model; Cloud Run is the correct abstraction level because no Kubernetes-specific capability (pod networking, stateful workloads, multi-container sidecars) is required.
When a containerized stateless HTTP workload has no pod networking, persistent volume, sidecar, or multi-container requirements, Cloud Run eliminates the Kubernetes control plane entirely — GKE Autopilot reduces node management but retains control plane overhead and Kubernetes API complexity that provide zero value for this workload class.