Performance Architecture — Azure DevOps Engineer (AZ-400)
Latency at the Edge vs. Latency at the Origin
Requirement: serve static assets to users in Southeast Asia and Europe with sub-50ms load time, while routing API traffic to the nearest healthy backend. Competing services: Azure CDN for static delivery, Azure Front Door for global load balancing with health probes. The deciding constraint is where the latency problem lives—at the edge (CDN solves it) or at the origin routing layer (Front Door solves it). Most production architectures need both; the exam scenario will specify which constraint is currently unsatisfied.
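Before choosing between the two services, it helps to establish empirically where the latency lives. Below is a minimal sketch: probe time-to-first-byte from a client in the affected region against both the edge endpoint and the origin. Both URLs are hypothetical placeholders, not real endpoints.

```python
# Sketch: measure time-to-first-byte (TTFB) against an edge (CDN) URL and
# the origin URL to locate where the latency problem lives.
import time
import urllib.request

EDGE_URL = "https://contoso.azureedge.net/assets/app.css"                # hypothetical CDN endpoint
ORIGIN_URL = "https://contoso-origin.azurewebsites.net/assets/app.css"   # hypothetical origin

def ttfb_ms(url: str, samples: int = 5) -> float:
    """Median time-to-first-byte in milliseconds over several samples."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read(1)  # first byte only; captures connection setup + origin work
        timings.append((time.perf_counter() - start) * 1000)
    timings.sort()
    return timings[len(timings) // 2]

edge, origin = ttfb_ms(EDGE_URL), ttfb_ms(ORIGIN_URL)
print(f"edge TTFB:   {edge:.1f} ms")
print(f"origin TTFB: {origin:.1f} ms")
# Low edge TTFB but high origin TTFB points at origin routing (Front Door
# territory); high TTFB for static assets from distant regions points at
# missing edge caching (CDN territory).
```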
What This Pattern Tests
Azure performance architecture tests whether you match the scaling mechanism to the bottleneck. For AI-102, Azure ML managed online endpoints auto-scale on request latency or CPU, while batch endpoints handle offline scoring at lower cost; Azure OpenAI provisioned throughput units (PTUs) reserve tokens-per-minute capacity for production workloads, while standard deployment uses shared capacity subject to rate limits. For AZ-400, Azure Pipelines parallel jobs determine build throughput: the free Microsoft-hosted tier gives 1 parallel job with 1,800 minutes/month, while paid parallel jobs remove the monthly minutes cap. Azure Load Testing identifies performance ceilings before deployment. The trap is adding more compute when the bottleneck is API rate limiting, or provisioning PTUs for a dev/test workload that should use standard deployment.
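To make the rate-limit trap concrete, here is a hedged sketch of a 429-aware caller against an Azure OpenAI standard deployment. The endpoint, deployment name, key, and api-version are placeholders; the point is the retry logic, not the URL shape. Scaling out callers only multiplies 429s — the fix is pacing (shown here) or moving to PTUs.

```python
# Sketch: handle Azure OpenAI standard-deployment rate limits (HTTP 429)
# with Retry-After-aware backoff. Endpoint/deployment/key are placeholders.
import time
import requests  # third-party: pip install requests

ENDPOINT = "https://contoso.openai.azure.com"   # hypothetical resource
URL = f"{ENDPOINT}/openai/deployments/gpt-4o/chat/completions?api-version=2024-06-01"
HEADERS = {"api-key": "<key>", "Content-Type": "application/json"}

def call_with_backoff(payload: dict, max_retries: int = 5) -> dict:
    """Retry on 429 using the service's Retry-After hint. Adding more
    callers would only multiply 429s; the lever is pacing or PTUs."""
    for attempt in range(max_retries):
        resp = requests.post(URL, headers=HEADERS, json=payload, timeout=60)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        # Honor the server's hint if present, else exponential backoff.
        wait = float(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    raise RuntimeError("rate limit persisted; consider provisioned throughput (PTUs)")
```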
Decision Axis
The right performance lever depends on the service and where its bottleneck lives: compute scaling vs. throughput provisioning vs. rate limit management vs. data partitioning. The sketch below maps common symptoms to levers.
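As a study aid, the axis can be written down as a lookup from observed bottleneck to lever. The symptom labels are illustrative, not an official taxonomy.

```python
# Sketch: the decision axis as a lookup -- match the observed bottleneck
# to the performance lever, not the other way around.
LEVER_BY_BOTTLENECK = {
    "cpu_or_memory_saturation":   "compute scaling (scale the endpoint or agent pool out/up)",
    "tokens_per_minute_429s":     "throughput provisioning (Azure OpenAI PTUs)",
    "api_rate_limiting":          "rate limit management (backoff, pacing, quota increase)",
    "hot_partition_or_slow_scan": "data partitioning (repartition on a higher-cardinality key)",
    "queued_builds":              "parallel jobs (buy Microsoft-hosted jobs or add self-hosted agents)",
}

def recommend(bottleneck: str) -> str:
    return LEVER_BY_BOTTLENECK.get(bottleneck, "profile first; do not scale blind")

print(recommend("api_rate_limiting"))
```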
Associated Traps
Adding compute when the bottleneck is an API rate limit, and provisioning PTUs for a dev/test workload that standard deployment serves more cheaply: both scale the wrong axis.
Decision Rules
Whether to expand telemetry collection by raising Container Insights granularity and disabling Application Insights adaptive sampling to capture every request (over-provisioning that busts the ingestion budget), or to write targeted KQL queries joining the existing Perf, KubePodInventory, requests, and dependencies tables to correlate P99 latency with infrastructure metrics within the current ingestion commitment.
Whether to broaden Azure Monitor diagnostic-settings ingestion across additional telemetry categories to maximize data availability, or to author a targeted KQL query joining the AppRequests and Perf tables already present in the shared Log Analytics workspace to achieve the required correlation within the current spend envelope; the targeted-query option is sketched below.
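A minimal sketch of the targeted-query option, using the azure-monitor-query Python SDK (pip install azure-monitor-query azure-identity). The workspace ID is a placeholder; AppRequests assumes workspace-based Application Insights, and the Perf filter assumes the Container Insights node CPU counter, so adjust table and counter names to what your workspace actually holds.

```python
# Sketch: correlate request P99 latency with node CPU in 5-minute bins
# using only tables already in the workspace -- no new diagnostic
# categories, no sampling changes, no extra ingestion.
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

WORKSPACE_ID = "<log-analytics-workspace-guid>"  # placeholder

QUERY = """
let p99 = AppRequests
    | summarize P99DurationMs = percentile(DurationMs, 99) by bin(TimeGenerated, 5m);
let cpu = Perf
    | where ObjectName == "K8SNode" and CounterName == "cpuUsageNanoCores"
    | summarize AvgNodeCpu = avg(CounterValue) by bin(TimeGenerated, 5m);
p99
| join kind=inner (cpu) on TimeGenerated
| order by TimeGenerated asc
"""

client = LogsQueryClient(DefaultAzureCredential())
result = client.query_workspace(WORKSPACE_ID, QUERY, timespan=timedelta(hours=24))
for row in result.tables[0].rows:
    print(row)
```

If the joined series shows P99 spikes without matching CPU pressure, the bottleneck is elsewhere (dependencies, rate limits), which is exactly the distinction the decision rule is testing.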