Performance Architecture — Azure Solutions Architect (AZ-305)
Three Latency Problems, Three Different Services
Three performance constraints resolve to three different services on AZ-305. 'Reduce latency for globally distributed static asset delivery' points to Azure CDN. 'Low-latency reads for high-frequency hot data' points to Azure Cache for Redis. 'Globally distributed writes with low latency' points to Azure Cosmos DB multi-region (active-active) writes. Azure Front Door layers intelligent global HTTP routing and WAF on top of CDN-style edge caching. Redis does not solve geographic distribution of static content, and a CDN does not handle mutable application state. Identify which performance layer is in scope before evaluating any service.
What This Pattern Tests
Azure performance architecture tests whether you match the scaling mechanism to the bottleneck. For AI-102, Azure ML managed online endpoints auto-scale on request latency or CPU, while batch endpoints handle offline scoring at lower cost. Azure OpenAI provisioned throughput units (PTUs) guarantee tokens-per-minute capacity for production workloads, while standard deployments use shared capacity subject to rate limits. For AZ-400, Azure Pipelines parallel jobs determine build throughput: the free tier grants 1 Microsoft-hosted parallel job with 1,800 minutes/month, while paid parallel jobs remove the minutes cap. Azure Load Testing identifies performance ceilings before deployment. The trap is adding more compute when the bottleneck is API rate limiting, or provisioning PTUs for a dev/test workload that should use standard deployment.
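The rate-limiting trap can be made concrete: when a model endpoint returns 429s, retrying with exponential backoff (or moving to provisioned throughput) is the fix, because more compute only produces more throttled requests. A minimal sketch against a hypothetical fake client (all names below are illustrative, not a real SDK):

```python
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 from a rate-limited API (hypothetical)."""

def call_with_backoff(fn, max_attempts=5, base_delay=0.01):
    """Retry fn with exponential backoff when the API signals throttling.

    Scaling out would not help here: the ceiling is the service's
    tokens-per-minute quota, not local CPU.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            time.sleep(base_delay * (2 ** attempt))  # back off; do not scale out
    raise RateLimitError("quota still exhausted after retries")

# Fake client: throttles twice, then succeeds.
calls = {"n": 0}
def fake_completion():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError()
    return "ok"

print(call_with_backoff(fake_completion))  # ok, after two throttled attempts
```

If sustained demand keeps exhausting the quota even with backoff, that is the signal to provision PTUs rather than to add retries.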
Decision Axis
The performance lever depends on where the bottleneck sits: compute scaling vs. throughput provisioning vs. rate-limit management vs. data partitioning.
Decision Rules
Whether to use Azure Cosmos DB with provisioned RU/s or a cost-tier-matched storage option (Azure Blob Storage cool tier or Azure Cosmos DB serverless) when access patterns are infrequent and do not demand sustained low-latency throughput.
Whether the stated latency SLA and access pattern (read-heavy, weekly update cadence, unstructured JSON) are fully satisfied by Azure Blob Storage plus Azure CDN edge caching, or whether adding Azure Cache for Redis provides measurable latency benefit that justifies the cache invalidation complexity and operational overhead it introduces.
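The invalidation complexity this rule weighs can be sketched with the cache-aside pattern. The `FakeRedis` class below is an in-memory stand-in for Azure Cache for Redis, and the keys, TTL, and backing store are illustrative:

```python
import time

class FakeRedis:
    """In-memory stand-in for Azure Cache for Redis (illustration only)."""
    def __init__(self):
        self._store = {}
    def get(self, key):
        value, expires = self._store.get(key, (None, 0.0))
        return value if time.time() < expires else None
    def setex(self, key, ttl, value):
        self._store[key] = (value, time.time() + ttl)
    def delete(self, key):
        self._store.pop(key, None)

cache = FakeRedis()
db = {"product:1": "v1"}  # stand-in for the backing store

def get_product(key):
    """Cache-aside read: try the cache, fall back to the store, populate."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = db[key]
    cache.setex(key, 60, value)
    return value

def update_product(key, value):
    """Every write path must remember to invalidate; that discipline is the
    operational overhead the decision rule weighs against."""
    db[key] = value
    cache.delete(key)  # forget this and readers see stale data for up to 60 s

print(get_product("product:1"))   # v1
update_product("product:1", "v2")
print(get_product("product:1"))   # v2, only because the write invalidated
```

For read-heavy content on a weekly update cadence, Blob Storage plus CDN gets the same latency win with TTL-based expiry alone and no write-path discipline to enforce.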
Whether to configure Azure Cosmos DB with provisioned fixed RU/s throughput or serverless mode when the access pattern is infrequent and batch-only and an explicit per-GB cost ceiling is stated alongside the latency SLA.
Whether to configure Azure Cosmos DB in serverless throughput mode versus manually provisioned RU/s when read access is infrequent and burst-shaped, because provisioned RU/s charges continuously regardless of actual consumption and delivers no additional latency benefit over serverless for sub-ten-queries-per-day access patterns.
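A back-of-envelope comparison makes the continuous-charge point concrete. The unit prices below are placeholders, not current Azure rates; the access-pattern numbers match the sub-ten-queries-per-day scenario:

```python
# Placeholder unit prices (illustrative only; check the Azure pricing page).
PROVISIONED_PER_100RU_HOUR = 0.008   # $/100 RU/s per hour (assumed)
SERVERLESS_PER_MILLION_RU = 0.25     # $/1M consumed RUs (assumed)
HOURS_PER_MONTH = 730

queries_per_day = 10          # infrequent, burst-shaped access
ru_per_query = 5              # point read of a small item
min_provisioned_ru = 400      # smallest provisionable throughput

# Provisioned RU/s bills every hour, consumed or not.
provisioned_monthly = (min_provisioned_ru / 100) * PROVISIONED_PER_100RU_HOUR * HOURS_PER_MONTH

# Serverless bills only the RUs actually consumed.
consumed_ru_monthly = queries_per_day * ru_per_query * 30
serverless_monthly = consumed_ru_monthly / 1_000_000 * SERVERLESS_PER_MILLION_RU

print(f"provisioned: ${provisioned_monthly:.2f}/mo")
print(f"serverless:  ${serverless_monthly:.6f}/mo")
```

Whatever the exact rates, the shape of the result holds: a floor of always-on RU/s dwarfs the cost of ~1,500 consumed RUs per month, and serverless point reads meet the same latency SLA.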
Choose Azure Service Bus over Azure Event Hubs when the communication pattern is transactional command messaging requiring per-session FIFO ordering and dead-letter handling, even when throughput is within both services' ranges.
Whether to select a transactional command-message broker with native session ordering and dead-letter support (Service Bus) or a high-scale streaming ingestion platform (Event Hubs) when the workload is a command pattern with variable load and a hard cost-efficiency constraint against idle throughput spend.
Whether Azure API Management's built-in throttling policies, response caching, and backpressure, combined with Azure Service Bus for asynchronous command decoupling, satisfy all throughput, latency, and delivery-guarantee constraints without adding operational surface the team cannot sustain, versus a custom rate limiter built on Azure Cache for Redis plus Azure Event Grid dispatch that looks performant but makes the team own cache invalidation sequencing, distributed counter consistency, and event dead-letter handling.
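The distributed-counter ownership the custom path forces can be sketched as a fixed-window limiter. The in-memory `FakeRedis` stands in for a shared cache, and the limit is an assumed policy:

```python
import time

class FakeRedis:
    """In-memory stand-in for a shared Redis counter (illustration only)."""
    def __init__(self):
        self._counts, self._expiry = {}, {}
    def incr(self, key):
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]
    def expire(self, key, ttl):
        self._expiry[key] = time.time() + ttl

r = FakeRedis()
LIMIT = 100  # requests per window (assumed policy)

def allow(client_id, window=60, now=None):
    """Fixed-window limiter: INCR a per-client counter, expire it per window.

    The hidden costs the decision rule flags: INCR and EXPIRE are two
    round-trips (a crash between them leaks a counter that never expires),
    windows reset abruptly (2x bursts straddling a boundary), and every
    gateway node must agree on one Redis, a shared failure point. APIM's
    built-in rate-limit policy hands all of this to the platform.
    """
    now = time.time() if now is None else now
    key = f"rl:{client_id}:{int(now // window)}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window)  # second round-trip: a crash here leaks the key
    return count <= LIMIT

print(all(allow("tenant-a", now=0) for _ in range(100)))  # True: within limit
print(allow("tenant-a", now=0))                           # False: request 101
```

Even this toy version already carries three failure modes the team would own in production; that is the operational surface the built-in policies absorb.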
Select the Azure messaging service whose native delivery semantics—session-scoped FIFO, dead-letter queues, at-least-once delivery—satisfy a command-message pattern at 5,000 msg/sec without provisioning a dedicated streaming cluster sized for orders-of-magnitude higher event volumes.
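A sizing sketch shows how far 5,000 msg/sec sits below a streaming cluster's scale, using Event Hubs' published standard-tier ingress limits per throughput unit (1 MB/s or 1,000 events/sec, whichever binds first); the 1 KB average message size is an assumption:

```python
import math

# Published Event Hubs standard-tier ingress limits per throughput unit (TU).
TU_EVENTS_PER_SEC = 1_000
TU_MB_PER_SEC = 1

msg_per_sec = 5_000   # command volume from the scenario
avg_msg_kb = 1        # assumed average message size

tus_by_count = math.ceil(msg_per_sec / TU_EVENTS_PER_SEC)
tus_by_bytes = math.ceil((msg_per_sec * avg_msg_kb / 1024) / TU_MB_PER_SEC)
tus_needed = max(tus_by_count, tus_by_bytes)

print(f"TUs needed: {tus_needed}")  # 5
```

Five TUs would carry the volume, but the cluster still lacks session-scoped FIFO and dead-letter queues, which Service Bus provides natively at this rate, so the streaming platform buys scale headroom the command pattern cannot use.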
Whether to satisfy the throughput-latency constraint by layering Azure Cache for Redis for read absorption and Azure API Management for throttle-based backpressure, or by over-provisioning dedicated compute nodes that incur a full database round-trip for every request regardless of cache eligibility.