Performance Architecture — Azure Solutions Architect (AZ-305)
Three Latency Problems, Three Different Services
Three performance constraints resolve to three different services on AZ-305. 'Reduce latency for globally distributed static asset delivery' points to Azure CDN. 'Low-latency reads for high-frequency hot data' points to Azure Cache for Redis. 'Globally distributed writes with low latency' points to Azure Cosmos DB multi-region (active-active) writes. Azure Front Door layers intelligent global HTTP routing and WAF on top of CDN-style edge caching. Redis does not solve geographic distribution of static content, and a CDN does not handle mutable application state. Identify which performance layer is in scope before evaluating any service.
What This Pattern Tests
Azure performance architecture tests whether you match the scaling mechanism to the bottleneck. For AI-102, Azure ML managed online endpoints auto-scale on request latency or CPU, while batch endpoints handle offline scoring at lower cost. Azure OpenAI provisioned throughput units (PTUs) guarantee tokens-per-minute capacity for production workloads, while standard deployments use shared capacity subject to rate limits. For AZ-400, Azure Pipelines parallel jobs determine build throughput: the free tier grants 1 Microsoft-hosted parallel job with 1,800 minutes/month, while paid parallel jobs remove the minutes cap. Azure Load Testing identifies performance ceilings before deployment. The trap is adding more compute when the bottleneck is API rate limiting, or provisioning PTUs for a dev/test workload that should use standard deployment.
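The rate-limiting trap can be made concrete: when a model endpoint returns 429s, retrying with exponential backoff (or moving to provisioned throughput) is the fix, because more compute only produces more throttled requests. A minimal sketch against a hypothetical fake client (all names below are illustrative, not a real SDK):

```python
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 from a rate-limited API (hypothetical)."""

def call_with_backoff(fn, max_attempts=5, base_delay=0.01):
    """Retry fn with exponential backoff when the API signals throttling.

    Scaling out would not help here: the ceiling is the service's
    tokens-per-minute quota, not local CPU.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            time.sleep(base_delay * (2 ** attempt))  # back off; do not scale out
    raise RateLimitError("quota still exhausted after retries")

# Fake client: throttles twice, then succeeds.
calls = {"n": 0}
def fake_completion():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError()
    return "ok"

print(call_with_backoff(fake_completion))  # ok, after two throttled attempts
```

If sustained demand keeps exhausting the quota even with backoff, that is the signal to provision PTUs rather than to add retries.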
Decision Axis
The performance lever depends on where the bottleneck sits: compute scaling vs. throughput provisioning vs. rate-limit management vs. data partitioning.
Decision Rules
Whether to use Azure Cosmos DB with provisioned RU/s or a cost-tier-matched storage option (Azure Blob Storage cool tier or Azure Cosmos DB serverless) when access patterns are infrequent and do not demand sustained low-latency throughput.
Whether the stated latency SLA and access pattern (read-heavy, weekly update cadence, unstructured JSON) are fully satisfied by Azure Blob Storage plus Azure CDN edge caching, or whether adding Azure Cache for Redis provides measurable latency benefit that justifies the cache invalidation complexity and operational overhead it introduces.
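The invalidation complexity this rule weighs can be sketched with the cache-aside pattern. The `FakeRedis` class below is an in-memory stand-in for Azure Cache for Redis, and the keys, TTL, and backing store are illustrative:

```python
import time

class FakeRedis:
    """In-memory stand-in for Azure Cache for Redis (illustration only)."""
    def __init__(self):
        self._store = {}
    def get(self, key):
        value, expires = self._store.get(key, (None, 0.0))
        return value if time.time() < expires else None
    def setex(self, key, ttl, value):
        self._store[key] = (value, time.time() + ttl)
    def delete(self, key):
        self._store.pop(key, None)

cache = FakeRedis()
db = {"product:1": "v1"}  # stand-in for the backing store

def get_product(key):
    """Cache-aside read: try the cache, fall back to the store, populate."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = db[key]
    cache.setex(key, 60, value)
    return value

def update_product(key, value):
    """Every write path must remember to invalidate; that discipline is the
    operational overhead the decision rule weighs against."""
    db[key] = value
    cache.delete(key)  # forget this and readers see stale data for up to 60 s

print(get_product("product:1"))   # v1
update_product("product:1", "v2")
print(get_product("product:1"))   # v2, only because the write invalidated
```

For read-heavy content on a weekly update cadence, Blob Storage plus CDN gets the same latency win with TTL-based expiry alone and no write-path discipline to enforce.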
Whether to configure Azure Cosmos DB with provisioned fixed RU/s throughput or serverless mode when the access pattern is infrequent and batch-only and an explicit per-GB cost ceiling is stated alongside the latency SLA.
Whether to configure Azure Cosmos DB in serverless throughput mode versus manually provisioned RU/s when read access is infrequent and burst-shaped, because provisioned RU/s charges continuously regardless of actual consumption and delivers no additional latency benefit over serverless for sub-ten-queries-per-day access patterns.
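A back-of-envelope comparison makes the continuous-charge point concrete. The unit prices below are placeholders, not current Azure rates; the access-pattern numbers match the sub-ten-queries-per-day scenario:

```python
# Placeholder unit prices (illustrative only; check the Azure pricing page).
PROVISIONED_PER_100RU_HOUR = 0.008   # $/100 RU/s per hour (assumed)
SERVERLESS_PER_MILLION_RU = 0.25     # $/1M consumed RUs (assumed)
HOURS_PER_MONTH = 730

queries_per_day = 10          # infrequent, burst-shaped access
ru_per_query = 5              # point read of a small item
min_provisioned_ru = 400      # smallest provisionable throughput

# Provisioned RU/s bills every hour, consumed or not.
provisioned_monthly = (min_provisioned_ru / 100) * PROVISIONED_PER_100RU_HOUR * HOURS_PER_MONTH

# Serverless bills only the RUs actually consumed.
consumed_ru_monthly = queries_per_day * ru_per_query * 30
serverless_monthly = consumed_ru_monthly / 1_000_000 * SERVERLESS_PER_MILLION_RU

print(f"provisioned: ${provisioned_monthly:.2f}/mo")
print(f"serverless:  ${serverless_monthly:.6f}/mo")
```

Whatever the exact rates, the shape of the result holds: a floor of always-on RU/s dwarfs the cost of ~1,500 consumed RUs per month, and serverless point reads meet the same latency SLA.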
Choose Azure Service Bus over Azure Event Hubs when the communication pattern is transactional command messaging requiring per-session FIFO ordering and dead-letter handling, even when throughput is within both services' ranges.
Whether to select a transactional command-message broker with native session ordering and dead-letter support (Service Bus) or a high-scale streaming ingestion platform (Event Hubs) when the workload is a command pattern with variable load and a hard cost-efficiency constraint against idle throughput spend.
Whether Azure API Management's built-in throttling policies, response caching, and backpressure, combined with Azure Service Bus for asynchronous command decoupling, satisfy all throughput, latency, and delivery-guarantee constraints without adding operational surface the team cannot sustain, versus a custom rate limiter built on Azure Cache for Redis plus Azure Event Grid dispatch that looks performant but makes the team own cache invalidation sequencing, distributed counter consistency, and event dead-letter handling.
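The distributed-counter ownership the custom path forces can be sketched as a fixed-window limiter. The in-memory `FakeRedis` stands in for a shared cache, and the limit is an assumed policy:

```python
import time

class FakeRedis:
    """In-memory stand-in for a shared Redis counter (illustration only)."""
    def __init__(self):
        self._counts, self._expiry = {}, {}
    def incr(self, key):
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]
    def expire(self, key, ttl):
        self._expiry[key] = time.time() + ttl

r = FakeRedis()
LIMIT = 100  # requests per window (assumed policy)

def allow(client_id, window=60, now=None):
    """Fixed-window limiter: INCR a per-client counter, expire it per window.

    The hidden costs the decision rule flags: INCR and EXPIRE are two
    round-trips (a crash between them leaks a counter that never expires),
    windows reset abruptly (2x bursts straddling a boundary), and every
    gateway node must agree on one Redis, a shared failure point. APIM's
    built-in rate-limit policy hands all of this to the platform.
    """
    now = time.time() if now is None else now
    key = f"rl:{client_id}:{int(now // window)}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window)  # second round-trip: a crash here leaks the key
    return count <= LIMIT

print(all(allow("tenant-a", now=0) for _ in range(100)))  # True: within limit
print(allow("tenant-a", now=0))                           # False: request 101
```

Even this toy version already carries three failure modes the team would own in production; that is the operational surface the built-in policies absorb.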
Select the Azure messaging service whose native delivery semantics—session-scoped FIFO, dead-letter queues, at-least-once delivery—satisfy a command-message pattern at 5,000 msg/sec without provisioning a dedicated streaming cluster sized for orders-of-magnitude higher event volumes.
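A sizing sketch shows how far 5,000 msg/sec sits below a streaming cluster's scale, using Event Hubs' published standard-tier ingress limits per throughput unit (1 MB/s or 1,000 events/sec, whichever binds first); the 1 KB average message size is an assumption:

```python
import math

# Published Event Hubs standard-tier ingress limits per throughput unit (TU).
TU_EVENTS_PER_SEC = 1_000
TU_MB_PER_SEC = 1

msg_per_sec = 5_000   # command volume from the scenario
avg_msg_kb = 1        # assumed average message size

tus_by_count = math.ceil(msg_per_sec / TU_EVENTS_PER_SEC)
tus_by_bytes = math.ceil((msg_per_sec * avg_msg_kb / 1024) / TU_MB_PER_SEC)
tus_needed = max(tus_by_count, tus_by_bytes)

print(f"TUs needed: {tus_needed}")  # 5
```

Five TUs would carry the volume, but the cluster still lacks session-scoped FIFO and dead-letter queues, which Service Bus provides natively at this rate, so the streaming platform buys scale headroom the command pattern cannot use.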
Whether to satisfy the throughput-latency constraint by layering Azure Cache for Redis for read absorption and Azure API Management for throttle-based backpressure, or by over-provisioning dedicated compute nodes that incur a full database round-trip for every request regardless of cache eligibility.