Resilience Architecture — Azure Fundamentals (AZ-900)
Availability Zones and Traffic Manager Solve Different Problems
The scenario describes a need for global failover when an Azure region becomes unavailable. Candidates default to Azure Load Balancer because it distributes traffic. But Load Balancer operates within a region — it cannot route across regional boundaries. Traffic Manager performs DNS-based routing across geographically distributed endpoints. When the failure domain is a region rather than a server or availability zone, the answer space shifts entirely.
What This Pattern Tests
Resilience questions test whether you match the availability design to the stated SLA. Over-designing wastes money (multi-region for a 99.9% target). Under-designing risks downtime (single-AZ for a 99.99% target). The exam penalizes both.
Decision Axis
Stated SLA determines resilience architecture. Design to the requirement, not to the maximum.
Associated Traps
More Top Traps on This Exam
Decision Rules
Select Azure Cost Management budgets (active threshold alerting) over Azure Tags (passive metadata labeling) when the stated requirement is proactive notification upon exceeding a defined spend limit.
When the requirement is idempotent, repeatable, drift-preventing infrastructure provisioning, choose ARM Templates (declarative IaC) rather than Azure Arc (hybrid management plane) or imperative tools such as Azure CLI or Azure PowerShell.
Whether the scenario's primary need is proactive best-practice guidance surfaced before issues arise (Azure Advisor) or reactive threshold-based alerting on live resource metrics and logs (Azure Monitor).
Select the Azure cost tool that performs a pre-migration total-cost comparison between on-premises and Azure, not one that estimates Azure-only deployment costs.
Choose Azure Arc when the requirement is to bring existing on-premises or non-Azure servers under Azure management governance; choose ARM Templates when the requirement is declarative idempotent deployment of Azure-hosted resources — these are mutually exclusive scopes.
Does the scenario require proactive best-practice recommendations surfaced before incidents occur (Azure Advisor) or reactive resource-level metric and log alerting triggered after a condition is met (Azure Monitor)?
Whether the scenario requires a cost comparison between on-premises infrastructure and Azure (TCO Calculator) or an estimate of Azure service costs in isolation (Pricing Calculator), determined by the presence of an on-premises baseline and a financial justification requirement.
Whether to use Azure Arc, which extends the Azure management plane to on-premises servers for unified drift-free governance, or ARM Templates, which provides idempotent Azure-native deployments but cannot enroll or govern pre-existing on-premises servers.
Whether Azure Advisor (proactive resource-configuration-based best-practice recommendations) or Azure Service Health (platform-level event and maintenance notification) satisfies a requirement to receive actionable optimization guidance before failures occur.
Select Azure TCO Calculator — not Azure Pricing Calculator — when the requirement is a pre-migration cost comparison between on-premises infrastructure and Azure; TCO Calculator ingests on-premises workload inputs to generate a side-by-side financial business case, whereas Pricing Calculator estimates Azure-only service costs for configurations that are already known and fully defined within Azure.
Choose Azure Arc when the dominant requirement is extending Azure's ongoing management control plane (policy, monitoring, governance) to pre-existing on-premises or multi-cloud servers; choose ARM Templates when the requirement is idempotent declarative deployment of new Azure resources. The two roles are complementary but not interchangeable — Arc provides hybrid-reach governance; ARM Templates provide drift-free provisioning within Azure.
Which Azure monitoring tool proactively evaluates deployed resource configurations and surfaces best-practice reliability recommendations, versus which tools react to platform-level events or metric thresholds only after conditions materialize?
Domain Coverage
Difficulty Breakdown