Cost Blind Spot — AWS Machine Learning (MLS-C01)
The architecturally correct answer was also the most expensive. The exam wanted the cost-optimized option that still meets requirements.
Multi-model endpoints aren't free because they're clever.
Hosting dozens of tenant-specific models is a real problem. The distractor provisions a dedicated real-time endpoint per model — maximum isolation, straightforward routing, full performance headroom per tenant. The architecture is clean and the exam respects it. What it ignores is that idle endpoints in a multi-tenant SaaS context generate continuous instance charges whether or not they serve a single request. Multi-model endpoints, or SageMaker Serverless Inference, are the cost-aware answer that the scenario's scale and utilization pattern demand.
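To make the gap concrete, here is a back-of-the-envelope comparison of per-tenant endpoints versus one multi-model endpoint (MME), plus the request shape that replaces per-endpoint routing. The instance rate, tenant count, and endpoint/model names are illustrative assumptions, not current AWS pricing:

```python
# Dedicated endpoints vs. one multi-model endpoint, using an assumed
# ~ml.m5.large hosting rate. All figures are illustrative, not a quote.
TENANTS = 50
HOURLY_RATE = 0.115        # assumed $/hr per hosting instance
HOURS_PER_MONTH = 730

dedicated = TENANTS * HOURLY_RATE * HOURS_PER_MONTH  # one endpoint per model
mme = 2 * HOURLY_RATE * HOURS_PER_MONTH              # one 2-instance MME for all tenants

print(f"dedicated: ${dedicated:,.0f}/mo, MME: ${mme:,.0f}/mo")

# With an MME, routing becomes a request parameter rather than an endpoint:
# sagemaker-runtime's invoke_endpoint takes a TargetModel field naming the
# artifact to load. Endpoint and model names below are hypothetical.
invoke_params = {
    "EndpointName": "tenant-models-mme",       # hypothetical MME endpoint
    "TargetModel": "tenant-042/model.tar.gz",  # path relative to the MME's S3 prefix
    "ContentType": "application/json",
    "Body": b'{"features": [1.0, 2.0]}',
}
# boto3.client("sagemaker-runtime").invoke_endpoint(**invoke_params)
```

Under these assumed rates, fifty dedicated endpoints cost roughly 25x the consolidated MME — the idle-instance charge the distractor ignores.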
The Scenario
The question describes a video transcoding pipeline processing uploaded files — fault-tolerant, no user-facing latency requirements, files reprocessable on failure. You choose a Multi-AZ ECS cluster with On-Demand Fargate tasks and auto-scaling. The correct answer uses EC2 Spot Instances in an Auto Scaling group with a Spot Fleet diversified across instance types and AZs. Same throughput, 60-90% less cost. The workload is explicitly fault-tolerant (files can be reprocessed), which is the textbook Spot qualification. The exam said "most cost-effective" and you optimized for availability that the scenario never required.
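The correct answer's shape — an Auto Scaling group diversified across instance types and AZs, running Spot above a baseline — can be sketched as the `MixedInstancesPolicy` parameter that the EC2 Auto Scaling API accepts. The launch template name, instance types, and subnet IDs here are illustrative assumptions:

```python
# Spot-diversified Auto Scaling group for a fault-tolerant transcoding
# fleet, expressed as the MixedInstancesPolicy structure passed to
# create_auto_scaling_group. Names are hypothetical.
mixed_instances_policy = {
    "LaunchTemplate": {
        "LaunchTemplateSpecification": {
            "LaunchTemplateName": "transcode-worker",  # hypothetical template
            "Version": "$Latest",
        },
        # Diversify across instance types so a Spot interruption in one
        # capacity pool does not drain the whole fleet.
        "Overrides": [
            {"InstanceType": t}
            for t in ("c5.xlarge", "c5a.xlarge", "c6i.xlarge", "m5.xlarge")
        ],
    },
    "InstancesDistribution": {
        "OnDemandPercentageAboveBaseCapacity": 0,  # 100% Spot above baseline
        "SpotAllocationStrategy": "capacity-optimized",
    },
}
# boto3.client("autoscaling").create_auto_scaling_group(
#     AutoScalingGroupName="transcode-asg",
#     MinSize=0, MaxSize=20,
#     VPCZoneIdentifier="subnet-aaa,subnet-bbb,subnet-ccc",  # spread across AZs
#     MixedInstancesPolicy=mixed_instances_policy,
# )
```

The diversification is the point: interruptions hit one instance-type/AZ pool at a time, and reprocessable files make interruptions cheap.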
How to Spot It
- When the question says "cost-effective" or "minimize cost," check whether the workload is fault-tolerant. Batch processing, media transcoding, CI/CD builds, data analysis, and any workload with "reprocessable on failure" are Spot Instance candidates. Spot saves 60-90% over On-Demand.
- Multi-AZ deployments, provisioned IOPS, and dedicated hosts all add cost. If the scenario does not mention an SLA, uptime requirement, or "highly available," these features are cost traps the exam uses to test whether you add unnecessary resilience.
- S3 Intelligent-Tiering adds a $0.0025/1000 objects monitoring fee. For billions of small objects, that monitoring fee exceeds the storage savings. The exam tests whether you know when Intelligent-Tiering costs more than just picking the right tier manually.
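The Intelligent-Tiering bullet is worth working through with numbers. The sketch below uses the monitoring fee from the bullet plus assumed, approximate figures for object count, average object size, and S3 Standard pricing:

```python
# Intelligent-Tiering monitoring fee vs. the whole storage bill for a
# billions-of-small-objects bucket. Storage price and object stats are
# illustrative assumptions.
objects = 2_000_000_000              # assumed 2 billion small objects
avg_size_gb = 50_000 / 1e9           # assumed ~50 KB average object

monitoring_per_1000 = 0.0025         # $/1,000 objects/month (from the bullet)
standard_per_gb = 0.023              # assumed S3 Standard $/GB-month

monitoring_fee = objects / 1000 * monitoring_per_1000
standard_storage = objects * avg_size_gb * standard_per_gb

print(f"monitoring: ${monitoring_fee:,.0f}/mo, "
      f"S3 Standard storage: ${standard_storage:,.0f}/mo")
```

Under these assumptions the monitoring fee ($5,000/month) exceeds the entire S3 Standard storage bill (~$2,300/month) — tiering savings cannot recover a fee larger than the bill it discounts.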
Decision Rules
Use Amazon Comprehend's managed API for commodity sentiment and key-phrase extraction when the text is standard English, the mandate is lowest cost per inference, and there is zero dedicated ML operations capacity. Build and host a custom SageMaker NLP model only when requirements exceed what the managed service delivers — otherwise you pay an always-on endpoint floor plus the operational burden the scenario says you cannot staff.
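A rough break-even makes the rule mechanical. Both prices below are illustrative assumptions (a per-unit Comprehend rate and a monthly floor for the smallest always-on SageMaker endpoint), and the request shape shows why the managed path carries no operations burden:

```python
# Break-even between pay-per-request Comprehend and an always-on custom
# SageMaker endpoint. Both rates are illustrative assumptions.
comprehend_per_unit = 0.0001   # assumed $ per 100-character unit
sagemaker_floor = 84.0         # assumed $/month minimum for a dedicated endpoint

breakeven_units = sagemaker_floor / comprehend_per_unit
print(f"break-even: {breakeven_units:,.0f} units/month")

# The managed call is a single request — no endpoint to provision,
# patch, scale, or monitor:
request = {
    "Text": "The transcoding job finished faster than expected.",
    "LanguageCode": "en",
}
# boto3.client("comprehend").detect_sentiment(**request)
```

Below roughly 840,000 units a month under these assumed rates, the managed API wins on cost alone — before counting the MLOps staffing the scenario explicitly lacks.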