Performance Architecture — AWS AI Practitioner (AIF-C01)
Caching Inference Results Is Not the Same as Edge Delivery
A candidate sees "reduce latency" and defaults to CloudFront. The scenario specifies repeated identical inference requests originating from the same application tier—not geographic distribution of static content. ElastiCache solves this by caching inference outputs server-side, eliminating redundant model invocations entirely. CloudFront accelerates delivery across geographic distance. The distinction is where the latency originates: network path versus repeated compute. Identical inputs are a caching signal, not a CDN signal.
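A minimal sketch of the server-side pattern, assuming a Redis-compatible ElastiCache endpoint and a Bedrock Titan text model; the cache host, model ID, TTL, and response parsing are illustrative assumptions, not fixed choices:

```python
import hashlib
import json

import boto3
import redis

# Assumed ElastiCache (Redis) endpoint and TTL -- placeholders only.
cache = redis.Redis(host="my-cache.example.amazonaws.com", port=6379)
bedrock = boto3.client("bedrock-runtime")
TTL_SECONDS = 3600

def cached_invoke(prompt: str) -> str:
    # Identical inputs hash to the same key, so repeated requests
    # skip the model invocation entirely.
    key = "inference:" + hashlib.sha256(prompt.encode()).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return hit.decode()

    # Cache miss: pay for one model invocation, then store the result.
    response = bedrock.invoke_model(
        modelId="amazon.titan-text-express-v1",  # assumed model for illustration
        body=json.dumps({"inputText": prompt}),
    )
    output = json.loads(response["body"].read())["results"][0]["outputText"]
    cache.setex(key, TTL_SECONDS, output)
    return output
```

Note that the cache sits beside the application tier, not at the edge: the second identical request never crosses into the model at all.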
What This Pattern Tests
The exam presents a performance requirement and tests architectural pattern selection. On MLS-C01, SageMaker endpoint auto-scaling adjusts instance count based on the InvocationsPerInstance metric, while multi-model endpoints host several models behind a single endpoint to reduce cost. On AIF-C01, Bedrock provisioned throughput reserves model capacity for predictable latency, while on-demand throughput suits variable workloads. On DEA-C01, Glue job performance depends on DPU allocation: too few DPUs bottleneck Spark shuffles, too many waste money on small datasets. Redshift Serverless scales RPUs automatically, while provisioned clusters require a manual resize. The trap is scaling compute when the bottleneck is data shuffling, or provisioning throughput for a bursty workload that should use on-demand.
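A sketch of wiring the SageMaker auto-scaling pattern through Application Auto Scaling; the endpoint and variant names are assumptions, and the target value would come from load testing:

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

# Assumed endpoint/variant names for illustration.
resource_id = "endpoint/my-endpoint/variant/AllTraffic"

# Register the variant's instance count as a scalable target.
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Target tracking on invocations per instance: sustained traffic above
# the target adds instances; traffic below it removes them.
autoscaling.put_scaling_policy(
    PolicyName="invocations-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 100.0,  # assumed; derive from load testing
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
)
```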
Decision Axis
Identify the bottleneck before scaling: compute-bound workloads need more instances or DPUs, I/O-bound workloads need better partitioning, and latency-bound workloads need caching or provisioned capacity.
Associated Traps
Decision Rules
Whether to use SageMaker Model Monitor, which operates at the post-deployment monitoring stage of the ML pipeline and compares live inference data against a baseline captured from training data, versus a general-purpose infrastructure monitoring service that cannot perform statistical drift comparison against an ML baseline.
Select RAG via Amazon Bedrock over fine-tuning via Amazon SageMaker AI when the scenario combines a sub-second latency constraint with a daily data-freshness requirement that retraining cycles cannot satisfy (see the retrieval sketch after this list).
Whether to apply ROUGE (recall-oriented, measures coverage of reference content) or BLEU (precision-oriented, measures n-gram overlap with the reference) as the evaluation metric for a summarisation task where capturing key source content is the business objective (see the toy comparison after this list).
Whether the explainability requirement demands automated, per-prediction feature attribution at inference time (SageMaker Clarify / SHAP) or a human-in-the-loop review workflow (Amazon A2I); the cues 'automated' and 'feature-level explanations to auditors' together disqualify A2I.
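For the RAG rule above, a sketch of grounding at inference time with Bedrock Knowledge Bases: retrieval happens per request, so daily source updates are picked up without any retraining. The knowledge base ID, query, and model ARN are placeholders:

```python
import boto3

client = boto3.client("bedrock-agent-runtime")

# Placeholders: substitute a real knowledge base ID and model ARN.
response = client.retrieve_and_generate(
    input={"text": "What changed in today's pricing data?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB123EXAMPLE",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        },
    },
)
print(response["output"]["text"])
```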
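And for the ROUGE/BLEU rule, a toy unigram comparison; real evaluations use established libraries and higher-order n-grams, but this shows why the recall-oriented metric fits summarisation:

```python
from collections import Counter

def unigram_scores(reference: str, candidate: str) -> tuple[float, float]:
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    rouge1_recall = overlap / sum(ref.values())     # coverage of the reference
    bleu1_precision = overlap / sum(cand.values())  # precision of the candidate
    return rouge1_recall, bleu1_precision

# A terse candidate scores perfect precision while missing most reference
# content -- exactly why summarisation favours the recall-oriented metric.
r, b = unigram_scores(
    "the quarterly report shows revenue grew ten percent",
    "revenue grew",
)
print(f"ROUGE-1 recall={r:.2f}  BLEU-1 precision={b:.2f}")  # 0.25 vs 1.00
```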