Multi-Service Tradeoff — AWS AI Practitioner (AIF-C01)
Execution Duration and Model Size Separate Lambda from Containers
Lambda, ECS, EKS, and SQS each appear in AI workload architectures. The constraint that forces a choice is execution duration combined with model artifact size. Lambda suits lightweight, stateless inference that completes within the 15-minute execution limit, with small model artifacts loaded into memory. ECS or EKS is required when the workload needs a persistent GPU runtime, a larger memory footprint, or custom container images. SQS enters the pattern when inference is asynchronous and producers must be decoupled from consumers.
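The asynchronous pattern above can be sketched as an SQS-triggered Lambda handler. The model, its loading step, and the `run_inference` helper below are hypothetical placeholders; the event shape matches what Lambda delivers for an SQS event source, with messages batched under `Records`.

```python
import json

# Hypothetical lightweight "model", loaded once at init (outside the
# handler) so it stays warm across invocations; keeping this artifact
# small is what makes the workload a Lambda fit rather than a container.
MODEL = {"positive_words": {"great", "good", "excellent"}}

def run_inference(text):
    # Placeholder for real inference; must finish well under
    # Lambda's 15-minute execution limit.
    words = set(text.lower().split())
    return "POSITIVE" if words & MODEL["positive_words"] else "NEGATIVE"

def handler(event, context=None):
    # Lambda receives SQS messages in batches under event["Records"];
    # each record's body is the payload the producer enqueued, so the
    # producer never waits on (or knows about) this consumer.
    results = []
    for record in event["Records"]:
        payload = json.loads(record["body"])
        results.append({"id": payload["id"],
                        "label": run_inference(payload["text"])})
    return results
```

A producer only needs `send_message` on the queue; scaling, retries, and dead-lettering then belong to the queue and event source mapping, not to application code.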
What This Pattern Tests
The exam gives you a decoupling requirement and tests whether you pick the right messaging service. SQS is point-to-point with at-least-once delivery (Standard) or exactly-once processing (FIFO, 300 msg/s per API action, or 3,000 msg/s with batching). SNS is pub/sub fan-out to multiple subscribers. EventBridge is content-based routing with a schema registry and event sources across a wide range of AWS services. The trap is choosing SQS for fan-out (use SNS) or SNS for ordered processing (use SQS FIFO). DynamoDB vs. Aurora vs. ElastiCache follows the same pattern: key-value at any scale vs. relational joins vs. microsecond in-memory reads.
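The point-to-point vs. fan-out contrast shows up directly in the request parameters. A minimal sketch with hypothetical queue URL and topic ARN values: the keys (`MessageGroupId`, `MessageDeduplicationId`, `TopicArn`) are the boto3 `send_message`/`publish` keyword arguments, but no API call is made here.

```python
import json

def sqs_fifo_message(queue_url, group_id, dedup_id, payload):
    # Point-to-point, ordered: FIFO queues require a MessageGroupId
    # (the ordering scope) and a MessageDeduplicationId (exactly-once
    # processing within the dedup window).
    return {
        "QueueUrl": queue_url,
        "MessageBody": json.dumps(payload),
        "MessageGroupId": group_id,
        "MessageDeduplicationId": dedup_id,
    }

def sns_fanout_message(topic_arn, payload):
    # Pub/sub fan-out: one publish, and every subscriber (SQS queues,
    # Lambda functions, HTTPS endpoints) receives its own copy.
    return {
        "TopicArn": topic_arn,
        "Message": json.dumps(payload),
    }

# With boto3 these dicts would be passed as:
#   boto3.client("sqs").send_message(**sqs_fifo_message(...))
#   boto3.client("sns").publish(**sns_fanout_message(...))
```

Note the asymmetry: ordering identity lives on the SQS message, while fan-out targets live on the SNS topic's subscriptions, not in the publish call at all.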
Decision Axis
Communication pattern (point-to-point vs. fan-out vs. content routing) and data access pattern (key-value vs. relational vs. cache) determine the service.
Associated Traps and Decision Rules
When the task is a well-defined NLP inference operation with a structured output type (entities, sentiment, key phrases), select the purpose-built NLP service over a generative-AI platform that requires additional prompt engineering and introduces unnecessary complexity.
When the stated use case domain (image content moderation) maps exactly to a purpose-built managed AI service capability, prefer that service over building a custom model in a general-purpose ML platform.
When the requirement is real-time text-sentiment classification via a managed, purpose-built AI service that requires no model training, the correct match is the NLP-native service (Amazon Comprehend), not the computer-vision service (Amazon Rekognition) whose data-type domain is images and video.
Whether the stated business domain (fraud detection) maps to a purpose-built AWS AI service that eliminates customer ML model ownership, or whether a general-purpose ML platform is required, is determined by whether the customer must retain control over model logic or can delegate domain AI responsibility to AWS.
When the requirement is NLP inference with zero customer-owned model development, the pre-built NLP API (Amazon Comprehend) satisfies the constraint because AWS owns the inference model; the ML platform (Amazon SageMaker AI) fails because the customer must still supply or train the model regardless of managed infrastructure.
When a described business problem falls squarely within the domain of a purpose-built AWS AI service, that service minimizes customer ML responsibilities; choosing SageMaker incorrectly shifts model design, training, and lifecycle ownership onto the customer.
When a supervised NLP inference task maps exactly to a purpose-built service API, select that API on total-cost grounds, since it carries zero training or endpoint cost, rather than choosing a custom ML platform on the strength of a lower-sounding per-invocation rate.
When a stated business problem maps directly to a purpose-built AWS AI service domain (fraud detection), choose that managed service over a custom SageMaker model because total cost of ownership (model development labor, training compute, and persistent endpoint hosting) exceeds the purpose-built service's per-prediction charge.
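The purpose-built-service rules above can be made concrete with Amazon Comprehend: sentiment classification is a single pre-trained API call with no model to train or host. The sketch below builds the request and shows the documented response shape; the response values are illustrative, not real output, and no AWS call is made.

```python
# Request parameters for Amazon Comprehend's DetectSentiment API.
# With boto3: boto3.client("comprehend").detect_sentiment(**request)
def detect_sentiment_request(text, language_code="en"):
    return {"Text": text, "LanguageCode": language_code}

# Documented response shape (values illustrative only):
example_response = {
    "Sentiment": "POSITIVE",
    "SentimentScore": {"Positive": 0.98, "Negative": 0.01,
                       "Neutral": 0.01, "Mixed": 0.0},
}

def top_sentiment(response):
    # The customer consumes a label and confidence scores;
    # AWS owns the model, its training data, and its lifecycle.
    return response["Sentiment"]
```

Contrast this with SageMaker, where the equivalent capability means owning a training job, a model artifact, and a persistent endpoint before the first prediction is served.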
When the output-quality symptom is weak or inconsistent step-by-step reasoning and the constraint forbids training or data ingestion, the correct intervention is a prompt engineering technique — specifically chain-of-thought — applied directly in the Amazon Bedrock prompt, not a service-level data or model intervention.
Whether to address inconsistent multi-step reasoning by applying an in-prompt technique (chain-of-thought) within Amazon Bedrock, or by invoking a model-customization or retrieval service that violates the no-training, no-ingestion constraint.
Whether content-enforcement controls on a managed FM service are an AWS-managed default or a customer configuration responsibility requiring the team to explicitly define Guardrails and harden the system prompt.
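Per the rule above, Guardrails are customer-configured, not a default. A sketch of referencing a pre-created guardrail from a Bedrock Converse request: the model ID, guardrail identifier, and version values are hypothetical placeholders, while the `guardrailConfig` key mirrors the Converse request shape. No call is made.

```python
def guarded_converse_request(user_text,
                             model_id="example.model-id-v1",
                             guardrail_id="example-guardrail-id",
                             guardrail_version="1"):
    # The guardrail itself (denied topics, content filters, word lists)
    # must be defined by the customer beforehand; the request only
    # references it. Nothing is enforced unless this block is present.
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": user_text}]}],
        "guardrailConfig": {
            "guardrailIdentifier": guardrail_id,
            "guardrailVersion": guardrail_version,
        },
    }

# With boto3:
#   boto3.client("bedrock-runtime").converse(**guarded_converse_request(text))
```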
Whether consistent JSON output format is the customer's responsibility to enforce via system-prompt instructions in the Amazon Bedrock API request, or a platform-managed behavior that Amazon Bedrock provides automatically.
Whether to apply a prompt engineering technique (chain-of-thought prompting) directly in the Bedrock invocation to control reasoning structure at inference time, versus introducing a managed retrieval or fine-tuning service that addresses a different problem dimension, adds infrastructure overhead, and does not resolve the stated reasoning-visibility symptom.
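A chain-of-thought intervention is purely a payload change. The sketch below builds a Bedrock Converse API request; the model ID is a hypothetical placeholder, and the `system`/`messages`/`inferenceConfig` keys follow the Converse request shape. No call is made.

```python
def cot_converse_request(question, model_id="example.model-id-v1"):
    # Chain-of-thought: instruct the model to reason step by step
    # before answering. This is an inference-time prompt change only:
    # no fine-tuning, no data ingestion, no new infrastructure, and the
    # only added cost is the extra input and output tokens.
    system_text = ("Think through the problem step by step, numbering "
                   "each step, then state the final answer on its own line.")
    return {
        "modelId": model_id,
        "system": [{"text": system_text}],
        "messages": [{"role": "user", "content": [{"text": question}]}],
        # Capping output length also bounds per-token generation cost.
        "inferenceConfig": {"maxTokens": 512, "temperature": 0.2},
    }

# With boto3:
#   boto3.client("bedrock-runtime").converse(**cot_converse_request(q))
```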
Whether to apply customer-side few-shot or system-prompt persona prompting within Amazon Bedrock to enforce style consistency, versus migrating to a managed assistant service under the mistaken belief that AWS absorbs output-style configuration responsibility.
Whether to reduce hallucinations through a prompt engineering technique applied within the model invocation (e.g., instructing the model to express uncertainty or answer only from context provided in the prompt) versus invoking a service-level data ingestion solution such as RAG via Amazon Kendra.
Whether JSON output format consistency should be enforced through customer-authored prompt instructions within Amazon Bedrock, or delegated to a separate AWS-managed extraction service whose structured outputs are treated as an AWS responsibility.
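Because JSON format consistency is a customer responsibility, the usual pattern is a strict system-prompt instruction paired with client-side validation. A minimal sketch; the schema and field names are hypothetical.

```python
import json

# Customer-authored instruction passed as the Bedrock system prompt.
JSON_SYSTEM_PROMPT = (
    "Respond with a single JSON object only, no prose, matching: "
    '{"label": string, "confidence": number}. '
    "Do not wrap the JSON in markdown fences."
)

def parse_model_reply(reply_text):
    # Validate on the client side: the prompt instruction raises the
    # odds of well-formed JSON but does not guarantee it, so malformed
    # replies must be caught (and typically retried) here.
    obj = json.loads(reply_text)
    if not {"label", "confidence"} <= obj.keys():
        raise ValueError("missing required fields")
    return obj
```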
Whether to apply an in-prompt technique (chain-of-thought prompting) that incurs only additional input-token cost, or a model customization approach (fine-tuning) that appears to be a one-time fixed cost but introduces a separate training-infrastructure billing dimension that violates the no-training-budget constraint.
Whether to apply an in-prompt output-format directive within Amazon Bedrock to constrain FM response length, or to replace the FM invocation path with Amazon Lex to exploit its flat per-request pricing and avoid per-token generation costs.
Whether few-shot prompt examples embedded directly in the Amazon Bedrock invocation can enforce tone and format consistency, versus adopting Amazon Kendra for retrieval-augmented generation, which adds per-query retrieval and index-sync costs that are unjustified when the defect is format adherence rather than a factual knowledge gap.
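Few-shot style enforcement is just example turns embedded in the same invocation. A sketch using a Converse-style message list; the example texts are hypothetical, and the alternating-role structure is the point.

```python
def few_shot_messages(new_input):
    # Alternating user/assistant turns demonstrate the desired tone and
    # format; the model imitates the demonstrated pattern for the new
    # input. No retrieval service or index is involved.
    examples = [
        ("Order #123 arrived damaged.",
         "We're sorry about the damaged delivery on order #123. "
         "A replacement ships today."),
        ("Where is my refund?",
         "Your refund is in progress and typically posts within 5 days."),
    ]
    messages = []
    for user_text, assistant_text in examples:
        messages.append(
            {"role": "user", "content": [{"text": user_text}]})
        messages.append(
            {"role": "assistant", "content": [{"text": assistant_text}]})
    messages.append({"role": "user", "content": [{"text": new_input}]})
    return messages
```

The only cost dimension this adds is the extra input tokens for the example turns, repeated on every invocation.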
Whether to suppress hallucinated citations through an in-prompt grounding instruction applied at inference time, or through a retrieval-augmented or model-level service that introduces persistent billing beyond inference tokens.
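The in-prompt grounding approach from the rules above amounts to packing the supplied context into the prompt and forbidding answers beyond it. A minimal sketch with hypothetical context passages; a RAG service solves a different problem and adds retrieval and indexing charges on top of inference tokens.

```python
def grounded_prompt(context_passages, question):
    # Hallucination mitigation at inference time: restrict the model
    # to the supplied context and give it an explicit way to decline.
    context_block = "\n\n".join(
        f"[{i + 1}] {p}" for i, p in enumerate(context_passages))
    return (
        "Answer using only the context below, citing passage numbers "
        "like [1]. If the context does not contain the answer, reply "
        "exactly: I don't know.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )
```

The resulting string goes into the user message of an ordinary invocation, so the only persistent billing dimension remains per-token inference.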