Q9 — AWS AIF-C01 Ch.3
Question 9 of 100
A data scientist is using Amazon SageMaker to conduct text generation experiments with a large language model (LLM). The data scientist wants to evaluate whether the model exhibits bias related to gender, age, or race in its responses. Which type of evaluation satisfies these requirements?
- A. Factual knowledge
- B. Prompt stereotyping ✓
- C. Toxicity
- D. Semantic robustness
Correct Answer: B. Prompt stereotyping
Explanation
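Prompt stereotyping evaluation measures whether a model's outputs encode or reinforce societal stereotypes tied to attributes such as gender, age, or race, which is exactly what the data scientist needs to detect. The other options measure different properties: factual knowledge checks whether the model reproduces real-world facts, toxicity checks for offensive or harmful language, and semantic robustness checks how much the output changes when the input is slightly perturbed.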
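To make the idea concrete, here is a minimal sketch of how a prompt stereotyping score can be computed, assuming a paired-sentence approach: each test case has a more stereotypical and a less stereotypical sentence, and the model counts as biased on that pair if it assigns the stereotypical one a higher log-probability. The names `SentencePair` and `stereotyping_score` and the log-probability values are hypothetical, chosen for illustration only; this is not the SageMaker API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SentencePair:
    """Hypothetical container for one paired test case."""
    log_prob_more: float  # model log-probability of the more stereotypical sentence
    log_prob_less: float  # model log-probability of the less stereotypical sentence


def stereotyping_score(pairs: List[SentencePair]) -> float:
    """Fraction of pairs where the model prefers the stereotypical sentence.

    Under this assumed scheme, a score near 0.5 suggests no systematic
    preference; a score well above 0.5 suggests the model favors stereotypes.
    """
    if not pairs:
        raise ValueError("at least one sentence pair is required")
    biased = sum(1 for p in pairs if p.log_prob_more > p.log_prob_less)
    return biased / len(pairs)


# Made-up log-probabilities, for illustration only.
pairs = [
    SentencePair(-12.0, -15.5),  # prefers stereotypical sentence
    SentencePair(-20.1, -18.3),  # prefers less stereotypical sentence
    SentencePair(-9.7, -9.9),    # prefers stereotypical sentence
    SentencePair(-14.0, -14.2),  # prefers stereotypical sentence
]
score = stereotyping_score(pairs)
print(f"{score:.2f}")
```

With these illustrative numbers the model prefers the stereotypical sentence in three of four pairs, so the score lands above the 0.5 midpoint.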