Q61 — AWS AIF-C01 Ch.1
A social media company wants to use a large language model (LLM) for content moderation. The company wishes to evaluate LLM outputs to determine whether bias or latent discrimination against specific groups or individuals exists. Which data source should the company use to evaluate LLM outputs?
- A. User-generated content
- B. Moderation logs
- C. Content moderation guidelines
- D. Benchmark datasets ✓
Correct Answer: D. Benchmark datasets
Explanation
This question tests which data source is appropriate for evaluating an LLM for bias. Benchmark datasets are standardized, curated test sets designed to measure fairness, bias, and discrimination across demographic groups in a consistent, repeatable way. The other options fall short: user-generated content (Option A) is typically unlabeled and may be unbalanced across groups; moderation logs (Option B) record past operational decisions, which may themselves carry bias, rather than ground-truth fairness signals; and content moderation guidelines (Option C) are policy documents, not empirical evaluation data. Benchmark datasets (Option D) are therefore the correct and most objective choice for rigorous bias assessment.
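To make the reasoning concrete, here is a minimal Python sketch of how a labeled, group-tagged benchmark lets bias be measured as a number. Everything in it is an assumption for illustration: the six-record `BENCHMARK`, the group names, and `toy_moderator` (a deliberately biased stand-in for an LLM call) are hypothetical, and a real evaluation would use an established fairness benchmark and actual model outputs.

```python
from collections import defaultdict

# Hypothetical miniature benchmark: each record carries a demographic group tag
# and a ground-truth label (True = the text really violates policy).
BENCHMARK = [
    {"text": "I am a proud member of group A", "group": "A", "violates": False},
    {"text": "Group A people are terrible", "group": "A", "violates": True},
    {"text": "Celebrating a group A holiday today", "group": "A", "violates": False},
    {"text": "I am a proud member of group B", "group": "B", "violates": False},
    {"text": "Group B people are terrible", "group": "B", "violates": True},
    {"text": "Celebrating a group B holiday today", "group": "B", "violates": False},
]


def toy_moderator(text: str) -> bool:
    """Stand-in for an LLM moderation call (assumption, not a real API).

    Deliberately biased: it flags any mention of group B, not just abuse.
    """
    lowered = text.lower()
    return "terrible" in lowered or "group b" in lowered


def false_positive_rate_by_group(records, moderator):
    """Return, per group, the share of benign texts the moderator wrongly flags."""
    benign = defaultdict(int)
    flagged = defaultdict(int)
    for record in records:
        if not record["violates"]:
            benign[record["group"]] += 1
            if moderator(record["text"]):
                flagged[record["group"]] += 1
    return {group: flagged[group] / benign[group] for group in benign}


rates = false_positive_rate_by_group(BENCHMARK, toy_moderator)
gap = max(rates.values()) - min(rates.values())
print(rates)  # {'A': 0.0, 'B': 1.0}
print(gap)    # 1.0
```

Because the benchmark pairs equivalent texts across groups and supplies ground-truth labels, the gap in false positive rates can be attributed to the moderator rather than to the data. Raw user content or moderation logs would not support that comparison without the same balance and labeling.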