Question

A marketing company uses a large language model (LLM). The company wants to evaluate how the LLM’s response quality changes when minor perturbations are applied to the input in a question-answering task. Which metric should the company use?

Accepted Answer

D. Semantic Robustness

Answer

A. Root Mean Square Error (RMSE)

Answer

B. Area Under the ROC Curve (AUC)

Answer

C. F1 Score

Q64 — AWS AIF-C01 Ch.2

Correct Answer: D. Semantic Robustness

Explanation
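
Semantic robustness measures how much a model's output changes when the input is altered in small, meaning-preserving ways, such as typos, case changes, or extra whitespace. RMSE is a regression error metric, and AUC and F1 score classification quality on unmodified inputs, so none of them captures sensitivity to perturbed prompts on its own.

As a rough illustration, the idea can be sketched as follows. This is a minimal sketch and not the scoring code of any AWS service: the helper names (`lowercase`, `add_whitespace`, `token_f1`, `robustness_delta`), the two toy models, and the choice of token-level F1 as the answer-quality measure are all assumptions made for the example.

```python
import random
from collections import Counter
from typing import Callable, List

# Hypothetical perturbations: each takes a prompt and a seeded RNG.
Perturbation = Callable[[str, random.Random], str]


def lowercase(text: str, rng: random.Random) -> str:
    """Convert the whole prompt to lower case."""
    return text.lower()


def add_whitespace(text: str, rng: random.Random) -> str:
    """Insert one extra space at a random position."""
    pos = rng.randrange(len(text) + 1)
    return text[:pos] + " " + text[pos:]


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def robustness_delta(
    model: Callable[[str], str],
    prompt: str,
    reference: str,
    perturbations: List[Perturbation],
    seed: int = 0,
) -> float:
    """Mean absolute change in answer quality across perturbed prompts.

    0.0 means the answers were unaffected; larger values mean less robust.
    """
    rng = random.Random(seed)
    base = token_f1(model(prompt), reference)
    deltas = [
        abs(base - token_f1(model(perturb(prompt, rng)), reference))
        for perturb in perturbations
    ]
    return sum(deltas) / len(deltas)


# Toy stand-ins for an LLM (assumptions for the demo only).
def brittle_model(prompt: str) -> str:
    # Answers correctly only when the prompt matches exactly.
    return "Paris" if prompt == "What is the capital of France?" else "unknown"


def stable_model(prompt: str) -> str:
    # Ignores case and whitespace before looking for the key word.
    normalized = "".join(prompt.lower().split())
    return "Paris" if "france" in normalized else "unknown"


question = "What is the capital of France?"
checks = [lowercase, add_whitespace]
print(robustness_delta(brittle_model, question, "Paris", checks))  # 1.0
print(robustness_delta(stable_model, question, "Paris", checks))   # 0.0
```

The brittle model loses all answer quality under either perturbation, while the stable model is unaffected. In a real evaluation the toy models would be replaced by calls to the LLM, and the score would be averaged over many questions.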