Q81 — AWS AIF-C01 Ch.3
A company wants to use a large language model (LLM) to learn the language specific to its industry. The company has large volumes of unlabeled data containing industry-specific language. Which solution meets these requirements with the least operational overhead?
- A. Fine-tuning the LLM using the company’s data
- B. Continuing pretraining of the LLM using the company’s data ✓
- C. Training a new LLM from scratch using the company’s data
- D. Providing the company’s data as context in the LLM’s prompts
Correct Answer: B. Continuing pretraining of the LLM using the company’s data
Explanation
When the goal is to adapt an LLM to industry-specific language and the available data is a large *unlabeled* corpus, continued pretraining is the best fit. It resumes self-supervised training of an already pretrained model on the company’s raw text, so the model absorbs domain vocabulary, syntax, and usage patterns without labeled examples or architectural changes. Fine-tuning typically requires labeled data (for example, prompt-completion pairs) and adapts the model to a narrower task, so the company would first have to label its corpus. Supplying the data as prompt context does not update the model’s weights, so nothing is learned persistently, and a large corpus will not fit in a context window. Training a new LLM from scratch costs far more in compute, data, and engineering effort than the requirement justifies. Continued pretraining therefore meets the requirement with the least operational overhead.
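The labeled versus unlabeled distinction is easiest to see in the shape of the training records. The sketch below is illustrative only. It assumes a JSON Lines layout in which continued-pretraining records hold a single `input` field and fine-tuning records hold a `prompt` and a `completion`, which is the layout Amazon Bedrock model customization is understood to use (verify the field names against the current documentation). The helper names and sample sentences are hypothetical.

```python
import json


def to_pretraining_records(documents):
    """Turn raw, unlabeled text into one JSON line per document.

    Assumed record shape: {"input": "<raw text>"}. No labels are needed.
    Blank documents are skipped.
    """
    return [
        json.dumps({"input": text.strip()})
        for text in documents
        if text.strip()
    ]


def to_finetuning_records(pairs):
    """Turn labeled (prompt, completion) pairs into JSON lines.

    Assumed record shape: {"prompt": ..., "completion": ...}.
    Each completion has to be written or verified by someone,
    which is the labeling effort continued pretraining avoids.
    """
    return [
        json.dumps({"prompt": prompt, "completion": completion})
        for prompt, completion in pairs
    ]


# Hypothetical industry text the company already has on hand.
unlabeled = [
    "The reinsurer ceded 40% of the treaty under a quota share arrangement.",
    "   ",
    "Loss adjustment expenses are reserved separately from IBNR.",
]
pretraining_lines = to_pretraining_records(unlabeled)

# Hypothetical labeled pair that fine-tuning would additionally require.
labeled = [("Define IBNR.", "Incurred but not reported claims reserve.")]
finetuning_lines = to_finetuning_records(labeled)
```

The unlabeled corpus can be converted mechanically, while every fine-tuning record needs a target output supplied by a person or another process. That extra step is why option A carries more operational overhead than option B for this scenario.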