Q81 — AWS AIF-C01 Ch.3

A company wants to use a large language model (LLM) to learn the language specific to its industry. The company has large volumes of unlabeled data containing industry-specific language. Which solution meets these requirements with the least operational overhead?

Correct Answer: B. Continuing pretraining of the LLM using the company’s data

Explanation

Continued pretraining is the lowest-overhead way to adapt an LLM to industry-specific language when the available data is large and *unlabeled*. It further trains an already pretrained LLM on the company’s existing corpus using self-supervised learning, so the model absorbs domain-specific vocabulary, syntax, and patterns without labeled examples or architectural changes. The alternatives fit the requirements less well. Fine-tuning typically requires labeled data and adapts the model to a narrower task. Supplying domain text as context in the prompt scales poorly with a large corpus and leaves the model's weights unchanged, so nothing is learned persistently. Training a model from scratch is prohibitively expensive and unnecessary when a pretrained model already exists.
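To make the "no labels needed" point concrete, here is a minimal Python sketch. It assumes the company uses Amazon Bedrock model customization, which the question itself does not name. The helper names, the sample corpus, the S3 URIs, the role ARN, and the base model ID are all hypothetical placeholders. The sketch shows that the training file contains only raw text records, with no prompt or completion pairs, and that the job request sets the customization type to continued pretraining. Model-specific hyperparameters are left out, so treat this as an outline rather than a ready-to-run job.

```python
import json


def to_pretraining_jsonl(documents):
    """Serialize raw, unlabeled text as JSON Lines, one {"input": ...} record each."""
    lines = []
    for doc in documents:
        text = doc.strip()
        if text:  # skip empty documents
            lines.append(json.dumps({"input": text}))
    return "\n".join(lines)


def build_job_request(job_name, model_name, role_arn, base_model_id,
                      train_s3_uri, output_s3_uri):
    """Assemble keyword arguments for bedrock.create_model_customization_job."""
    return {
        "jobName": job_name,
        "customModelName": model_name,
        "roleArn": role_arn,
        "baseModelIdentifier": base_model_id,
        # The alternative value "FINE_TUNING" would expect labeled records instead.
        "customizationType": "CONTINUED_PRE_TRAINING",
        "trainingDataConfig": {"s3Uri": train_s3_uri},
        "outputDataConfig": {"s3Uri": output_s3_uri},
    }


def submit(request):
    """Submit the job. Needs boto3, AWS credentials, and access to the base model."""
    import boto3  # imported here so the rest of the sketch runs without AWS access

    bedrock = boto3.client("bedrock")
    return bedrock.create_model_customization_job(**request)


# Hypothetical unlabeled industry text; real corpora would be far larger.
corpus = [
    "The reinsurer ceded 40% of the treaty layer after the loss event.",
    "   ",
    "Subrogation rights transfer to the carrier once the claim is paid.",
]
jsonl = to_pretraining_jsonl(corpus)

# All identifiers below are placeholders, not real resources.
request = build_job_request(
    job_name="industry-language-cpt",
    model_name="industry-language-model",
    role_arn="arn:aws:iam::123456789012:role/ExampleBedrockRole",
    base_model_id="amazon.titan-text-express-v1",
    train_s3_uri="s3://example-bucket/train/corpus.jsonl",
    output_s3_uri="s3://example-bucket/output/",
)
```

The JSONL string would be uploaded to the training S3 location before calling `submit(request)`. Compared with a fine-tuning job, the only data preparation is wrapping existing text, which is where the low operational overhead comes from.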