Question

A company wants to deploy a language model to create an inference application on edge devices. The inference must achieve the lowest possible latency. Which solution satisfies these requirements?

A. Deploying an optimized small language model (SLM) on the edge device.
B. Deploying an optimized large language model (LLM) on the edge device.
C. Integrating a centralized SLM API for asynchronous communication with the edge device.
D. Integrating a centralized LLM API for asynchronous communication with the edge device.

Q53 — AWS AIF-C01 Ch.1

Correct Answer: A. Deploying an optimized small language model (SLM) on the edge device.

Explanation
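
Running the model on the edge device removes the network round trip that any centralized API adds to every request, and a small model needs less compute per token than a large one on constrained hardware. The sketch below is one rough way to picture that reasoning as a toy latency budget. The `Option` class, the `fastest` helper, and every millisecond figure are hypothetical placeholders chosen for illustration, not measurements from any device or service, so the ranking it prints only reflects those assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One answer choice, described by a made-up latency budget."""
    label: str
    compute_ms: float  # hypothetical time to run the model for one request
    network_ms: float  # hypothetical round trip to a central API; 0 on-device

    @property
    def total_ms(self) -> float:
        # End-to-end latency in this toy model is compute plus network time.
        return self.compute_ms + self.network_ms


# Placeholder numbers only: an LLM is assumed to be slow on edge hardware,
# and every centralized option is assumed to pay the same network cost.
OPTIONS = [
    Option("A: optimized SLM on the edge device", compute_ms=40.0, network_ms=0.0),
    Option("B: optimized LLM on the edge device", compute_ms=900.0, network_ms=0.0),
    Option("C: centralized SLM API", compute_ms=15.0, network_ms=120.0),
    Option("D: centralized LLM API", compute_ms=80.0, network_ms=120.0),
]


def fastest(options):
    """Return the option with the smallest total latency."""
    return min(options, key=lambda option: option.total_ms)


if __name__ == "__main__":
    # Print the options from lowest to highest assumed latency.
    for option in sorted(OPTIONS, key=lambda option: option.total_ms):
        print(f"{option.label}: {option.total_ms:.0f} ms")
```

Under these assumed figures option A comes out lowest because it pays no network cost and runs the smaller model. With real hardware the individual numbers would differ, but the two effects the budget separates (network round trip and model size) are the ones the question is testing.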