Question

A company wants to use a language model to create an application for inference on edge devices. The inference must have the lowest possible latency. Which solution meets these requirements?

Accepted Answer

A. Deploy an optimized small language model (SLM) on the edge device.

Answer

B. Deploy an optimized large language model (LLM) on the edge device.

Answer

C. Integrate a centralized small language model (SLM) API for asynchronous communication with the edge device.

Answer

D. Integrate a centralized large language model (LLM) API for asynchronous communication with the edge device.

Q82 — AWS AIF-C01 Ch.1

Correct Answer: A. Deploy an optimized small language model (SLM) on the edge device.

Explanation
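
One way to read the accepted answer is as an elimination over two latency factors named in the options: whether each request has to cross the network to a centralized API, and whether the model is large or small. The sketch below encodes only that qualitative reasoning. The `Option` class and its field names are hypothetical, introduced for illustration, and the penalty count is a simplification, not a benchmark or measured latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    label: str
    description: str
    needs_network_round_trip: bool  # True when inference runs behind a centralized API
    uses_large_model: bool          # True for an LLM, False for an SLM

    def latency_penalties(self) -> int:
        """Count how many of the two latency factors this option incurs (0, 1, or 2)."""
        return int(self.needs_network_round_trip) + int(self.uses_large_model)


# The four answer choices from the question, reduced to the two factors above.
OPTIONS = [
    Option("A", "Optimized SLM on the edge device", False, False),
    Option("B", "Optimized LLM on the edge device", False, True),
    Option("C", "Centralized SLM API, asynchronous", True, False),
    Option("D", "Centralized LLM API, asynchronous", True, True),
]


def lowest_latency(options: list[Option]) -> Option:
    """Return the option that incurs the fewest latency factors."""
    return min(options, key=Option.latency_penalties)


if __name__ == "__main__":
    for opt in OPTIONS:
        print(f"{opt.label}: {opt.description} -> penalties = {opt.latency_penalties()}")
    print("Lowest latency:", lowest_latency(OPTIONS).label)
```

Under this simplification, A is the only choice that avoids both the network round trip and the heavier compute of a large model on constrained edge hardware.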