AI Glossary · Last reviewed May 2026
Latency
Hand-written by a real person. Reviewed against current practice in May 2026.
Definition
Time-to-first-token: the delay between sending a request and receiving the first token (roughly, the first word) of the response.
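One way to make the definition concrete is to time it. The sketch below is illustrative only: it assumes the response arrives as a stream of text chunks, and `time_to_first_token` and `fake_stream` are hypothetical names invented for this example, not part of any real library. The simulated delay stands in for network and model time.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def time_to_first_token(
    start_stream: Callable[[], Iterable[str]],
) -> Tuple[Optional[str], float]:
    """Return the first token and the seconds elapsed before it arrived.

    `start_stream` is a hypothetical callable that sends the request and
    returns an iterable of streamed tokens.
    """
    start = time.perf_counter()      # clock starts when the request is sent
    stream = iter(start_stream())
    first = next(stream, None)       # blocks until the first token (or end of stream)
    elapsed = time.perf_counter() - start
    return first, elapsed


def fake_stream(delay_s: float = 0.05) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model response."""
    time.sleep(delay_s)              # simulated wait before the first token
    yield "Hello"
    yield " world"


first, ttft = time_to_first_token(fake_stream)
print(f"first token: {first!r}, time to first token: {ttft * 1000:.0f} ms")
```

Only the wait for the first token is measured here. The time taken to stream the rest of the response is deliberately left out, which matches the definition above.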
Full write-up coming soon
We are working on a detailed page for Latency - covering why it matters, how it works, related terms, and the tools that use it.
Explore other terms
From the glossary
AI Agents
A program that takes goals and figures out the steps to reac...
API
The way one piece of software talks to another.
Chain of Thought
A prompting technique where the model reasons out loud, step...
Context Window
How much text a model can read at once.
Embeddings
Numeric fingerprints of text or images that let computers me...
Few-shot Learning
Showing a model two to five examples in the prompt so it fol...