AI Glossary · Last reviewed May 2026

Latency

Hand-written by a real person. Reviewed against current practice in May 2026.

Definition

Time-to-first-token (TTFT): the delay between sending a request and receiving the first token (roughly, the first word) of the response.
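
To make the definition concrete, here is a minimal sketch of how that delay could be measured in Python. It assumes the response arrives as a stream of tokens; `fake_model_stream` is a hypothetical stand-in for a real streaming client, and `time_to_first_token` is an illustrative helper name, not part of any library.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def time_to_first_token(start_stream: Callable[[], Iterable[str]]) -> Tuple[float, str]:
    """Return (seconds until the first token, the first token itself)."""
    start = time.perf_counter()
    stream = iter(start_stream())   # stands in for "sending the request"
    first = next(stream)            # blocks until the first token arrives
    return time.perf_counter() - start, first


def fake_model_stream(delay_s: float = 0.05) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model response."""
    time.sleep(delay_s)             # simulated wait before the first token
    for token in ["Hello", ",", " world"]:
        yield token


ttft, token = time_to_first_token(fake_model_stream)
print(f"first token {token!r} after {ttft * 1000:.0f} ms")
```

The timer stops at the first token rather than at the end of the response, which is the difference between time-to-first-token and total response time.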

Full write-up coming soon

We are working on a detailed page for Latency, covering why it matters, how it works, related terms, and the tools that use it.
