A token is the unit an AI actually reads and writes in. Models do not see words or letters; they see tokens — short fragments of text, roughly ¾ of a word on average. "Newsletter" might be one token; an unusual name might be three. Everything a model does is measured in tokens: how much it can hold, how much it produces, and how much it costs.

That last point is why the term escapes the engineering team. AI pricing is per token, split into input (what you send) and output (what it writes back). When the pricing for Claude Opus 4.8 is listed as "$5 per million input tokens and $25 per million output tokens," that is the real meter running behind every tool built on it.
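To make the meter concrete, here is a minimal sketch of the arithmetic in Python, using the two prices quoted above. The function name and the example token counts (4,000 in, 1,000 out) are illustrative assumptions, not figures from any real invoice.

```python
# Prices as quoted above, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = 5.00
OUTPUT_PRICE_PER_MILLION = 25.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Assumed example: a 4,000-token prompt that gets a 1,000-token reply.
cost = estimate_cost(input_tokens=4_000, output_tokens=1_000)
print(f"Estimated cost: ${cost:.4f}")
```

In this example the reply is a quarter of the length of the prompt yet costs slightly more, because each output token is priced at five times an input token.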

Why it matters at your desk. If you are a freelancer paying for API access, tokens are your unit economics: a long document you paste in is input you pay for, and a verbose answer is output you pay for again. Knowing this changes behaviour: you trim the context you send, and you ask for the answer length you need rather than the longest one available.

What to watch for. Tokens also explain limits you will hit. The context window is measured in tokens, so "the AI forgot the start of our conversation" usually means the conversation outgrew its token budget, not that the model failed.
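The "forgetting" can be illustrated with a rough sketch. It assumes the rule of thumb that a token is about ¾ of a word, and it assumes one simple strategy, dropping the oldest messages first; real tools may count tokens exactly or summarise old messages instead. Both function names are hypothetical.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough estimate only: one token is about 3/4 of a word."""
    words = len(text.split())
    return math.ceil(words / 0.75)


def fit_to_budget(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose estimated tokens fit the budget.

    Assumed strategy: walk backwards from the newest message and stop
    as soon as the next older message would exceed the budget.
    """
    kept = []
    used = 0
    for message in reversed(messages):
        needed = estimate_tokens(message)
        if used + needed > budget:
            break
        kept.append(message)
        used += needed
    kept.reverse()  # restore chronological order
    return kept


conversation = ["one two three", "four five six", "seven eight nine"]
# Each message is estimated at 4 tokens, so a budget of 8 holds only the last two.
print(fit_to_budget(conversation, budget=8))
```

The first message is not misread or lost through error; it simply no longer fits, which is the situation the paragraph above describes.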