The context window is how much an AI can pay attention to at one time — its working memory. Measured in tokens, it covers everything in play at once: your instructions, the documents you have pasted, and the conversation so far. Anything beyond the window's edge falls out of view.

This is the single concept that explains the most everyday AI frustrations. "It forgot what I told it earlier," "it ignored half the document," "it contradicted itself on a long thread": these usually come down to the context window filling up, not to the model being careless.
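A minimal sketch of why early messages "fall out of view" may help here. It assumes a toy token counter (one token per whitespace-separated word, which real tokenizers do not follow) and a hypothetical `fit_to_window` helper that keeps the instructions plus the most recent messages that fit. Actual tools may handle overflow differently, for example by summarising older turns or refusing the request.

```python
def count_tokens(text: str) -> int:
    # Assumption: crude stand-in for a real tokenizer, one token per word.
    return len(text.split())


def fit_to_window(instructions: str, messages: list[str], window: int) -> list[str]:
    """Keep the instructions plus as many of the most recent messages as fit."""
    budget = window - count_tokens(instructions)
    kept: list[str] = []
    # Walk backwards from the newest message; stop at the first that does not fit.
    for message in reversed(messages):
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()  # restore chronological order
    return kept


messages = [
    "My name is Dana and I prefer British spelling.",
    "Here is the first clause of the agreement to review.",
    "Now summarise the second clause in two sentences.",
]
visible = fit_to_window("You are a careful legal assistant.", messages, window=25)
```

With this deliberately tiny window, the oldest message (the one stating the user's name and spelling preference) no longer fits, so nothing downstream can take it into account.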

Why it matters at your desk. For a lawyer feeding in a 200-page agreement or an engineer pointing a model at a large codebase, the context window decides whether the AI can actually consider the whole thing or only a slice. Tools like Claude Projects and Cursor work hard to manage this — pulling in the relevant parts rather than everything — and each model generation, including Opus 4.8, pushes the limit higher.

What to watch for. A bigger window is not a memory upgrade between sessions: start a new chat and the window is empty again. And filling a huge window with everything you have is neither free nor always wise, since more tokens usually mean higher cost and slower responses. Relevant, well-chosen context beats a wall of text, which is the whole logic behind retrieval-augmented generation.
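The idea of choosing context rather than pasting everything can be sketched roughly as follows. This is a toy illustration, not how any particular product works: it ranks chunks by simple word overlap with the question (real retrieval systems typically use embeddings or a search index), and `select_context`, the word-based token count, and the sample clauses are all hypothetical.

```python
def count_tokens(text: str) -> int:
    # Assumption: crude stand-in for a real tokenizer, one token per word.
    return len(text.split())


def words(text: str) -> set[str]:
    # Lowercase words with common trailing punctuation stripped.
    return {w.strip(".,?!:").lower() for w in text.split()}


def select_context(question: str, chunks: list[str], budget: int) -> list[str]:
    """Pick the chunks most related to the question that fit in the token budget."""
    q = words(question)
    scored = [(len(q & words(chunk)), chunk) for chunk in chunks]
    # Drop chunks that share no words with the question, then rank the rest.
    relevant = sorted((p for p in scored if p[0] > 0), key=lambda p: p[0], reverse=True)
    chosen: list[str] = []
    for _, chunk in relevant:
        cost = count_tokens(chunk)
        if cost <= budget:
            chosen.append(chunk)
            budget -= cost
    return chosen


chunks = [
    "Termination: either party may terminate with 30 days written notice.",
    "Governing law: this agreement is governed by the laws of Ontario.",
    "Notices: written notice must be sent to the registered address.",
]
question = "When can either party terminate?"
picked = select_context(question, chunks, budget=12)
```

Only the termination clause is passed along, even when the budget is generous, because the other chunks share no words with the question. The model sees less text, but the text it sees is the part that matters.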