First Principles · A Clean & Minimal Research Journal

AI & ML

Small Models, Big Context

The frontier is quietly shifting from parameter count to context length. What a model can hold in mind may matter more than how much it knows.

by Dr. Priya Nair, Machine Learning · June 24, 2026 · 8 min read

For two years the headline number was parameters. The quieter revolution has been the context window — the span of text, code, or transcript a model can attend to at once — which has grown from a paragraph to a small library.

A modest model with a million-token window behaves differently than its specification suggests. It can read an entire codebase, a quarter's worth of correspondence, or a patient's full chart before answering, substituting retrieval-in-context for knowledge baked into its weights.

The trade-off is real: standard attention compares every token with every other, so its cost grows roughly with the square of the sequence length, and most of those tokens are noise for any given question. The research frontier is now about which tokens earn their place: learned compression, hierarchical caches, and routing that decides what to keep.
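
To put rough numbers on that cost, here is a minimal Python sketch. It assumes plain full self-attention, where every token is scored against every other, and it counts pairs per head per layer. Optimized kernels avoid storing the full score matrix, but the amount of arithmetic still scales the same way. The names `attention_pairs` and `keep_top_k` and the relevance scores are hypothetical, invented for illustration. Real systems learn what to keep rather than sorting a hand-written list.

```python
def attention_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by full self-attention, per head per layer."""
    return n_tokens * n_tokens


def keep_top_k(scores: list[float], k: int) -> list[int]:
    """Return positions of the k highest-scoring tokens, in original order."""
    if k <= 0:
        return []
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    # Pairwise work grows with the square of the context length.
    for n in (1_000, 100_000, 1_000_000):
        print(f"{n:>9,} tokens -> {attention_pairs(n):.1e} pairs")

    # Hypothetical relevance scores for an 8-token context (illustrative only).
    scores = [0.1, 0.9, 0.2, 0.8, 0.05, 0.7, 0.3, 0.6]
    kept = keep_top_k(scores, k=4)
    ratio = attention_pairs(len(kept)) / attention_pairs(len(scores))
    print(kept, ratio)
```

In this toy case, keeping half the tokens cuts the pairwise work to a quarter, which is why deciding what to drop pays off so quickly at long context lengths.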

If the last era asked how much a model could memorize, this one asks how much it can hold in working memory at the moment of decision. The answer is reshaping what 'small' even means.
