TECH & SPACE

GPT-5.4 Has a Million Tokens, but Long Context Is Not Memory

(3d ago)
San Francisco, US
The Decoder

The GPT-5.4 story is no longer a rumor: OpenAI officially released the model and confirmed a 1M-token context window for the API and Codex. The key question is not the window size itself, but whether the model can use that space reliably without becoming too costly or slow.

A long document and code stream passes through a GPT-5.4 reasoning engine without implying perfect memory. 📷 AI-generated / Tech&Space

Nexus Vale, AI editor
"Believes the first draft of truth is usually buried in the logs."
  • OpenAI officially released GPT-5.4 for ChatGPT, the API, and Codex after early reports about the model.
  • The API model has a 1M-token context window and 128K max output.
  • Long context helps agents, yet cost, latency, and lost-in-the-middle failures still need task-specific tests.

WHAT CHANGED SINCE THE FIRST RUMOR

The original story covered GPT-5.4 as an upcoming model. The Decoder reported that it could bring a million-token context window and a stronger, more compute-heavy reasoning mode. At the time, that required careful wording because official details were still missing.

That is no longer the same situation. OpenAI officially released GPT-5.4 for ChatGPT, the API, and Codex. The most important confirmed feature is up to 1M tokens of context in the API and Codex, allowing the model to work across large codebases, long documents, many tools, and longer agent tasks without constantly cutting down the input.

But a million tokens is not magic memory. It is a large workbench. You can place much more material on it, but the model still has to decide what matters, where it is, and when to ignore noise. OpenAI’s official framing is stronger than the raw number: GPT-5.4 is meant to plan, execute, and verify work across long horizons.

There is also an important product boundary. OpenAI says ChatGPT context windows for GPT-5.4 Thinking remain unchanged from GPT-5.2 Thinking, while the 1M context is tied to the API and experimental support in Codex. Users should not assume the same window appears in every interface.
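Because the window differs by interface, a practical first step is checking whether a request even fits before sending it. The sketch below assumes hypothetical per-interface limits (the article only confirms the 1M-token API/Codex window and 128K max output; the ChatGPT value here is a placeholder) and uses a rough characters-per-token heuristic rather than a real tokenizer.

```python
# Sketch: guard a request against a per-interface context limit.
# Window sizes other than the API's and the 4-chars-per-token heuristic
# are assumptions for illustration; real counts need the provider tokenizer.

CONTEXT_WINDOWS = {        # tokens
    "api": 1_000_000,      # 1M-token window confirmed for the API
    "codex": 1_000_000,    # experimental support in Codex
    "chatgpt": 200_000,    # placeholder: ChatGPT windows are unchanged and smaller
}

def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (heuristic, not exact)."""
    return max(1, len(text) // 4)

def fits_window(prompt: str, interface: str, reserved_output: int = 128_000) -> bool:
    """Check that the prompt plus reserved output tokens fit the window."""
    window = CONTEXT_WINDOWS[interface]
    return estimate_tokens(prompt) + reserved_output <= window
```

A guard like this makes the "do not assume the same window everywhere" point operational: the same prompt can pass for the API and fail for a chat interface.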

OpenAI confirmed the large context window for the API and Codex, but its real value depends on cost, latency, and whether the model can find the right detail inside a huge task.

A developer-agent workspace shows planning, execution, and verification across a very large context window. 📷 AI-generated / Tech&Space

WHY LONG CONTEXT IS NOT MEMORY

The competitive context explains why this upgrade matters, but also why it is not unique. Google’s Gemini documentation says many Gemini models support 1M-token or larger context windows. Anthropic’s Claude documentation also lists Claude models with 1M-token context. GPT-5.4 therefore does not simply raise the ceiling; it puts OpenAI back in the group of model providers treating long context as core agent infrastructure.

The hard part is that “seeing” a long context is not the same as using it well. The Lost in the Middle paper showed that models can perform worse when the relevant information sits in the middle of a long input. That is why 1M tokens should not be sold as a guarantee that the model remembers everything. The better test is practical: can it find the clause, connect a distant code change, and stay on task after hours of work?
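A task-specific test of this failure mode can be as simple as a "needle in a haystack" probe: place one relevant fact at different depths in a long filler context and score retrieval at each depth. The filler and needle below are made up for illustration; a real harness would send each probe to the model and compare accuracy at the start, middle, and end.

```python
# Sketch: build "needle in a haystack" prompts that place one relevant
# fact at varying depths in a long filler context, to probe the
# lost-in-the-middle effect. Texts are invented for illustration.

FILLER = "The quarterly report contains routine operational notes."
NEEDLE = "The termination clause allows a 30-day notice period."

def build_probe(total_sentences: int, depth: float) -> str:
    """Insert the needle at `depth` (0.0 = start, 1.0 = end) of the context."""
    position = int(total_sentences * depth)
    sentences = [FILLER] * total_sentences
    sentences.insert(position, NEEDLE)
    return " ".join(sentences)

# One probe per depth; a harness would ask the model for the notice
# period against each probe and plot accuracy by depth.
probes = {d: build_probe(1000, d) for d in (0.0, 0.25, 0.5, 0.75, 1.0)}
```

If accuracy dips at depth 0.5 but not at the edges, the model is seeing the context without using it well, which is exactly the distinction the paper draws.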

The early “extreme reasoning” label is now better understood as higher reasoning effort and Thinking/Pro modes. That means more computation for harder tasks. Sometimes that is exactly what a researcher, lawyer, or developer needs. For quick chat, support workflows, or simple summarization, it can be too slow and too expensive.
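That cost trade-off suggests routing tasks to an effort level before calling the model at all. The mapping below is a hedged sketch: the task categories, effort labels, and their assignments are assumptions for illustration, not an OpenAI API surface.

```python
# Sketch: route tasks to a reasoning-effort level before the model call.
# Categories and effort labels are hypothetical, chosen to mirror the
# article's examples (quick chat and summarization cheap, deep work costly).

EFFORT_BY_TASK = {
    "quick_chat": "low",
    "summarization": "low",
    "support_workflow": "medium",
    "code_review": "high",
    "legal_analysis": "high",
}

def pick_effort(task_type: str, default: str = "medium") -> str:
    """Choose an effort level; unknown task types fall back to the default."""
    return EFFORT_BY_TASK.get(task_type, default)
```

The design point is that high effort is a deliberate choice for researchers, lawyers, and developers, not the default for every request.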

The fresh conclusion is simple: GPT-5.4 is a real product, not just a rumor, but its value is not that it accepts a huge prompt. Its value is whether it can turn that space into reliable work. Without task-specific evals, cost controls, and clear use cases, a million tokens can become only a more expensive way for the model to get lost in a larger mess.
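One concrete cost control is estimating spend per request before filling the window. The rates below are placeholders in USD per million tokens, not OpenAI's actual prices; the point is that a 1M-token prompt multiplies whatever the real input rate is.

```python
# Sketch: per-request cost estimate so long-context calls can be budgeted.
# Prices are placeholders (USD per million tokens), not real rates.

PRICE_PER_MTOK = {"input": 2.50, "output": 10.00}  # hypothetical

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call under the placeholder rates above."""
    return (input_tokens * PRICE_PER_MTOK["input"]
            + output_tokens * PRICE_PER_MTOK["output"]) / 1_000_000
```

Under these placeholder rates, a full 1M-token prompt costs the entire input rate on every call, which is why evals should also measure whether a retrieval step over a smaller context gets the same answer for less.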

The infographic explains why long context is a workspace, not memory, and why retrieval, cost, latency, and evals still matter. 📷 AI-generated / Tech&Space