[Empirical Study] DeepSeek's New 1M Context Model: Full-Window Stress Test & Cognitive Emergence
Overview
This post shares an empirical study on DeepSeek's new long-context model (released Feb 2026, web/mobile version), which extends the context window to 1,000,000 tokens.
We conducted a full-window stress test, pushing the limit to ~1.53M tokens, and analyzed the model's behavior across three key dimensions:
Key Findings:
Interaction Token Budget: A complete project lifecycle consumes 1.2M–1.6M tokens, varying by input format and internal sparse attention mechanisms.
Long-Range Recall & Synthesis: The model demonstrates high-fidelity memory across the entire context, capable of retrieving initial instructions and synthesizing comprehensive reports without external RAG.
Emergence of Collaborative Cognition: Beyond a certain threshold, the model shifts from a "Q&A Engine" to a "Cognitive Partner", adopting user reasoning styles and maintaining global coherence—a capability absent in standard 128k windows.
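To make the "interaction token budget" finding concrete, here is a minimal sketch (not the authors' actual harness) of tracking cumulative token usage across a conversation against the observed hard limit. The `approx_tokens` heuristic (~4 characters per token) and the `BudgetTracker` class are assumptions for illustration; a real reproduction would count tokens with the model's own tokenizer.

```python
HARD_LIMIT = 1_536_000  # hard limit observed in the stress test


def approx_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token (heuristic, not exact)."""
    return max(1, len(text) // 4)


class BudgetTracker:
    """Accumulates per-turn token usage and reports remaining budget."""

    def __init__(self, limit: int = HARD_LIMIT):
        self.limit = limit
        self.used = 0

    def add_turn(self, prompt: str, response: str) -> int:
        """Record one prompt/response exchange; return remaining budget."""
        self.used += approx_tokens(prompt) + approx_tokens(response)
        return self.limit - self.used


# Example: one exchange against the 1.536M-token ceiling.
tracker = BudgetTracker()
remaining = tracker.add_turn("summarize the project spec " * 100, "ok" * 50)
```

A tracker like this makes it easy to see how a multi-month project lifecycle could plausibly land in the reported 1.2M–1.6M range once long documents and code dumps are pasted into the context.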
Evidence
The test reached the hard limit at 1,536,000 tokens (see attached screenshot: "Conversation length limit reached").
Resources
Full reports (EN/CN PDFs), source code, and detailed data analysis are open-sourced at:
🔗 Project Page: https://tpwang-lab.github.io
🔗 GitHub Repo: https://github.com/tpwang-lab/deepseek-million-token
Feedback and reproduction attempts from the community are welcome!
Tags: #DeepSeek #LLM #LongContext #EmpiricalStudy #AI