
Taiping Wang PRO

tpwang199655

AI & ML interests

None yet

Recent Activity

replied to their post about 17 hours ago
[Empirical Study] DeepSeek's New 1M Context Model: Full-Window Stress Test & Cognitive Emergence

Overview
This post shares an empirical study of DeepSeek's new long-context model (released Feb 2026, web/mobile version), which extends the context window to 1,000,000 tokens. We ran a full-window stress test, pushing the limit to ~1.53M tokens, and analyzed the model's behavior along three dimensions.

Key Findings
- Interaction Token Budget: A complete project lifecycle consumes 1.2M–1.6M tokens, varying with input format and the model's internal sparse-attention mechanisms.
- Long-Range Recall & Synthesis: The model shows high-fidelity memory across the entire context; it can retrieve the initial instructions and synthesize comprehensive reports without external RAG.
- Emergence of Collaborative Cognition: Beyond a certain threshold, the model shifts from a "Q&A engine" to a "cognitive partner", adopting the user's reasoning style and maintaining global coherence, a capability absent in standard 128k windows.

Evidence
The test hit the hard limit at 1,536,000 tokens (see the attached screenshot: "Conversation length limit reached").

Resources
Full reports (EN/CN PDFs), source code, and detailed data analysis are open-sourced:
🔗 Project Page: https://tpwang-lab.github.io
🔗 GitHub Repo: https://github.com/tpwang-lab/deepseek-million-token

We welcome feedback and reproduction attempts from the community!

Tags: #DeepSeek #LLM #LongContext #EmpiricalStudy #AI
posted an update about 22 hours ago
updated a Space 1 day ago
tpwang199655/huggingtpwang

Organizations

None yet