[Empirical Study] DeepSeek's New 1M Context Model: Full-Window Stress Test & Cognitive Emergence
Overview
This post shares an empirical study on DeepSeek's new long-context model (released Feb 2026, web/mobile version), which extends the context window to 1,000,000 tokens.
We conducted a full-window stress test, pushing the limit to ~1.53M tokens, and analyzed the model's behavior across three key dimensions:
Key Findings:
Interaction Token Budget: A complete project lifecycle consumes 1.2M–1.6M tokens, varying by input format and internal sparse attention mechanisms.
Long-Range Recall & Synthesis: The model demonstrates high-fidelity memory across the entire context, capable of retrieving initial instructions and synthesizing comprehensive reports without external RAG.
Emergence of Collaborative Cognition: Beyond a certain threshold, the model shifts from a "Q&A Engine" to a "Cognitive Partner", adopting user reasoning styles and maintaining global coherence—a capability absent in standard 128k windows.
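To make the "interaction token budget" finding concrete, here is a minimal sketch (not the authors' actual harness) of tracking cumulative token usage across a conversation against the observed hard limit. The `approx_tokens` heuristic (~4 characters per token) and the `BudgetTracker` class are assumptions for illustration; a real reproduction would count tokens with the model's own tokenizer.

```python
HARD_LIMIT = 1_536_000  # hard limit observed in the stress test


def approx_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token (heuristic, not exact)."""
    return max(1, len(text) // 4)


class BudgetTracker:
    """Accumulates per-turn token usage and reports remaining budget."""

    def __init__(self, limit: int = HARD_LIMIT):
        self.limit = limit
        self.used = 0

    def add_turn(self, prompt: str, response: str) -> int:
        """Record one prompt/response exchange; return remaining budget."""
        self.used += approx_tokens(prompt) + approx_tokens(response)
        return self.limit - self.used


# Example: one exchange against the 1.536M-token ceiling.
tracker = BudgetTracker()
remaining = tracker.add_turn("summarize the project spec " * 100, "ok" * 50)
```

A tracker like this makes it easy to see how a multi-month project lifecycle could plausibly land in the reported 1.2M–1.6M range once long documents and code dumps are pasted into the context.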
Evidence
The test reached the hard limit at 1,536,000 tokens (see attached screenshot: "Conversation length limit reached").
Resources
Full reports (EN/CN PDFs), source code, and detailed data analysis are open-sourced at:
🔗 Project Page: https://tpwang-lab.github.io
🔗 GitHub Repo: https://github.com/tpwang-lab/deepseek-million-token
Feedback and reproduction attempts from the community are welcome!
Tags: #DeepSeek #LLM #LongContext #EmpiricalStudy #AI