"Thanks for the kind words! Really appreciate you taking the time to read through.
You hit the nail on the head: the shift from 'tool' to 'cognitive partner' is exactly what surprised us most during the 1.5M token run.
Looking forward to your feedback if you manage to reproduce any part of it. Feel free to open an issue on the GitHub repo if you hit any snags!"
Taiping Wang (tpwang199655)
Recent Activity
replied to their post about 20 hours ago
[Empirical Study] DeepSeek's New 1M Context Model: Full-Window Stress Test & Cognitive Emergence
Overview
This post shares an empirical study on DeepSeek's new long-context model (released Feb 2026, web/mobile version), which extends the context window to 1,000,000 tokens.
We conducted a full-window stress test, pushing the limit to ~1.53M tokens, and analyzed the model's behavior across three key dimensions:
Key Findings:
Interaction Token Budget: A complete project lifecycle consumed 1.2M–1.6M tokens, with the exact figure varying by input format and the model's internal sparse-attention behavior.
Long-Range Recall & Synthesis: The model maintained high-fidelity memory across the entire context, retrieving its initial instructions and synthesizing comprehensive reports without external RAG.
Emergence of Collaborative Cognition: Beyond a certain context depth, the model shifted from a "Q&A Engine" to a "Cognitive Partner," adopting the user's reasoning style and maintaining global coherence, a capability we did not observe in standard 128k windows.
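The token-budget finding above amounts to tracking cumulative consumption against a hard ceiling. A minimal sketch of such a tracker follows; the `ContextBudget` class, the ~4-characters-per-token heuristic, and all names are illustrative assumptions, not code from the study (real counts would come from the model's own tokenizer):

```python
# Rough tracker for cumulative conversation token usage against a hard limit.
# The 4-chars-per-token estimate is a crude heuristic (an assumption), used
# here only to illustrate budget accounting over a long session.

HARD_LIMIT = 1_536_000  # hard limit observed in the study


def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)


class ContextBudget:
    def __init__(self, limit: int = HARD_LIMIT):
        self.limit = limit
        self.used = 0

    def add_turn(self, prompt: str, response: str) -> int:
        """Record one exchange; return the remaining token budget."""
        self.used += estimate_tokens(prompt) + estimate_tokens(response)
        return self.limit - self.used

    def exhausted(self) -> bool:
        """True once cumulative usage reaches the hard limit."""
        return self.used >= self.limit
```

With a tracker like this, a long-running session can warn before the "Conversation length limit reached" wall instead of hitting it mid-task.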
Evidence
The test reached the hard limit at 1,536,000 tokens (see attached screenshot: "Conversation length limit reached").
Resources
Full reports (EN/CN PDFs), source code, and detailed data analysis are open-sourced at:
🔗 Project Page: https://tpwang-lab.github.io
🔗 GitHub Repo: https://github.com/tpwang-lab/deepseek-million-token
Feedback and reproduction attempts from the community are welcome!
Tags: #DeepSeek #LLM #LongContext #EmpiricalStudy #AI
posted an update 1 day ago
updated a Space 1 day ago: tpwang199655/huggingtpwang