Tom K.

ToKrCZ

4 2 146

AI & ML interests

None yet

Organizations

None yet

liked a model 25 days ago

DuoNeural/AdQWENistrator-9B

Text Generation • 9B • Updated Apr 29 • 88 • 5

reacted to loleg's post with 🤗 25 days ago

Post

1415

Thank you Hugging Face team for some very helpful and quick support today. Greetings from the AI for Good summit in Geneva!

1 reply

liked a model about 2 months ago

prefeitura-rio/Rio-3.5-Open-397B

Image-Text-to-Text • 403B • Updated 9 days ago • 173 • 327

reacted to eaddario's post with 🔥 3 months ago

Post

3315

Experimental global target bits‑per‑weight quantization of Qwen/Qwen3.6-27B and Qwen/Qwen3.6-35B-A3B.

Unlike standard llama.cpp quantizations that rely on fixed type heuristics (e.g., Q4_K_M), the Target BPW approach optimizes per-tensor precision where it matters the most, and produces high quality models that meet a precise global file size target.

Key Advantages:
- VRAM Maximization: Can generate high quality models sized exactly to fit hardware constraints (e.g., fitting the model into exactly 24GB VRAM).
- Data-Driven Precision: Quantization mix is determined by actual weight error sensitivity rather than hardcoded rules, often yielding better PPL/KLD size trade-offs.
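The size-targeting idea above comes down to simple arithmetic: a byte budget and a parameter count fix the average bits per weight that the quantization mix must hit. The sketch below is only an illustration of that arithmetic, not the author's tooling. The function name `target_bpw` is hypothetical, the 27e9 parameter count is the nominal model size, and the budget is assumed to be 24 GiB spent entirely on weights (in practice, file metadata, the KV cache, and activations also need room, so a real target would be lower).

```python
def target_bpw(size_budget_bytes: float, n_params: float) -> float:
    """Average bits per weight that fits n_params weights into the byte budget.

    Hypothetical helper for illustration; it ignores file metadata and any
    runtime memory (KV cache, activations) that also competes for VRAM.
    """
    if n_params <= 0:
        raise ValueError("n_params must be positive")
    return size_budget_bytes * 8 / n_params


GIB = 2**30  # assumption: "24GB" is read as 24 GiB

# Nominal 27B-parameter model squeezed into a 24 GiB budget.
bpw = target_bpw(24 * GIB, 27e9)
print(f"average budget: {bpw:.2f} bits per weight")
```

A per-tensor scheme then has to distribute precision so that the parameter-weighted average across tensors stays at or below this figure.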

Full benchmarks (PPL, KLD, ARC, GPQA, MMLU, etc.) and methodology in the models' cards.

eaddario/Qwen3.6-27B-GGUF
eaddario/Qwen3.6-35B-A3B-GGUF

reacted to kelsend's post with 👀 3 months ago

Post

3559

The rebuilt Hunyuan HY3 Preview is here!

I tested it on all the tricky scenarios where most LLMs usually face-plant—and guess what? It didn’t flop.

295B total params, 21B active params, 256K context window. Built on MoE architecture, it delivers trillion-parameter-level performance with a much smaller footprint. Long-context capabilities get a massive upgrade.

Agent abilities stand out this time: tool calling, workflow orchestration, and autonomous planning are far more stable in real business scenarios. AI PPT generation in Tencent Docs is also significantly smoother and more reliable.

Real-world tests on WorkBuddy show first-token latency down 54%, success rate over 99.99%, and an Agent workflow that ran continuously for 495 steps.

Its Coding Agent achieved top-tier results on both SWE-Bench Verified and Terminal-Bench 2.0.

Now open-sourced on GitHub, HuggingFace, and ModelScope. Available on TokenHub at just 1.2 RMB per million tokens.

liked a model 3 months ago

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 1.6T • Updated Jun 22 • 1.6M • 5.37k

liked 2 models 4 months ago

#1 opened 4 months ago by hemono

liked a model 4 months ago

0xSero/Qwen3.6-28B

Text Generation • 28B • Updated May 30 • 200 • 34

reacted to SeaWolf-AI's post with 🔥 4 months ago

Post

4476

Darwin-TTS: 3% of an LLM's Brain Makes TTS Speak with Emotion — Zero Training

We blended 3% of Qwen3-1.7B (LLM) FFN weights into Qwen3-TTS-1.7B's talker module. The result: emotionally enhanced speech synthesis — with zero training, zero data, and zero GPU hours.

Try the Demo: FINAL-Bench/Darwin-TTS-1.7B-Cross

Model Weights: FINAL-Bench/Darwin-TTS-1.7B-Cross

Full Research Article: https://huggingface.co/blog/FINAL-Bench/darwin-tts

Qwen3-1.7B (LLM) and Qwen3-TTS-1.7B's talker share an identical architecture: the same hidden_size (2048), the same number of layers (28), and the same number of heads (16). This enabled pure 1:1 weight blending across 84 FFN tensors with a single lerp operation. At a 3% blend, emotion appears. At 5%, emotion intensifies. At 10%, the model breaks, producing 655-second outputs for a 3-second sentence, because the LLM's "keep generating" pattern overwhelms the TTS stop signal.

To our knowledge, this is the first training-free cross-modal weight transfer between an LLM and a TTS model. Prior work requires adapter training (SmolTolk, 2025), fine-tuning (CSLM, 2025), or massive end-to-end compute (GPT-4o). Darwin-TTS achieves cross-modal capability transfer in under 2 minutes on CPU.

The key insight: TTS models with LLM backbones already "think" in language. We're just restoring 3% of the original LLM's language understanding patterns — particularly those related to emotional semantics and prosody planning. The code is three lines: load the model, load the LLM FFN, call p.lerp_(llm_weight, 0.03).
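The blend described above is a linear interpolation, `start + weight * (end - start)`, applied to each matching FFN tensor. The following is a minimal dependency-free sketch of that operation on plain Python lists, not the authors' code. The parameter names, the `"mlp"` substring used to pick out FFN tensors, and the helper `blend_ffn` are all assumptions made for illustration. With PyTorch tensors, the same update would be the in-place `p.lerp_(llm_weight, 0.03)` call quoted in the post.

```python
def lerp(start, end, weight):
    """Element-wise linear interpolation: start + weight * (end - start)."""
    return [s + weight * (e - s) for s, e in zip(start, end)]


def blend_ffn(tts_state, llm_state, weight=0.03, marker="mlp"):
    """Blend LLM weights into TTS tensors whose name contains `marker`.

    `marker` is a hypothetical naming convention for FFN tensors; all other
    tensors are copied through unchanged.
    """
    blended = {}
    for name, tensor in tts_state.items():
        if marker in name and name in llm_state:
            blended[name] = lerp(tensor, llm_state[name], weight)
        else:
            blended[name] = list(tensor)
    return blended


# Toy stand-ins for two state dicts with identical shapes (made-up values).
tts = {"layers.0.mlp.up_proj": [1.0, 2.0], "layers.0.attn.q_proj": [5.0]}
llm = {"layers.0.mlp.up_proj": [2.0, 4.0], "layers.0.attn.q_proj": [9.0]}

out = blend_ffn(tts, llm, weight=0.03)
print(out)
```

Only the FFN entry moves 3% of the way toward the LLM values; the attention tensor is left as it was, which mirrors the post's claim that the transfer touches FFN weights alone.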

From the creators of the Darwin Evolutionary Merge Framework. Darwin LLM V7 achieved GPQA Diamond 86.9% (HF Benchmark #3) through CMA-ES optimized FFN crossbreeding. Darwin-TTS extends this principle from LLM-to-LLM merging into cross-modal LLM-to-TTS transfer. Apache 2.0.