32.3 TFLOPS
4
2
146
Tom K.
ToKrCZ
5 followers · 53 following
AI & ML interests
None yet
Recent Activity
liked a model about 11 hours ago: prefeitura-rio/Rio-3.5-Open-397B
reacted to eaddario's post with 🔥 about 1 month ago
Experimental global target bits-per-weight quantization of Qwen/Qwen3.6-27B and Qwen/Qwen3.6-35B-A3B.

Unlike standard llama.cpp quantizations that rely on fixed type heuristics (e.g., Q4_K_M), the Target BPW approach optimizes per-tensor precision where it matters the most, and produces high quality models that meet a precise global file size target.

Key Advantages:
- VRAM Maximization: Can generate high quality models sized exactly to fit hardware constraints (e.g., fitting the model into exactly 24GB VRAM).
- Data-Driven Precision: Quantization mix is determined by actual weight error sensitivity rather than hardcoded rules, often yielding better PPL/KLD size trade-offs.

Full benchmarks (PPL, KLD, ARC, GPQA, MMLU, etc.) and methodology in the models' cards.

https://huggingface.co/eaddario/Qwen3.6-27B-GGUF
https://huggingface.co/eaddario/Qwen3.6-35B-A3B-GGUF
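As a rough illustration of what a global size target implies, the average bits per weight follows directly from the size budget and the weight count. This is a minimal sketch and not the method used in the post: the function names are hypothetical, the 27e9 weight count is only the nominal figure suggested by the model name, GB is taken as 10^9 bytes, and file metadata, KV cache, and activation memory are all ignored.

```python
def target_bpw(size_budget_bytes: float, n_weights: float) -> float:
    """Average bits per weight that makes n_weights fit in size_budget_bytes."""
    if n_weights <= 0:
        raise ValueError("n_weights must be positive")
    return size_budget_bytes * 8 / n_weights


def file_size_bytes(bpw: float, n_weights: float) -> float:
    """Approximate size of the weights alone at a given average bits per weight."""
    return bpw * n_weights / 8


# Illustrative numbers only (assumed, not taken from the model cards):
# a nominal 27e9 weights and a 24 GB (24e9 byte) budget.
bpw = target_bpw(24e9, 27e9)
print(f"{bpw:.2f}")  # prints 7.11
```

In practice the usable budget is smaller than the full VRAM size, since context and runtime buffers also need memory, so a real target would sit below this upper bound.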
reacted to kelsend's post with 👀 about 2 months ago
The rebuilt Hunyuan HY3 Preview is here! I tested it on all the tricky scenarios where most LLMs usually face-plant, and guess what? It didn't flop.

295B total params, 21B active params, 256K context window. Built on MoE architecture, it delivers trillion-parameter-level performance with a much smaller footprint. Long-context capabilities get a massive upgrade.

Agent abilities stand out this time: tool calling, workflow orchestration, and autonomous planning are far more stable in real business scenarios. AI PPT generation in Tencent Docs is also significantly smoother and more reliable.

Real-world tests on WorkBuddy show first-token latency down 54%, success rate over 99.99%, and an Agent workflow that ran continuously for 495 steps. Its Coding Agent achieved top-tier results on both SWE-Bench Verified and Terminal-Bench 2.0.

Now open-sourced on GitHub, HuggingFace, and ModelScope. Available on TokenHub at just 1.2 RMB per million tokens.
Organizations
None yet
ToKrCZ's activity
New activity in barozp/Qwen-3.5-28B-A3B-REAP-GGUF, about 2 months ago
Would you please create an IQ3_XXS for my poor 16G VRAM?
#1 opened 3 months ago by hemono
New activity in mistralai/Mistral-Small-24B-Instruct-2501, over 1 year ago
Great model with real human needs
#4 opened over 1 year ago by venkycs