Hugging Face
Ruslan (uzvisa)
8 followers · 85 following
AI & ML interests
None yet
Recent Activity
New activity in Qwen/Qwen3.6-35B-A3B, about 5 hours ago:
"how to enable non-thinking mode of this model in llama.cpp?"
Reacted with 👍 to eaddario's post, 3 days ago:
Experimental global target bits-per-weight quantization of Qwen/Qwen3.6-27B and Qwen/Qwen3.6-35B-A3B. Unlike standard llama.cpp quantizations that rely on fixed type heuristics (e.g., Q4_K_M), the Target BPW approach optimizes per-tensor precision where it matters the most, and produces high-quality models that meet a precise global file-size target.

Key advantages:
- VRAM maximization: can generate high-quality models sized exactly to fit hardware constraints (e.g., fitting the model into exactly 24 GB of VRAM).
- Data-driven precision: the quantization mix is determined by actual weight error sensitivity rather than hardcoded rules, often yielding better PPL/KLD size trade-offs.

Full benchmarks (PPL, KLD, ARC, GPQA, MMLU, etc.) and methodology in the models' cards.
https://huggingface.co/eaddario/Qwen3.6-27B-GGUF
https://huggingface.co/eaddario/Qwen3.6-35B-A3B-GGUF
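The idea described in the post (spend a fixed global bit budget where weight-error sensitivity is highest, so the file lands exactly on a target size) can be sketched as a greedy per-tensor bit allocator. The sketch below is a toy illustration under an assumed error model (each extra bit roughly halves quantization error) and made-up sensitivity scores; it is not eaddario's actual implementation or llama.cpp code.

```python
def allocate_bpw(tensors, target_bpw, levels=(2, 3, 4, 5, 6, 8)):
    """Greedy global target-BPW allocation sketch.

    tensors: dict name -> (n_params, sensitivity score)
    Returns: dict name -> bits per weight chosen for that tensor.
    """
    total_params = sum(n for n, _ in tensors.values())
    budget = target_bpw * total_params            # global bit budget for the file
    bits = {name: levels[0] for name in tensors}  # start every tensor at the floor
    used = levels[0] * total_params

    while True:
        best, best_gain = None, 0.0
        for name, (n_params, sens) in tensors.items():
            i = levels.index(bits[name])
            if i + 1 == len(levels):
                continue  # already at maximum precision
            extra = (levels[i + 1] - levels[i]) * n_params
            if used + extra > budget:
                continue  # upgrade would bust the global size target
            # Assumed error model: each bit halves error, so the benefit of an
            # upgrade is sensitivity-weighted error reduction per extra bit spent.
            gain = sens * (2.0 ** -levels[i] - 2.0 ** -levels[i + 1]) / extra
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            return bits  # budget exhausted (or all tensors maxed out)
        i = levels.index(bits[best])
        used += (levels[i + 1] - levels[i]) * tensors[best][0]
        bits[best] = levels[i + 1]


# Hypothetical usage: a small, sensitive tensor vs. a large, tolerant one.
plan = allocate_bpw(
    {"attn_q": (1_000_000, 5.0), "ffn_up": (4_000_000, 1.0)},
    target_bpw=4.0,
)
# The sensitive tensor ends up with more bits, the bulky tolerant one with
# fewer, while the weighted average stays at or under the 4.0 BPW target.
```

The greedy loop mirrors the trade-off the post describes: precision flows to tensors where it measurably reduces output error, rather than being assigned by a fixed per-layer rule.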
Reacted with 🔥 to eaddario's post (same post as above), 3 days ago.
Organizations
None yet
uzvisa's activity
New activity in Qwen/Qwen3.6-35B-A3B, about 5 hours ago:
"how to enable non-thinking mode of this model in llama.cpp?" · 1 · #54 opened about 10 hours ago by daijava
New activity in ai-sage/GigaChat3-10B-A1.8B, 3 months ago:
"Failed to parse Jinja template: Unexpected token: UnaryOperator" · 1 · #9 opened 3 months ago by uzvisa
New activity in tiiuae/Falcon-H1-7B-Instruct-GGUF, 3 months ago:
"you forget a model 12B...16B" · 2 · #1 opened 10 months ago by kalle07
New activity in ZeroXClem/Qwen3-4B-Sky-High-Hermes, 3 months ago:
"Very good model!" ❤️ 1 · 2 · #1 opened 3 months ago by uzvisa
New activity in ZeroXClem/Gemma3-4B-Arceus-Servant, 3 months ago:
"Can you make a new model?" 🔥 1 · 2 · #1 opened 3 months ago by uzvisa
New activity in ByteDance-Seed/Stable-DiffCoder-8B-Instruct, 3 months ago:
"Issues running your model in LM Studio" 👍 1 · 4 · #2 opened 3 months ago by uzvisa
New activity in ExaltedSlayer/gemma-3-12b-it-qat-mlx-mxfp4, 5 months ago:
"its work!" · 1 · #1 opened 5 months ago by uzvisa
New activity in ExaltedSlayer/gemma-3-27b-it-qat-mlx-mxfp4, 5 months ago:
"Thank you!" · 4 · #1 opened 6 months ago by uzvisa
New activity in deepcogito/README, 8 months ago:
"I'm excited to hear any updates from the DeepCogito Team!" · 3 · #1 opened about 1 year ago by uzvisa
New activity in ByteDance-Seed/Seed-Coder-8B-Instruct, 12 months ago:
"Excellent model!" · #2 opened 12 months ago by uzvisa
New activity in bartowski/ServiceNow-AI_Apriel-Nemotron-15b-Thinker-GGUF, 12 months ago:
"Improved Jinja Chat Template for Apriel-Nemotron-15b-Thinker GGUF (e.g., for LM Studio)" 👍 1 · 10 · #1 opened 12 months ago by debeast6
New activity in kiriyk/seo_qwen2.5_8epochs-Q8_0-GGUF, about 1 year ago:
"What data is in this LLM" · #1 opened about 1 year ago by uzvisa
New activity in RichardErkhov/micks99_-_gemma-2b-instruct-ft-Data-Analytics-Digital-Marketing-Project-Management-QAv2-gguf, about 1 year ago:
"What is this model for?" · 8 · #1 opened about 1 year ago by uzvisa