Hugging Face
83.7 TFLOPS
David Golchinfar
PRO
DavidGF
65 followers · 47 following
https://vago-solutions.ai
DavidGFar
dgolchin
AI & ML interests
Fine-tuning LLMs; improving the German language understanding and text generation of LLMs
Recent Activity
liked a model 4 days ago
DataScience-UIBK/Reason-mxbai-colbert-v0-32m
reacted to anakin87's post with ❤️ 4 days ago
A small model that struggled against a random opponent now beats GPT-5-mini at tic-tac-toe.

I took https://huggingface.co/LiquidAI/LFM2-2.6B and trained it through play. 🧑‍🍳

Here's how:
1️⃣ Build a solid RL env with Verifiers (Prime Intellect)
2️⃣ Generate synthetic data: <200 games sampled from GPT-5-mini playing in the env
3️⃣ SFT warm-up to teach format
4️⃣ Group-based RL (CISPO) against opponents making 20-70% random moves
5️⃣ RL again with stronger opponents (0-25% random moves) + 1.25 temperature to push exploration and shake off suboptimal strategies

Done! Beats GPT-5-mini 🏆

---

🎮 Play against the model: https://huggingface.co/spaces/anakin87/LFM2-2.6B-mr-tictactoe
🤗 Model: https://huggingface.co/anakin87/LFM2-2.6B-mr-tictactoe
📚 Walkthrough/course: https://github.com/anakin87/llm-rl-environments-lil-course
🤗 Dataset and checkpoints: https://huggingface.co/collections/anakin87/lfm2-26b-mr-tic-tac-toe
liked a Space 4 days ago
anakin87/LFM2-2.6B-mr-tictactoe
Organizations
DavidGF's models (1)
DavidGF/SauerkrautTTS-Preview-0.1-Q8_0-GGUF
3B • Updated Apr 2, 2025 • 16