Sdeerk

12 10

AI & ML interests

None yet

Recent Activity

upvoted an article 29 days ago

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

upvoted an article 4 months ago

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

upvoted an article 4 months ago

Ulysses Sequence Parallelism: Training with Million-Token Contexts

Organizations

None yet

upvoted an article 29 days ago

Article

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

ariG23498, sayakpaul, sergiopaniego, ror, pcuenq • May 29 • 153

upvoted 2 articles 4 months ago

Article

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

Jan 27 • 80

Article

Ulysses Sequence Parallelism: Training with Million-Token Contexts

kashif, stas • Mar 9 • 32

liked a Space 6 months ago

The Smol Training Playbook

📚

3.25k

The secrets to building world-class LLMs

upvoted an article 8 months ago

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 425

liked a Space 9 months ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.4k

Explore and download the FineWeb web‑scale text dataset

liked a model 9 months ago

PaddlePaddle/PaddleOCR-VL

Image-Text-to-Text • 1.0B • Updated Jun 27 • 48.1k • 1.64k

liked a model 11 months ago

baidu/ERNIE-4.5-21B-A3B-Thinking

Text Generation • 22B • Updated Nov 26, 2025 • 15k • 788

upvoted an article 11 months ago

Article

Vision Language Models (Better, faster, stronger)

merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 614

liked 2 datasets 12 months ago

Jofthomas/hermes-function-calling-thinking-V1

Viewer • Updated Feb 16, 2025 • 3.57k • 622 • 79

NousResearch/hermes-function-calling-v1

Viewer • Updated Jan 3 • 11.6k • 26.7k • 433

upvoted a paper about 1 year ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 320

liked a Space about 1 year ago

Awesome O1 R1

💻

[Keep updating] Collect everything about o1 and r1!

upvoted 2 articles about 1 year ago

Article

Mixture of Experts Explained

osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.16k

Article

Vision Language Models Explained

merve, edbeeching • Apr 11, 2024 • 542

updated a model about 1 year ago

baidu/ERNIE-4.5-21B-A3B-Base-Paddle

Text Generation • 22B • Updated Aug 20, 2025 • 41 • 16

liked a Space about 1 year ago

The Ultra-Scale Playbook

🌌

3.96k

The ultimate guide to training LLM on large GPU Clusters

liked a dataset about 1 year ago

openai/gsm8k

Benchmark • Updated Mar 23 • 17.6k • 950k • 1.47k

upvoted a collection about 1 year ago

ERNIE 4.5

Collection

collection of ERNIE 4.5 models. • 27 items • Updated Nov 11, 2025 • 190

updated a model about 1 year ago

baidu/ERNIE-4.5-21B-A3B-Paddle

Text Generation • 22B • Updated Sep 9, 2025 • 60 • 19
