- Article: Efficient LLM Pretraining: Packed Sequences and Masked Attention (Oct 7, 2024)
- Article: You could have designed state of the art positional encoding (Nov 25, 2024)
- Space: The Smol Training Playbook 📚 — The secrets to building world-class LLMs
- Space: The Ultra-Scale Playbook 🌌 — The ultimate guide to training LLMs on large GPU clusters
- Model: deepseek-ai/DeepSeek-V3-0324 — Text Generation, 685B parameters (updated Mar 27, 2025)
- Dataset: nvidia/Llama-Nemotron-Post-Training-Dataset — 3.91M rows (updated May 8, 2025)