web

dim

dmitrymailk

AI & ML interests

dimweb, LM/LLM pronouns

Recent Activity

published a model about 1 month ago

dim/sid-klein-lora-gan-patch-lpips-sid-anchor-20x-v4-step-0004500

updated a model about 1 month ago

dim/sid-klein-lora-gan-patch-lpips-sid-anchor-20x-v4-step-0004500

liked a Space about 2 months ago

carlofkl/DreamLite

upvoted a paper 3 months ago

GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent

Paper • 2603.13875 • Published Mar 14 • 36

upvoted an article 5 months ago

Training Design for Text-to-Image Models: Lessons from Ablations

Article • Photoroom • Feb 3 • 74

upvoted a paper 6 months ago

SIMA 2: A Generalist Embodied Agent for Virtual Worlds

Paper • 2512.04797 • Published Dec 4, 2025 • 25

upvoted 2 papers 7 months ago

Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models

Paper • 2512.00590 • Published Nov 29, 2025 • 52

Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

Paper • 2511.14993 • Published Nov 19, 2025 • 234

upvoted a paper 8 months ago

When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA

Paper • 2510.04849 • Published Oct 6, 2025 • 117

upvoted a paper 9 months ago

Optimal Scaling Needs Optimal Norm

Paper • 2510.03871 • Published Oct 4, 2025 • 30

upvoted a paper 10 months ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 109

upvoted an article 11 months ago

Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training

Article • smohammadi, siro1, winglian, marcsun13, djsaunde • Aug 8, 2025 • 99

upvoted 2 articles 12 months ago

Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders

Article • thomwolf, matthieu-lapeyre • Jul 9, 2025 • 803

SmolLM3: smol, multilingual, long-context reasoner

Article • eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 780

upvoted 3 papers about 1 year ago

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published May 28, 2025 • 47

TULIP: Towards Unified Language-Image Pretraining

Paper • 2503.15485 • Published Mar 19, 2025 • 49

PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters

Paper • 2504.08791 • Published Apr 7, 2025 • 141

upvoted an article over 1 year ago

FastRTC: The Real-Time Communication Library for Python

Article • freddyaboulton, abidlabs • Feb 25, 2025 • 172

upvoted a paper over 1 year ago

Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity

Paper • 2502.13063 • Published Feb 18, 2025 • 74

upvoted an article over 1 year ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • eliebak, lvwerra, lewtun • Jan 28, 2025 • 889

upvoted 3 papers almost 2 years ago

Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models

Paper • 2407.12327 • Published Jul 17, 2024 • 79

Associative Recurrent Memory Transformer

Paper • 2407.04841 • Published Jul 5, 2024 • 35

Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task

Paper • 2406.14213 • Published Jun 20, 2024 • 21
