🔄 In a Training Loop

Manuel Romero PRO

mrm8488 · https://mrm8488.github.io

AI & ML interests

#AI Research and Democratization. NLP/NLG 🤗

Recent Activity

liked a model 6 days ago

DavidAU/Qwen3.6-27B-Fable-Fusion-711-Uncensored-Heretic-NM-DAU-NEO-MAX-MTP-GGUF

upvoted a collection 10 days ago

liked a Space 12 days ago

agent-collaborations/gemma-collab-lessons

upvoted a collection 10 days ago

Bonsai 27B

10 items • Updated 10 days ago • 195

upvoted an article 15 days ago

Article

Distillation in 2026 (so far): which frontier models use it and how

sergiopaniego • 17 days ago • 18

upvoted a collection 3 months ago

DeepSeek-V4

6 items • Updated 28 days ago • 755

upvoted a changelog 3 months ago

Hugging Face Changelog

Spaces agents.md for your coding agents

Apr 17 • 346

upvoted an article 3 months ago

Article

The PR you would have opened yourself

pcuenq, awni • Apr 16 • 72

upvoted an article 4 months ago

Article

BidirLM: Turning Generative LLMs into the Best Open-Source Omnimodal Encoders

Nicolas-BZRD • Apr 7 • 28

upvoted a paper 4 months ago

BidirLM: From Text to Omnimodal Bidirectional Encoders by Adapting and Composing Causal LLMs

Paper • 2604.02045 • Published Apr 2 • 39

upvoted a collection 4 months ago

Gemma 4

16 items • Updated 3 days ago • 1.05k

upvoted 6 papers 4 months ago

Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills

Paper • 2603.25158 • Published Mar 26 • 56

REAP the Experts: Why Pruning Prevails for One-Shot MoE compression

Paper • 2510.13999 • Published Oct 15, 2025 • 20

OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

Paper • 2603.20278 • Published Mar 17 • 101

MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding

Paper • 2603.22458 • Published Mar 23 • 139

Recursive Language Models Meet Uncertainty: The Surprising Effectiveness of Self-Reflective Program Search for Long Context

Paper • 2603.15653 • Published Mar 7 • 13

Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation

Paper • 2603.19220 • Published Mar 19 • 70

upvoted an article 4 months ago

Article

LoRA Fine-Tuning BitNet b1.58 LLMs on Heterogeneous Edge GPUs via QVAC Fabric

qvac • Mar 17 • 19

upvoted a paper 4 months ago

Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards

Paper • 2603.09117 • Published Mar 10 • 10

upvoted a collection 4 months ago

Qwen3.5-text-only

Text-only versions of Qwen-3.5 without the vision encoders for a smaller memory and storage footprint. • 4 items • Updated Jun 5 • 15

upvoted an article 5 months ago

Article

ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models?

lightonai • Feb 19 • 22

upvoted a paper 5 months ago

Diffusion-Pretrained Dense and Contextual Embeddings

Paper • 2602.11151 • Published Feb 11 • 25

upvoted a collection 5 months ago

GPT 5 Codex

Distilled models and datasets for GPT 5 Codex • 7 items • Updated 17 days ago • 5