Lapo Luchini

lapo

https://lapo.it/

AI & ML interests

None yet

Recent Activity

reacted to matteospanio's post with 🚀 16 days ago

🎶 Released mule-torch, an unofficial PyTorch port of MULE (SF-NFNet-F0), SiriusXM/Pandora's music-audio embedding model (McCallum et al., ISMIR 2022). No retraining: I re-implemented the architecture in pure PyTorch and transferred the original TensorFlow weights, then checked it layer by layer against the genuine TF pipeline.

✅ End-to-end clip-embedding cosine 0.9999999 vs the original
✅ ONNX backbone parity < 1e-6
✅ 62.35M params (paper: ~62.4M)
✅ Batched, GPU-native, ONNX-exportable, none of which the original `Analysis` pipeline does

```shell
pip install mule-torch
```

```python
from mule_torch import MuleModel
emb = MuleModel.from_pretrained()(waveform)   # (B, T)@16kHz -> (B, 1728)
```

🤗 Weights: https://huggingface.co/matteospanio/mule
💻 Code: https://github.com/matteospanio/mule-torch
📦 PyPI: https://pypi.org/project/mule-torch/

The fun bug: parity was perfect through every conv but the block output was anti-correlated (cos = −1). Cause: the learnable skip-init gains couldn't be mapped by layer name (Keras scrambles the order), so they had to be recovered from the graph.

⚠️ Unofficial, community port: not affiliated with or endorsed by the original authors. All credit to them; please cite the paper. Weights inherit CC-BY-NC-4.0.
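The layer-by-layer check described in the post can be illustrated with a small sketch. This is not the mule-torch test code: the helper names `cosine` and `first_mismatch`, the threshold, and the toy activations are all assumptions made for illustration. It only shows how comparing per-layer outputs by cosine similarity localizes the first layer that diverges, and that a sign-flipped scale is one simple way to end up with a cosine of −1.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def first_mismatch(reference, candidate, threshold=0.999):
    """Return (layer_name, cosine) for the first layer below threshold, else None.

    `reference` and `candidate` map layer names to flattened activations,
    in forward order (hypothetical layout, chosen for this sketch).
    """
    for name, ref_out in reference.items():
        cos = cosine(ref_out, candidate[name])
        if cos < threshold:
            return name, cos
    return None


# Toy activations: the conv output matches, the block output has a flipped scale.
conv_out = [0.3, -1.2, 0.7, 2.0]
reference = {"conv": conv_out, "block": [0.5 * x for x in conv_out]}
candidate = {"conv": list(conv_out), "block": [-0.5 * x for x in conv_out]}

print(first_mismatch(reference, candidate))
```

On the toy data the check passes for `conv` and stops at `block`, which mirrors the symptom reported in the post (perfect conv parity, anti-correlated block output) without claiming to reproduce its actual cause.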

Organizations

None yet

upvoted a paper 1 day ago

Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performance

Paper • 2606.19195 • Published 7 days ago • 127

reacted to HannesVonEssen's post with 👍 about 2 months ago

Post

246

📣 I made a visualizer for Hugging Face models: https://hfviewer.com

✨ Simply paste a Hugging Face URL to get an interactive visualization of the architecture!

🔗 The recent Qwen3.6-27B model as an example: https://hfviewer.com/Qwen/Qwen3.6-27B

Feel free to try it out and give me feedback on how it can be improved! ❤️

1 reply

reacted to OzTianlu's post with 👍 3 months ago

Post

5414

Arcade-3B — SmolReasoner
NoesisLab/Arcade-3B
Arcade-3B is a 3B instruction-following and reasoning model built on SmolLM3-3B. It is the public release from the ARCADE project at NoesisLab, which investigates the State–Constraint Orthogonality Hypothesis: standard Transformer hidden states conflate factual content and reasoning structure in the same subspace, and explicitly decoupling them improves generalization.

5 replies

liked 2 Spaces 3 months ago

Arena Leaderboard

🏆

4.93k

View the LMArena leaderboard in full‑screen

Leaderboard of Leaderboards

🔥

Real-time rankings of the most trusted leaderboard

upvoted an article 3 months ago

Article

🏟️ Smol AI WorldCup: A 5-Axis Benchmark That Reveals What Small Language Models Can Really Do

FINAL-Bench • Mar 10 • 38

reacted to SeaWolf-AI's post with 👍 4 months ago

Post

6135

ALL Bench Leaderboard — Structural Problems in AI Benchmarking and the Case for Unified Evaluation

FINAL-Bench/all-bench-leaderboard

The AI benchmark ecosystem has three structural problems. Major benchmarks like MMLU have surpassed 90%, losing discriminative power. Most leaderboards publish unverified self-reported scores — our cross-verification found Claude Opus 4.6's ARC-AGI-2 listed as 37.6% (actual: 68.8%), Gemini 3.1 Pro as 88.1% (actual: 77.1%). OpenAI's own audit confirmed 59.4% of SWE-bench Verified tasks are defective, yet it remains widely used.
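To make the size of the reported discrepancies concrete, here is a minimal arithmetic sketch using only the four figures quoted above. The dictionary layout and the `discrepancy` helper are illustrative assumptions, not part of the ALL Bench tooling.

```python
# (listed score, cross-verified score) in percent, as quoted in the post.
listed_vs_actual = {
    "Claude Opus 4.6": (37.6, 68.8),
    "Gemini 3.1 Pro": (88.1, 77.1),
}


def discrepancy(listed, actual):
    """Signed gap in percentage points: positive means the listing understated."""
    return round(actual - listed, 1)


gaps = {name: discrepancy(*scores) for name, scores in listed_vs_actual.items()}

for name, gap in gaps.items():
    direction = "understated" if gap > 0 else "overstated"
    print(f"{name}: listed score {direction} by {abs(gap)} points")
```

The two errors point in opposite directions, so the self-reported listings cannot be corrected with a single offset.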

ALL Bench addresses this by comparing 91 models across 6 modalities (LLM · VLM · Agent · Image · Video · Music) with 3-tier confidence badges (✓✓ cross-verified · ✓ single-source · ~ self-reported). Composite scoring uses a 5-Axis Framework and replaces SWE-Verified with contamination-resistant LiveCodeBench.

Key finding: metacognition is the largest blind spot. FINAL Bench shows Error Recovery explains 94.8% of self-correction variance, yet only 9 of 42 models are even measured. The 9.2-point spread (Kimi K2.5: 68.71 → rank 9: 59.5) is 3× the GPQA top-model spread, suggesting metacognition may be the single biggest differentiator among frontier models today.

VLM cross-verification revealed rank reversals — Claude Opus 4.6 leads MMMU-Pro (85.1%) while Gemini 3 Flash leads MMMU (87.6%), producing contradictory rankings between the two benchmarks.

📊 Article: https://huggingface.co/blog/FINAL-Bench/all-bench
📦 Dataset: FINAL-Bench/ALL-Bench-Leaderboard
⚡ GitHub: https://github.com/final-bench/ALL-Bench-Leaderboard
🏆 Leaderboard: FINAL-Bench/all-bench-leaderboard
🧬 FINAL Bench: FINAL-Bench/Metacognitive

liked a model 4 months ago

NoesisLab/Collins-Embedding-3M

reacted to OzTianlu's post with 👍 4 months ago

Post

1975

We deleted the Embedding Layer: introducing our Collins-Embedding-3M
NoesisLab/Collins-Embedding-3M
Most "small" models are just giant vocab tables in a trench coat. Collins-3M changes that. By using 2-Universal Hashing and Chernoff-bound noise suppression, we’ve collapsed the embedding space into a fixed O(1) hash-map.
* STSB: 0.7114 (Beating many 100M+ models)
* Size: 3M (Edge-ready, IoT-ready)
* Tech: Randomized Sign-Hashing + RoPE positional injection.
Built by NoesisLab
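
For readers unfamiliar with the terms, the following is a generic sketch of sign hashing with a 2-universal hash family (the classic "hashing trick"), not the Collins-Embedding-3M implementation. The function names, the dimension of 16, and the prime are assumptions; RoPE positional injection and the Chernoff-bound noise suppression mentioned in the post are not modeled here.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1; token ids are assumed smaller


def make_hash(rng, buckets):
    """Draw h(x) = ((a*x + b) mod p) mod buckets from a 2-universal family."""
    a = rng.randrange(1, PRIME)
    b = rng.randrange(0, PRIME)
    return lambda x: ((a * x + b) % PRIME) % buckets


def embed(token_ids, dim=16, seed=0):
    """Map token ids to a fixed-size vector without a vocabulary table."""
    rng = random.Random(seed)
    index_hash = make_hash(rng, dim)  # which coordinate a token updates
    sign_hash = make_hash(rng, 2)     # whether it adds +1 or -1
    vec = [0.0] * dim
    for token in token_ids:
        sign = 1.0 if sign_hash(token) == 1 else -1.0
        vec[index_hash(token)] += sign
    return vec


print(embed([101, 2023, 2003, 102]))
```

The only stored state is the two pairs of hash coefficients, so memory does not grow with vocabulary size; the random signs make collisions between unrelated tokens cancel in expectation rather than accumulate.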