
webXOS

webxos

https://webxos.netlify.app

AI & ML interests

3D Simulations and Multimodal Synthetics

Organizations

None yet

Recent Activity

liked a model 1 day ago

yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF

Text Generation • 12B • Updated about 14 hours ago • 268k • 1.81k

liked a model 2 days ago

XiaomiMiMo/MiMo-V2.5-Pro-FP4-DFlash

Text Generation • 554B • Updated 11 days ago • 7.36k • 125

reacted to mmhamdy's post with 👍 3 days ago

Post

2570

What if you could train a model on just 10 images instead of 60,000 and still get close to the same performance?

Traditional machine learning requires thousands, even millions, of data points to achieve high accuracy. But what if we could "distill" the entire dataset into just a few synthetic samples?

This is what Dataset Distillation offers. Unlike traditional knowledge distillation, we keep the model fixed and distill the knowledge contained in a massive training set into a tiny set of synthetic distilled images.

The goal is to train a model on this ultra-small set and achieve performance that almost matches what the same model would get when trained on the massive original dataset.

For example, training on only 10 distilled MNIST images (this is equivalent to a single image per class) yields 94% accuracy, compared to 99% when training on the full 60,000 images.
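The idea can be sketched on a toy problem far smaller than MNIST. The following is a minimal illustration, not the method behind the numbers quoted in the post: it assumes a scalar linear model `y = w * x`, a fixed initial weight, and a single unrolled inner gradient step, and it distills a small regression set into one synthetic point. The function names `distill` and `train_on_point` and all hyperparameters are hypothetical choices made for this sketch.

```python
# Toy dataset distillation: compress a regression set into ONE synthetic point.
# Assumptions (not from the post): scalar linear model y = w * x, fixed initial
# weight w0, and a single unrolled inner gradient step.

def train_on_point(x_syn, y_syn, w0=0.0, inner_lr=0.1):
    """One gradient step on 0.5 * (w * x_syn - y_syn) ** 2, starting from w0."""
    return w0 - inner_lr * (w0 * x_syn - y_syn) * x_syn


def distill(xs_real, ys_real, w0=0.0, inner_lr=0.1, outer_lr=0.05, steps=500):
    """Learn one synthetic point (x_syn, y_syn) by gradient descent."""
    x_syn, y_syn = 1.0, 0.5  # arbitrary starting guess for the synthetic point
    n = len(xs_real)
    for _ in range(steps):
        # Inner step: train the model on the synthetic point only.
        w1 = train_on_point(x_syn, y_syn, w0, inner_lr)
        # Gradient of the mean squared loss on the REAL data with respect to w1.
        g_w1 = sum((w1 * x - y) * x for x, y in zip(xs_real, ys_real)) / n
        # Chain rule through the inner step to reach the synthetic point.
        dw1_dx = -inner_lr * (2.0 * w0 * x_syn - y_syn)
        dw1_dy = inner_lr * x_syn
        x_syn -= outer_lr * g_w1 * dw1_dx
        y_syn -= outer_lr * g_w1 * dw1_dy
    return x_syn, y_syn


# Real data follows y = 3x, so the least-squares weight is 3.
xs_real = [1.0, 2.0, 3.0, 4.0]
ys_real = [3.0 * x for x in xs_real]

x_syn, y_syn = distill(xs_real, ys_real)
w_from_distilled = train_on_point(x_syn, y_syn)
print(x_syn, y_syn, w_from_distilled)
```

The learned synthetic point does not need to lie on the line `y = 3x`. It only needs to push the model to a good weight in one step, which mirrors the observation that distilled samples are tuned for training rather than for resembling the data.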

Interestingly, these distilled images look very different from natural images (as you can see in the image below) because they are optimized for model training rather than for matching the real data distribution.

But that's not all.

Most importantly, this same method opens the door to a potent form of data poisoning. Because distilled images are specifically optimized for rapid learning, an attacker can create a tiny set of adversarial distilled images to cause a well-trained model to forget or misclassify a specific category.

What I find fascinating about dataset distillation is this: it mimics human-like learning by letting a model grasp a concept from a single example, but it does so using alien synthetic images that mean absolutely nothing to a human eye!

What about you? What are your thoughts on it?

2 replies

reacted to danielhanchen's post with 🔥 3 days ago

Post

3912

Google's new DiffusionGemma can now run at 2000+ tokens/sec! ⚡

We made local DiffusionGemma inference 1.8× faster.
Run it on 18GB RAM via Unsloth Studio.

GitHub: https://github.com/unslothai/unsloth
Guide: https://unsloth.ai/docs/models/diffusiongemma

4 replies

reacted to kanaria007's post with 👀 3 days ago

Post

178

✅ Article highlight: *Institutional Memory & Forgetting for Learning Worlds* (art-60-172, v0.1)

TL;DR:
This article argues that if a living world becomes training data, memory becomes infrastructure.

Logs, dialogue, labels, releases, feature stores, and model weights can turn a world into something that cannot honestly forget. Article 172 turns deletion, redaction, exclusion, forgetting requests, SANITIZED/PUBLIC releases, and unlearning claims into receipted governance lifecycles.

Read:
kanaria007/agi-structural-intelligence-protocols

Why it matters:
• prevents learning worlds from becoming “unforgettable worlds”
• separates deletion, redaction, and future extraction exclusion
• makes right-to-be-forgotten requests caseable and appealable
• preserves canon facts without preserving every memory surface
• blocks public promises like “guaranteed deletion everywhere”

What’s inside:
• retention policy contracts for what may be kept, copied, trained on, or released
• corpus segment manifests and propagation indexes for known controlled copies
• forgetting request, adjudication, remedy, deletion, redaction, and exclusion receipts
• tombstone manifests and semantic preservation receipts for canon-safe forgetting
• use eligibility receipts for deciding whether a segment may train a future run
• release contracts, redaction maps, and irreversibility disclosures for SANITIZED/PUBLIC releases
• bounded unlearning contracts and post-unlearning verification receipts
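As a rough illustration of the "use eligibility receipt" item in the list above, the sketch below decides whether a corpus segment may train a future run and records the reasons. Every name and field here (`SegmentRecord`, `EligibilityReceipt`, `decide_training_eligibility`, and their attributes) is hypothetical and is not taken from the article or its repository, which may define these artifacts quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentRecord:
    """Hypothetical view of one corpus segment and its governance state."""
    segment_id: str
    retention_allows_training: bool
    tombstoned: bool = False
    exclusion_receipts: tuple = ()  # IDs of extraction exclusion receipts


@dataclass(frozen=True)
class EligibilityReceipt:
    """Hypothetical receipt recording the decision and why it was made."""
    segment_id: str
    eligible: bool
    reasons: tuple


def decide_training_eligibility(segment):
    """Return a receipt; the segment is eligible only if nothing blocks it."""
    reasons = []
    if not segment.retention_allows_training:
        reasons.append("retention policy does not permit training")
    if segment.tombstoned:
        reasons.append("segment is tombstoned")
    if segment.exclusion_receipts:
        reasons.append(
            "extraction exclusion on file: " + ", ".join(segment.exclusion_receipts)
        )
    return EligibilityReceipt(segment.segment_id, not reasons, tuple(reasons))


clean = SegmentRecord("seg-001", retention_allows_training=True)
blocked = SegmentRecord(
    "seg-002",
    retention_allows_training=True,
    tombstoned=True,
    exclusion_receipts=("excl-17",),
)

clean_receipt = decide_training_eligibility(clean)
blocked_receipt = decide_training_eligibility(blocked)
print(clean_receipt)
print(blocked_receipt)
```

Returning a receipt rather than a bare boolean keeps the reason for each decision attached to it, which is the property the post emphasizes.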

Key idea:
Do not say:

*“we deleted it, so it is forgotten.”*

Say:

*“this subject was handled under this retention policy, propagation index, adjudication path, remedy contract, tombstone, semantic preservation receipt, extraction exclusion receipt, and bounded public claim.”*

Forgetting is not a button.

It is governance with receipts.

reacted to KingNish's post with 👀 3 days ago

Post

4012

We trained an open-source, Mythos-like cybersecurity LLM for the Build Small Hackathon: meet OpenMythos.

Trained in two stages: SFT on ~1.84K filtered ArXiv cs.CR papers + real CVE data, then RLVR using GitHub repos paired with past vulnerabilities, with a verifier model checking outputs against ground truth.

Trained on: H100s from Modal

The RLVR stage made the biggest difference: responses got more precise and less prone to confusing similar vulnerability classes.
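To make the RLVR idea concrete, here is a minimal sketch of a verifiable reward. It is an assumption-laden stand-in, not the project's verifier: the post describes a verifier model, whereas this example swaps in a simple rule-based check that compares CWE identifiers found in a response against a ground-truth label. The function names and the 1.0 / 0.5 / 0.0 reward scale are hypothetical.

```python
import re

# Matches identifiers such as "CWE-89"; the rule-based check is a stand-in
# for the verifier model mentioned in the post.
CWE_PATTERN = re.compile(r"CWE-(\d+)", re.IGNORECASE)


def extract_cwe_ids(text):
    """Return the set of CWE numbers mentioned in a model response."""
    return {int(match) for match in CWE_PATTERN.findall(text)}


def verifiable_reward(response, ground_truth_cwe):
    """Score a response against the ground-truth CWE number.

    1.0: the response names exactly the correct class.
    0.5: the correct class appears alongside other classes.
    0.0: the correct class is missing.
    """
    ids = extract_cwe_ids(response)
    if ids == {ground_truth_cwe}:
        return 1.0
    if ground_truth_cwe in ids:
        return 0.5
    return 0.0


precise = verifiable_reward("This is CWE-89 (SQL injection).", 89)
hedged = verifiable_reward("Could be CWE-79 or CWE-89.", 89)
wrong = verifiable_reward("Looks like CWE-79 (cross-site scripting).", 89)
print(precise, hedged, wrong)
```

Giving partial credit to answers that list several classes is one way to discourage mixing up similar vulnerability classes, which is the improvement the post attributes to the RLVR stage.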

Everything is open:
🤖 Demo → build-small-hackathon/OpenMythos
🧠 Model → build-small-hackathon/OpenMythos
📦 CVE Dataset → build-small-hackathon/CVE_Vulnerailities_Detailed
📄 ArXiv Dataset → himanshu17HF/ArvixImport-Filtered-Final

Try it out and let us know where it breaks 🙏