I hope we can come to agree that anything operating with a degree of confidence is attempting an art rather than a pure science, and that this will highlight how much of our work was art anyway.
It shouldn't have to explain to you why it's better. "Better" should be a rather floaty standard, as in Stable Diffusion. In general, better means better across all metrics; if you need specific metrics, it should know to keep every improvement that fits the more general standards. I don't think it does. If it can know it's better, it shouldn't have to explain that it's done so, whether you prefer black boxes or not. And this is how it goes: electronics engineers feed models their goals, and the models produce circuits in shapes the engineers can't explain, but that work. This is happening now. Much of our science is of this type: we simply accept that things fit without fully knowing why.
What happens when you tell something like Stable Diffusion to build a human hand without any guidance on what actually composes a human hand, just a bunch of biomimetic parts and materials science to choose from? When does it accidentally invent a better hand?
Article: https://robonine.com/increasing-the-structural-rigidity-of-the-manipulator/
If I spend enough time, I should be able to make a bot from scratch that's 20 times smaller, using the same code and structure. Why am I saying this? Am I going to do it? No, but everyone should know that most of what people do is throw a ton of information at bots to work out for themselves algorithmically. We need more experimental bots, particularly to skip a few steps toward getting the same answer. So I'm always glad to see work of this sort, whether it's trying different datasets with different LLMs, or whatever.
Repo: raincandy-u/Rain-100M
Data: HuggingFaceFW/fineweb-edu, ~3B tokens, English only
Tokenizer: custom 16k BPE, context length 4096
Architecture: 12 Transformer layers, hidden size 768, 12 heads, MLP 2048, SiLU, bf16
Rain-100M is a raw base model (not instruction-tuned or safety-aligned), aimed at small-scale research, debugging training pipelines, and CPU/edge experiments. If you run evaluations, finetunes, or visualizations with it, I would be very interested in your results!
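If you want to poke at it, here's a minimal loading sketch, assuming the checkpoint follows the standard transformers causal-LM layout (the prompt is illustrative):

```python
# Minimal sketch: CPU inference with Rain-100M.
# Assumes a standard transformers causal-LM checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "raincandy-u/Rain-100M"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

inputs = tokenizer("The water cycle begins when", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=50, do_sample=True)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```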
The visual effects of this model are simply beyond imagination; it's every bit as good as NanoBanana, no compromise at all.
I fine-tuned my micro-scene prompts by adding text overlays and background effects, and its adaptability is truly breathtaking. With just one prompt, you can generate scene posters for any movie or novel.
Every detail, from scene design to text style and atmospheric effects, perfectly aligns with the tone of the original material.
No forced elements, just seamless, film-grade visual effects that exactly match what I envisioned.
Repo: https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&modelId=Hunyuan-Image-3.0-Instruct
- Guardpoint is our new medical reasoning model, trained on medical knowledge, management, and diagnosis tasks from DeepSeek-V3.2-Speciale!
- Structured medical reasoning responses are efficient and informative, cutting token costs for faster inference!
- Wide-ranging knowledge base: trained on a wide variety of medical disciplines, patient types, and query structures!
- High quality medical responses emphasize performance, brevity, specificity, statistical rationality, and openness.
Get it now:
Guardpoint for gpt-oss-120b: ValiantLabs/gpt-oss-120b-Guardpoint
Guardpoint for gpt-oss-20b: ValiantLabs/gpt-oss-20b-Guardpoint
Powered by our new structured medical reasoning dataset: sequelbox/Superpotion-DeepSeek-V3.2-Speciale
Guardpoint is also available for Qwen 3:
Guardpoint for Qwen 3 32B: ValiantLabs/Qwen3-32B-Guardpoint
Guardpoint for Qwen 3 14B: ValiantLabs/Qwen3-14B-Guardpoint
We've been working hard on Guardpoint; we're really excited to share it with everyone! It's also our best finetune so far for gpt-oss. Try it out and see what you think!
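If you want a quick start, here's a minimal sketch for querying the 20b variant with transformers; the chat-template usage is the standard pattern and the prompt is just an example, so adapt it to your inference stack:

```python
# Minimal sketch: querying Guardpoint via transformers. Assumes the
# standard chat-template interface; the prompt is illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "ValiantLabs/gpt-oss-20b-Guardpoint"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

messages = [{"role": "user",
             "content": "Outline a differential diagnosis for acute chest pain."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```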
We'll be bringing Guardpoint, Shining Valiant, and Esper to more models soon, along with further experimental releases. We're planning to do a lot with DeepSeek's upcoming release; it should unlock a lot of new possibilities for specialist and experimental models!
Get our experimental models: https://huggingface.co/collections/sequelbox/experimental-reasoning-models
Get our reasoning datasets: https://huggingface.co/collections/sequelbox/reasoning-datasets
Help support our releases, donations used for our experimental models and datasets: sequelbox/SupportOpenSource
Fight for open source with us!
love,
allegra
When does the planned context become the signifier of that context in the code itself? When something is stable in code. Even having to recover, or being able to, means it's storing far too much about context without getting to the context itself. All language needs the same simplification. Or maybe I just don't see reflexivity in AI yet. Maybe I don't see it building itself with awareness of what it is to others, unlike NASNet.
The "Janus Interface" paper details a new attack that can recover supposedly forgotten PII through fine-tuning APIs. It's a solution-oriented paper: by exposing the problem, it makes clear what needs fixing.
Testing such a high-stakes attack requires equally high-stakes data. The Ai4Privacy 300k dataset was a key part of their evaluation, providing a testbed for extracting sensitive Social Security Numbers. Our dataset, with its synthetic structured SSN data, helped the researchers at Indiana University, Stanford & CISPA, and others demonstrate that their attack works on more than just emails. It could affect highly sensitive personal identifiers.
We're excited to see our open-source dataset used in such cutting-edge security research. It's a win for the community when researchers can use our resources to stress-test the safety of modern AI systems. This work is a direct and explicit call for stronger protections on fine-tuning interfaces.
This is why open data for security research is so important. Check out the full paper: https://arxiv.org/pdf/2310.15469
Stay updated on the latest in privacy-preserving AI: follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/
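If you'd like to inspect the data behind this kind of evaluation yourself, here's a minimal sketch using the datasets library (split and field names may differ; see the dataset card):

```python
# Minimal sketch: loading the Ai4Privacy 300k masking dataset.
# Field and split names may differ; check the dataset card.
from datasets import load_dataset

ds = load_dataset("ai4privacy/pii-masking-300k", split="train")
print(ds[0])  # a record pairing source text with its privacy-masked version
```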
I wrote a deep dive into how Magic AI's 100M token context window might work, starting from their HashHop benchmark and building up to MALM - a Memory-Augmented Language Model.
Key insight: treating each key as a single token enables perfect retrieval at unlimited context lengths.
The article covers:
- How HashHop works and why its perfect accuracy is suspicious
- Building a tokenized solver that achieves 100% accuracy
- Scaling to MALM for real code search tasks
- Why this approach could handle 100M+ tokens
Read the full article: https://huggingface.co/blog/codelion/reverse-engineering-magic-hashhop
Try the model: codelion/malm-165m
Code: https://github.com/codelion/hash-hop
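To make the key insight concrete, here's a minimal toy sketch (not the article's code): once each hash key is a single indivisible unit, multi-hop retrieval reduces to exact dictionary lookups, which stay perfect no matter how many pairs the "context" holds.

```python
import random, string

def make_chain(n_pairs):
    """Build a HashHop-style chain of key -> value hash pairs."""
    keys = ["".join(random.choices(string.ascii_lowercase, k=16))
            for _ in range(n_pairs + 1)]
    table = {keys[i]: keys[i + 1] for i in range(n_pairs)}
    return table, keys[0], keys[-1]

def hop(table, start, n_hops):
    """Resolve an n-hop query by exact lookup: 100% accuracy at any scale."""
    cur = start
    for _ in range(n_hops):
        cur = table[cur]
    return cur

table, start, end = make_chain(100_000)  # a "context" of 100k pairs
assert hop(table, start, 100_000) == end
```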
Why it had to be done
PyTorch's Dynamo compiler is increasingly becoming the default interoperability layer for ML systems. Anything that relies on torch.export or torch.compile, from model optimization to cross-framework integrations, benefits directly when models can be captured as a single Dynamo-traced graph!
Transformers models are now easier to:
- Compile end-to-end with torch.compile backends
- Export reliably via torch.export and torch.onnx.export
- Deploy to ONNX / ONNX Runtime, Intel's OpenVINO, NVIDIA's AutoDeploy (TRT-LLM), AMD's Quark, Meta's ExecuTorch, and more hardware-specific runtimes.
This work aims at unblocking entire TorchDynamo-based toolchains that rely on exporting Transformers across runtimes and accelerators.
We are doubling down on Transformers' commitment to being a first-class citizen of the PyTorch ecosystem: more exportable, more optimizable, and easier to deploy everywhere.
There are definitely some edge cases we still haven't addressed, so don't hesitate to try compiling / exporting your favorite transformers models and to open issues / PRs.
PR in the comments! More updates coming soon!
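A minimal sketch of what this unlocks, with an illustrative model id (recent transformers and PyTorch 2.x assumed; some architectures may still graph-break):

```python
# Minimal sketch: full-graph compile and export of a transformers model.
# Model id and input are illustrative; assumes recent transformers/PyTorch.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
tok = AutoTokenizer.from_pretrained("gpt2")
inputs = tok("Dynamo says hi", return_tensors="pt")

# fullgraph=True fails loudly if the model can't be captured in one graph
compiled = torch.compile(model, fullgraph=True)
with torch.no_grad():
    _ = compiled(**inputs)

# torch.export produces one ExportedProgram for downstream runtimes
exported = torch.export.export(
    model, args=(), kwargs={"input_ids": inputs["input_ids"]}
)
print(exported)
```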
To try it out, just run
npx -y open-responses init (or uvx) and that's it! :)
Would love feedback and support for adding local HF models, @akhaliq @bartowski @prithivMLmods @julien-c @clefourrier @philschmid
We'd love feedback from the Hugging Face community on how it integrates with your pipelines (support for Hugging Face models landing soon!). Let's push open-source AI forward together!
Docs:
https://docs.julep.ai/responses/quickstart
Repo:
https://github.com/julep-ai/open-responses
agents-sdk:
https://platform.openai.com/docs/guides/agents
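Since it speaks the Responses API, any OpenAI-compatible client should work against the local server. A hypothetical sketch (the base_url, port, and model id here are assumptions; see the quickstart docs for the real defaults):

```python
# Hypothetical sketch: pointing the OpenAI SDK at a local open-responses
# server. base_url, port, and model id are assumptions; check the docs.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")
resp = client.responses.create(
    model="gpt-4o-mini",  # placeholder model id
    input="Say hello from a self-hosted Responses API",
)
print(resp.output_text)
```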
Excited to share my Looped-GPT blog post and codebase!
https://github.com/sanyalsunny111/Looped-GPT
TL;DR: looping during pre-training improves generalization.
The plot shows GPT-2 LMs pre-trained on 15.73B OpenWebText (OWT) tokens.
P.S. This is my first post here; I have ~4 followers and zero expectations for reach.
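For readers who want the gist without the repo: here's a minimal sketch of the looping idea, reusing the same weight-tied block several times per forward pass so effective depth grows without new parameters (the general pattern, not the exact Looped-GPT code):

```python
import torch.nn as nn

class LoopedBlock(nn.Module):
    """Apply one transformer block n_loops times with tied weights."""
    def __init__(self, block: nn.Module, n_loops: int = 4):
        super().__init__()
        self.block = block
        self.n_loops = n_loops

    def forward(self, x):
        for _ in range(self.n_loops):
            x = self.block(x)  # same parameters reused each pass
        return x
```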
Introducing LoongFlow: A Thinking & Learning Framework for Expert-Grade AI Agents.
Unlike traditional evolutionary agents (OpenEvolve-style), LoongFlow implements the PES (Plan-Execute-Summarize) paradigm to learn from mistakes and avoid local optima (see the sketch after the highlights).
Highlights:
* SOTA: Surpassed human mathematicians on 11 geometry/algebra problems.
* 23 Kaggle Gold Medals on MLE Bench.
* Efficiency: 60% more efficient than current baselines.
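Roughly, a PES iteration looks like this (a minimal sketch assuming a generic llm(prompt) -> str callable; see the repo for the real implementation):

```python
# Minimal PES sketch; `llm` is an assumed generic completion function.
def pes_step(task, lessons, llm):
    plan = llm(f"Task: {task}\nLessons so far: {lessons}\nWrite a plan.")
    result = llm(f"Execute this plan step by step:\n{plan}")
    summary = llm(f"Plan:\n{plan}\nResult:\n{result}\n"
                  "Summarize what worked and what to avoid next time.")
    lessons.append(summary)  # summaries steer the next plan away from past mistakes
    return result

def pes_loop(task, llm, n_iters=5):
    lessons, result = [], None
    for _ in range(n_iters):
        result = pes_step(task, lessons, llm)
    return result
```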
Code & Paper:
https://github.com/baidu-baige/LoongFlow
LoongFlow: Directed Evolutionary Search via a Cognitive Plan-Execute-Summarize Paradigm (2512.24077)
#AutoML #Kaggle #Agents #OpenSource #LLM
HF Space: alexnasa/ltx-2-TURBO
NovaSR can
- Enhance TTS model quality.
- Restore poor quality datasets.
- Work on any device (at just 52 KB, it's smaller than a 3-second audio file!)
Model: YatharthS/NovaSR
Space to try it: YatharthS/NovaSR
Github repo: https://github.com/ysharma3501/NovaSR
We're highlighting a solution-oriented report from researchers Sahana Naganandh, Vaibhav V, and Thenmozhi M at Vellore Institute of Technology that investigates this exact challenge. The direct connection to our mission is clear: the paper showcases the PII43K dataset as a privacy-preserving alternative to high-risk, raw multilingual data.
The report notes that our dataset, with its structured anonymization, is a "useful option for privacy-centric AI applications." It's always a delight when academic research independently validates our data-first approach to solving real-world privacy problems.
This is how we build a safer AI future together.
Read the full report here to learn more: https://assets.cureusjournals.com/artifacts/upload/technical_report/pdf/3689/20250724-59151-93w9ar.pdf
Stay updated on the latest in privacy-preserving AI: follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/
#OpenSource
#DataPrivacy
#LLM
#Anonymization
#AIsecurity
#HuggingFace
#Ai4Privacy
#Worldslargestopensourceprivacymaskingdataset
This negligence is terrifyingly evident when you look at the current landscape. Take Qwen Image 2512, for example; while it delivers undeniably strong performance, it has incredibly weak guardrails that make it dangerous to deploy. In stark contrast, Z Image might not get as much hype for its power, but it has much better safety guardrails than Qwen Image 2512.
It is imperative that the open-source community and developers recognize that capability without responsibility is a liability. We must actively work on protecting these models from bad actors who seek to exploit them for malicious purposes, such as generating disinformation, creating non-consensual imagery, or automating cyberattacks. It is no longer enough to simply release a powerful model; we must build layers of defense that make it resistant to jailbreaking and adversarial attacks. Developers need to prioritize alignment and robust filtering techniques just as much as they prioritize benchmark scores. We cannot hand such potent tools to the world without ensuring they have the safety mechanisms to prevent them from being turned against us.
https://huggingface.co/davanstrien/iconclass-vlm: Qwen2.5-VL-3B trained using SFT to generate ICONCLASS codes (think Dewey Decimal for art!)
Trained with TRL + HF Jobs - single UV script, no GPU needed!
Space to explore predictions on a test set: davanstrien/iconclass-predictions
Blog soon!
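In the meantime, a hypothetical inference sketch following the generic Qwen2.5-VL processor pattern (prompt wording and generation settings are assumptions, not the training script's):

```python
# Hypothetical sketch: predicting an ICONCLASS code for a local image.
# Follows the generic Qwen2.5-VL pattern; prompt wording is an assumption.
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

repo = "davanstrien/iconclass-vlm"
processor = AutoProcessor.from_pretrained(repo)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(repo)

image = Image.open("artwork.jpg")
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Give the ICONCLASS code for this image."},
]}]
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(text=prompt, images=[image], return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32)
print(processor.decode(
    out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
))
```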