AI & ML interests

None defined yet.

Recent Activity

Aurelien-Morgan 
posted an update 3 days ago
@retrain-pipelines v0.2.0 is out!
I'm at my booth at Station F for GOSIM Paris 2026 today & tomorrow.
Come meet me for a live in-person demo and a chat!
Aurelien-Morgan 
posted an update 25 days ago
Reubencf 
posted an update 2 months ago
🚀 I am thrilled to announce the release of a new Konkani LLM!

We've seen some fantastic results for both translation and transliteration tasks, and I'm excited to share this progress with the community.

📖 Read the launch article and see the results: https://huggingface.co/blog/Reubencf/konkani-llm
🤖 Explore the model and collection:
konkani


I would love to hear your feedback or see what you build with it! #Konkani #LLM #NLP #HuggingFace #IndicNLP
hannayukhymenko 
posted an update 2 months ago
Do you translate your benchmarks from English correctly? 🤔
Turns out, for many languages it is much harder than you can imagine!

Introducing Recovered in Translation 🌍 together with @aalexandrov
https://ritranslation.insait.ai

Translating benchmarks is a painful process, requiring a lot of manual inspection and adjustment. You start by setting up the whole pipeline and adapting it to every format type, including task specifics. Some massive benchmarks already exist, but they still contain simple (and sometimes silly) bugs that can hurt evaluations :( We present a novel automated translation framework to help with that!

Eastern and Southern European languages introduce richer linguistic structures than English, and for benchmarks that rely heavily on grammatical coherence, machine translation risks harming evaluations. We discover potential answer leakage or misleading cues in the grammatical structure of the questions. Some benchmarks are also simply outdated and need to be retranslated with newer, better models.

We present a framework with novel test-time scaling methods that allow controlling time and cost investments while mitigating the need for human-in-the-loop verification. While working on the Ukrainian-focused MamayLM models, we had to translate 10+ benchmarks in a short span of time. Finding human evaluators is costly and time-consuming, and the same goes for professional translators. With our pipeline we were able to do it in 3 days🏎️
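The test-time scaling knob described above can be sketched as a simple best-of-N loop. This is a hypothetical illustration, not the actual ritranslation code: `translate_once` and `quality_score` are stand-ins for what would, in a real pipeline, be an MT model sample and an automatic quality-estimation metric.

```python
# Best-of-N test-time scaling sketch for benchmark translation.
# Both helpers below are placeholders (assumptions), not a real MT API.

def translate_once(text: str, seed: int) -> str:
    # Placeholder: a real system would sample one translation from an MT model.
    return f"{text} [candidate #{seed}]"

def quality_score(source: str, candidate: str) -> float:
    # Placeholder quality estimate; a real pipeline might use a COMET-style
    # quality-estimation model here.
    return -abs(len(candidate) - len(source))

def translate_best_of_n(text: str, n: int) -> str:
    # Sample n candidates and keep the highest-scoring one. Raising n trades
    # compute/cost for translation quality, which is the knob the post
    # describes for avoiding human-in-the-loop verification.
    candidates = [translate_once(text, seed) for seed in range(n)]
    return max(candidates, key=lambda c: quality_score(text, c))

best = translate_best_of_n("What is the capital of France?", n=4)
print(best)
```

With stubs like these the choice is arbitrary; the point is only the structure: more samples per item, scored automatically, instead of a human pass over every translated question.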

We hope our findings will help enable stronger multilingual evaluations and development. We release all produced benchmarks on Hugging Face together with the source code and arXiv paper 🤗

Paper: Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets (2602.22207)
Code: https://github.com/insait-institute/ritranslation
Benchmarks: https://huggingface.co/collections/INSAIT-Institute/multilingual-benchmarks
Reubencf 
posted an update 3 months ago
Reubencf 
posted an update 4 months ago
Now live: Reubencf/Nano_Banana_Editor now includes 10 free requests/day! 🍌 I'm personally sponsoring these credits to help make open AI accessible to all.
(Note: limits are subject to change based on funding.)

Enjoy!
takarajordan 
posted an update 4 months ago
At takara I'm constantly reading papers. I wonder if anyone can train a model to predict popular papers on our dataset?

takara-ai/daily-papers-popularity
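A trivial baseline for that prompt could be nearest-neighbour prediction over titles. Everything below is a made-up toy: the records and their fields are hypothetical, not the actual schema of takara-ai/daily-papers-popularity (in practice you would load the real data with `datasets.load_dataset`).

```python
# Toy popularity-prediction baseline. The `papers` list is fabricated
# example data, NOT taken from the takara-ai/daily-papers-popularity dataset.
papers = [
    {"title": "Scaling Laws for Neural Language Models", "upvotes": 120},
    {"title": "A Survey of Transformer Architectures", "upvotes": 45},
    {"title": "Attention Is All You Need", "upvotes": 300},
    {"title": "Notes on Gradient Descent", "upvotes": 10},
]

def predict_upvotes(title: str) -> int:
    # 1-nearest-neighbour on word overlap (Jaccard similarity): predict the
    # upvotes of the most similar known title. A real model would be trained
    # on the full dataset with proper text features.
    def overlap(a: str, b: str) -> float:
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / max(len(wa | wb), 1)
    nearest = max(papers, key=lambda p: overlap(p["title"], title))
    return nearest["upvotes"]

print(predict_upvotes("Attention Is Not All You Need"))  # → 300
```

Even a baseline this naive gives something to beat before reaching for a learned model.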
mmhamdy 
posted an update 4 months ago
The new DeepSeek Engram paper is super fun! It also integrates mHC, and I suspect they're releasing all these papers to keep the V4 report a reasonable length 😄

Here's a nice short summary from Gemini
Reubencf 
posted an update 4 months ago
Happy New Year 2026
I plan to build many things this year; most of them will be cheaper or free alternatives to paid products.

I'm looking forward to releasing some useful Spaces ✌️ Stay tuned!