Jaward Sesay (Jaward)
358 followers · 24 following
https://github.com/Jaykef
JawardSesay_
Jaykef
AI & ML interests
Building Lectūra Labs | CS Grad Student @BIT | AI/ML Research: Autonomous Agents, LLMs | Building The Cursor for Learning | Role Model Karpathy
Recent Activity
liked a model 3 days ago: CohereLabs/cohere-transcribe-03-2026
posted an update 9 days ago:
Supercool! You can now easily train a JEPA world model (15M params) end-to-end on a single GPU, with planning done in under 1s 🤯.
- trained with a classic prediction loss + SIGReg
- plans purely in raw pixels
- beats SOTA DINO-WM and PLDM
- a single hyperparameter, no heuristics
- fully open sourced!!
Paper/Code/Data: https://le-wm.github.io/
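The core JEPA recipe the post alludes to is: encode the current and next observations into latents, predict the next latent from the current one (plus an action), and train on the distance between predicted and target latents rather than on pixels. A toy numpy sketch of that loop, where all names, sizes, and the linear encoder/predictor are purely illustrative (SIGReg and the actual LeJEPA architecture are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, W):
    # toy encoder: raw "pixels" -> latent
    return np.tanh(x @ W)

def predictor(z, a, P):
    # predict the next latent from current latent + action
    return np.tanh(np.concatenate([z, a]) @ P)

# hypothetical sizes: 16-dim observation, 4-dim latent, 2-dim action
W = rng.normal(size=(16, 4)) * 0.1
P = rng.normal(size=(6, 4)) * 0.1

x_t = rng.normal(size=16)       # current observation
x_next = rng.normal(size=16)    # next observation
a_t = rng.normal(size=2)        # action taken

z_t = encoder(x_t, W)
z_next_target = encoder(x_next, W)   # target latent (stop-gradient in practice)
z_next_pred = predictor(z_t, a_t, P)

# JEPA-style prediction loss: match predicted latent to target latent
loss = np.mean((z_next_pred - z_next_target) ** 2)
```

Planning then amounts to searching over actions that drive the predicted latent toward a goal latent; regularizers like SIGReg exist to keep the latent space from collapsing, which a plain prediction loss alone does not prevent.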
posted an update 16 days ago:
The Kimi team dropped a major improvement to the transformer architecture, and it quietly targets one of the most taken-for-granted components: residual connections. For nearly a decade, transformers have relied on residuals that simply add previous layer outputs with equal weight since their introduction. It works, but it's also kind of… dumb. Kimi's new paper, "Attention Residuals (AttnRes)", replaces that with something much more intelligent:
→ instead of blindly summing past layers,
→ it learns which layers matter,
→ and dynamically weights contributions across depth.
So attention is no longer just over tokens… it's now also over layers (depth). This effectively turns depth into a dynamic memory system. Phenomenal!
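Based only on the post's description (not the paper itself), "attention over depth" could be sketched as follows: keep the outputs of all previous layers, score them against the current layer's state, and take a softmax-weighted combination instead of an unweighted sum. Everything below (matrices, shapes, scoring rule) is a hypothetical illustration of that idea, not the paper's actual formulation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d = 8
rng = np.random.default_rng(1)

# stored outputs of layers 0..3 for one token position
layer_outputs = [rng.normal(size=d) for _ in range(4)]

# classic residual stream: every previous layer contributes equally
plain_residual = sum(layer_outputs)

# depth-attention sketch: the current layer's output queries past layers
Wq = rng.normal(size=(d, d)) * 0.1   # learned query projection (hypothetical)
Wk = rng.normal(size=(d, d)) * 0.1   # learned key projection (hypothetical)
q = layer_outputs[-1] @ Wq
scores = np.array([q @ (h @ Wk) / np.sqrt(d) for h in layer_outputs])
weights = softmax(scores)            # data-dependent weights over depth
attn_residual = sum(w * h for w, h in zip(weights, layer_outputs))
```

The contrast with `plain_residual` is the whole point: the weighted sum lets the network amplify or suppress individual layers per token, which is what makes the depth dimension behave like an addressable memory.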
Jaward's Spaces (3)
Professor AI Feynman 🚀 (Running): Generate lecture materials and audio using AI
Optimus 🌍 (Running): Generate speech and translate audio using AI models
Seamless Speech Translator 📚 (Running)