Organization Card

Embedl

Embedl develops advanced tools and algorithms for Edge AI. Our mission is to make AI models run faster, more energy-efficiently, and more reliably across diverse hardware platforms, while significantly reducing development time.

We help teams deploy high-performance AI on real-world, resource-constrained devices.

Embedl Models (Community)

Pre-optimized models that can be used off the shelf or customized for specific hardware targets supported by the embedl-models package.

First release highlights:

  • The fastest Small Language Models (SLMs) using FlashHead, a novel architectural improvement to the language-model head
  • Works with popular models like Llama, Gemma, and Qwen
  • Provides speedups on top of:
    • Quantization
    • Flash Attention
    • Other standard optimizations

Device: NVIDIA Jetson Thor

Model                                           Generation speed (tokens/s)
embedl/Llama-3.2-3B-Instruct-FlashHead-W4A16    100
Llama-3.2-3B-Instruct-W4A16*                    80
RedHatAI/Llama-3.2-3B-Instruct-FP8              64
meta-llama/Llama-3.2-3B-Instruct                37

*Embedl-quantized baseline used for benchmarking: similar to the FlashHead-W4A16 model, but without FlashHead and the custom generation loop.
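
To make the relative gains in the benchmark figures easier to read, the short Python sketch below converts the tokens/s values into speedup ratios against a chosen baseline. The numbers are copied from the table; the `SPEEDS` dictionary and the `speedup_over` helper are illustrative names introduced here and are not part of the embedl-models package.

```python
# Generation speeds (tokens/s) on NVIDIA Jetson Thor, copied from the table.
# SPEEDS and speedup_over are hypothetical names used only for this sketch.
SPEEDS = {
    "embedl/Llama-3.2-3B-Instruct-FlashHead-W4A16": 100,
    "Llama-3.2-3B-Instruct-W4A16": 80,
    "RedHatAI/Llama-3.2-3B-Instruct-FP8": 64,
    "meta-llama/Llama-3.2-3B-Instruct": 37,
}


def speedup_over(baseline, speeds=SPEEDS):
    """Return each model's throughput divided by the baseline's throughput."""
    base = speeds[baseline]
    return {name: tps / base for name, tps in speeds.items()}


if __name__ == "__main__":
    # Compare every model against the unoptimized reference model.
    ratios = speedup_over("meta-llama/Llama-3.2-3B-Instruct")
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x")
```

Read this way, the FlashHead-W4A16 model is about 2.7x faster than the unoptimized model (100 / 37) and 1.25x faster than the W4A16 baseline without FlashHead (100 / 80), which is the gain that comes on top of quantization in this benchmark.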


Contact

Headquarters (Sweden)
Gamla Almedalsvägen 39
412 63 Gothenburg, Sweden

Email: contact@embedl.com