Spaces: embedl/README

JonnaMat committed 7 days ago

Commit 34aa77b · verified · 1 parent: 73581d3

Simplify README.md

README.md +17 -40

@@ -12,48 +12,25 @@ short_description: Embedl - efficient AI for the edge
 # Embedl
-<img src="https://huggingface.co/datasets/embedl/documentation-images/resolve/main/organization_banner.png" alt="Embedl Organization Banner" width="100%">
 Embedl develops advanced tools and algorithms for **Edge AI**. Our mission is to make AI models run
 **faster**, **more energy-efficient**, and **reliably across diverse hardware platforms**, while
 significantly reducing development time.
-We help teams deploy high-performance AI on real-world, resource-constrained devices.
-### **Embedl Models** ([Community](https://github.com/embedl/embedl-models))
-Pre-optimized models that can be used **off-the-shelf** or customized for specific hardware target
-supported by the [embedl-models](https://github.com/embedl/embedl-models) package.
-**First release highlights:**
-- The **fastest Small Language Models (SLMs)** using **[FlashHead](https://www.embedl.com/knowledge/ultra-efficient-llms-embedls-breakthrough-for-on-device-ai)**,
-  a novel architectural improvement to the language-model head
-- Works with popular models like **Llama, Gemma, and Qwen**
-- Provides speedups on top of:
-  - Quantization
-  - Flash Attention
-  - Other standard optimizations
-Device: Nvidia Jetson Thor
-| Model                                            | Generation speed (tokens/s) |
-| ------------------------------------------------ | ----------------------------|
-| embedl/Llama-3.2-3B-Instruct-FlashHead-W4A16     | 100                         |
-| Llama-3.2-3B-Instruct-W4A16*                     | 80                          |
-| RedHatAI/Llama-3.2-3B-Instruct-FP8               | 64                          |
-| meta-llama/Llama-3.2-3B-Instruct                 | 37                          |
-*Embedl quantized model for benchmarking similar to the FlashHead-W4A16 but without
-the faster FlashHead and custom generation loop.
----
-## Contact
-**Headquarters (Sweden)**
-Gamla Almedalsvägen 39
-412 63 Gothenburg, Sweden
-**Email:** contact@embedl.com

 # Embedl
+<img src="https://huggingface.co/datasets/embedl/documentation-images/resolve/main/organization_banner.png" alt="Embedl Organization Banner" width="100%">
+<p align="center">
+  <b>Efficient AI for the edge.</b>
+</p>
+<p align="center">
+  <a href="https://embedl.com"><img alt="Website" src="https://img.shields.io/badge/embedl.com-website-blue" /></a>
+  <a href="https://github.com/embedl"><img alt="GitHub" src="https://img.shields.io/badge/GitHub-embedl-black?logo=github" /></a>
+  <a href="https://arxiv.org/abs/2603.14591"><img alt="arXiv"
+  src="https://img.shields.io/badge/arXiv-2603.14591-b31b1b.svg?logo=arxiv" /></a>
+  <a href="mailto:models@embedl.com"><img alt="Contact" src="https://img.shields.io/badge/Contact-models%40embedl.com-green" /></a>
+</p>
 Embedl develops advanced tools and algorithms for **Edge AI**. Our mission is to make AI models run
 **faster**, **more energy-efficient**, and **reliably across diverse hardware platforms**, while
 significantly reducing development time.
+We help teams deploy high-performance AI on real-world, resource-constrained devices.