Thien Tran

gaunernst

gau-nernst

AI & ML interests

None yet

Recent Activity

upvoted a collection about 2 months ago

Gemma 4

updated a Space 5 months ago

gaunernst/AudioMAE-AudioSet20k

updated a Space 5 months ago

gaunernst/kv-cache-calculator

gaunernst's collections (11)

DeepSeek testing

A collection of MoE+MLA models, serving as testing proxies for DeepSeek-V3/R1

deepseek-ai/DeepSeek-V2-Lite-Chat

Text Generation • 16B • Updated Jun 25, 2024 • 1.13M • 138
gaunernst/DeepSeek-V2-Lite-Chat-FP8

16B • Updated Apr 7, 2025 • 19.4k
TechxGenus/DeepSeek-V2-Lite-Chat-AWQ

Text Generation • 16B • Updated Jul 4, 2024 • 208 • 3
deepseek-ai/DeepSeek-R1

Text Generation • 685B • Updated Mar 27, 2025 • 4.58M • 13.3k

Gemma 3 QAT INT4 (from Flax)

These are converted from the official QAT INT4 Flax checkpoints on Kaggle. Supported formats: AutoAWQ, GGUF

gaunernst/gemma-3-1b-it-int4-awq

Text Generation • Updated Apr 6, 2025 • 384 • 2
gaunernst/gemma-3-4b-it-int4-awq

Image-Text-to-Text • Updated Apr 6, 2025 • 2.9k • 7
gaunernst/gemma-3-12b-it-int4-awq

Image-Text-to-Text • 12B • Updated Apr 6, 2025 • 70.9k • 24
gaunernst/gemma-3-27b-it-int4-awq

Image-Text-to-Text • 27B • Updated Apr 6, 2025 • 197k • 40

Face Recognition Models

gaunernst/vit_small_patch8_gap_112.cosface_ms1mv3

Image Feature Extraction • Updated Apr 29, 2024 • 421 • 2
gaunernst/vit_tiny_patch8_112.cosface_ms1mv3

Image Feature Extraction • Updated Apr 21, 2024 • 11 • 2
gaunernst/vit_tiny_patch8_112.arcface_ms1mv3

Image Feature Extraction • Updated Apr 22, 2024 • 156 • 4
gaunernst/vit_tiny_patch8_112.adaface_ms1mv3

Image Feature Extraction • Updated Apr 25, 2024 • 19 • 2

LLMs 1B - 2B

TRI-ML/DCLM-1B-v0

1B • Updated Jul 25, 2024 • 6 • 13
Qwen/Qwen2-1.5B

Text Generation • 2B • Updated Jun 6, 2024 • 130k • 102
Qwen/Qwen2-1.5B-Instruct

Text Generation • 2B • Updated Jun 6, 2024 • 4.46M • 162
HuggingFaceTB/SmolLM-1.7B

Text Generation • 2B • Updated Oct 16, 2024 • 55.8k • 181

Smallish LLM pre-training datasets

roneneldan/TinyStories

Viewer • Updated Aug 12, 2024 • 2.14M • 91.4k • 1k
allenai/c4

Viewer • Updated Jan 9, 2024 • 10.4B • 747k • 577
HuggingFaceFW/fineweb-edu

Viewer • Updated Jul 11, 2025 • 3.5B • 634k • 1.09k
HuggingFaceTB/smollm-corpus

Viewer • Updated Sep 6, 2024 • 237M • 62.7k • 457

Llama3-compatible

nvidia/Llama-3.1-Minitron-4B-Width-Base

Text Generation • 5B • Updated Feb 14, 2025 • 2.06k • 194
nvidia/Llama-3.1-Minitron-4B-Depth-Base

Text Generation • 5B • Updated Feb 14, 2025 • 1.01k • 22
meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 10.6M • 5.9k
meta-llama/Llama-3.1-8B

Text Generation • 8B • Updated Oct 16, 2024 • 1.25M • 2.21k

Gemma 3 QAT INT4 (from GGUF)

Convert official Gemma 3 QAT GGUF to AutoAWQ and compressed-tensors format for ease of deployment

gaunernst/gemma-3-1b-it-qat-autoawq

Text Generation • Updated Apr 6, 2025 • 2
gaunernst/gemma-3-4b-it-qat-autoawq

Image-Text-to-Text • Updated Apr 6, 2025 • 848 • 2
gaunernst/gemma-3-12b-it-qat-autoawq

Image-Text-to-Text • 12B • Updated Apr 7, 2025 • 220 • 7
gaunernst/gemma-3-27b-it-qat-autoawq

Image-Text-to-Text • 27B • Updated Apr 20, 2025 • 805 • 12

Mini BERT models

https://arxiv.org/abs/1908.08962

gaunernst/bert-tiny-uncased

Fill-Mask • 4.43M • Updated Oct 25, 2024 • 188 • 6
gaunernst/bert-mini-uncased

Fill-Mask • 11.3M • Updated Oct 26, 2024 • 6.71k
gaunernst/bert-medium-uncased

Fill-Mask • 41.7M • Updated Jan 5, 2025 • 8
gaunernst/bert-small-uncased

Fill-Mask • 29.1M • Updated Jan 5, 2025 • 656

LLMs < 1B

Qwen/Qwen2-0.5B

Text Generation • 0.5B • Updated Oct 22, 2024 • 595k • 167
Qwen/Qwen2-0.5B-Instruct

Text Generation • 0.5B • Updated Aug 21, 2024 • 1.52M • 201
HuggingFaceTB/SmolLM-135M

Text Generation • 0.1B • Updated Aug 1, 2024 • 406k • 257
HuggingFaceTB/SmolLM-135M-Instruct

Text Generation • 0.1B • Updated Sep 4, 2024 • 39.4k • 137

LLMs 2B - 4B

google/gemma-2b-it

Text Generation • 3B • Updated Sep 27, 2024 • 65.9k • 891
google/gemma-2b

Text Generation • 3B • Updated Sep 27, 2024 • 331k • 1.18k
google/gemma-1.1-2b-it

Text Generation • 3B • Updated Jun 27, 2024 • 178k • 173
apple/OpenELM-3B

Text Generation • 3B • Updated Feb 28, 2025 • 388 • 130

Llama2-compatible

TinyLlama/TinyLlama_v1.1

Text Generation • Updated Jun 7, 2024 • 13.6k • 112
TinyLlama/TinyLlama-1.1B-Chat-v1.0

Text Generation • 1B • Updated Mar 17, 2024 • 2.32M • 1.59k
meta-llama/Llama-2-7b-hf

Text Generation • 7B • Updated Apr 17, 2024 • 814k • 2.31k
meta-llama/Llama-2-7b-chat-hf

Text Generation • 7B • Updated Apr 17, 2024 • 356k • 4.76k
