Quantized versions of google/diffusiongemma-26B-A4B-it
AI & ML interests
OpenSource and AI
Papers
SNLP: Layer-Parallel Inference via Structured Newton Corrections
S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation
March 2026 Collection of third-party generative AI models validated by Red Hat AI for use across the Red Hat AI Product Portfolio.
January 2026 Collection of third-party generative AI models validated by Red Hat AI for use across the Red Hat AI Product Portfolio.
- RedHatAI/Mistral-Large-3-675B-Instruct-2512
  Updated • 3 • 1
- RedHatAI/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8
  Text Generation • 32B • Updated • 2.06k • 4
- RedHatAI/Mistral-Large-3-675B-Instruct-2512-NVFP4
  Updated • 7 • 3
- RedHatAI/Apertus-8B-Instruct-2509-FP8-dynamic
  Text Generation • 8B • Updated • 1.12k • 3

- RedHatAI/Mistral-Small-3.2-24B-Instruct-2506-NVFP4
  Text Generation • 14B • Updated • 2.67k • 9
- RedHatAI/Qwen3-VL-235B-A22B-Instruct-NVFP4
  Text Generation • 133B • Updated • 1.39k • 15
- RedHatAI/Qwen3-235B-A22B-Instruct-2507-NVFP4
  Text Generation • 136B • Updated • 3.93k • 4
- RedHatAI/Qwen3-235B-A22B-NVFP4
  Text Generation • 136B • Updated • 66 • 1
September 2025 Collection of third-party generative AI models validated by Red Hat AI for use across the Red Hat AI Product Portfolio.
- RedHatAI/DeepSeek-R1-0528-quantized.w4a16
  Text Generation • 676B • Updated • 227 • 13
- RedHatAI/Qwen3-8B-FP8-dynamic
  Text Generation • 8B • Updated • 46.7k • 12
- RedHatAI/Kimi-K2-Instruct-quantized.w4a16
  Text Generation • 1T • Updated • 300 • 12
- RedHatAI/gemma-3n-E4B-it-FP8-dynamic
  Text Generation • 8B • Updated • 566 • 4
May 2025 Collection of third-party generative AI models validated by Red Hat AI for use across the Red Hat AI Product Portfolio.
- RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic
  Image-Text-to-Text • 109B • Updated • 10.8k • 29
- RedHatAI/Llama-4-Scout-17B-16E-Instruct-quantized.w4a16
  Image-Text-to-Text • 109B • Updated • 9.04k • 13
- RedHatAI/Llama-4-Scout-17B-16E-Instruct
  Image-Text-to-Text • 109B • Updated • 7.92k
- RedHatAI/Llama-4-Maverick-17B-128E-Instruct
  Image-Text-to-Text • 402B • Updated • 148 • 3
Collection of quantized variants of the Gemma 3 models created by Google.
- RedHatAI/gemma-3-27b-it-quantized.w4a16
  Image-Text-to-Text • 29B • Updated • 331k • 13
- RedHatAI/gemma-3-12b-it-quantized.w4a16
  Image-Text-to-Text • 13B • Updated • 5.15k • 3
- RedHatAI/gemma-3-4b-it-quantized.w4a16
  Image-Text-to-Text • 5B • Updated • 3.24k • 5
- RedHatAI/gemma-3-1b-it-quantized.w8a8
  Text Generation • 1B • Updated • 1.1k • 2
Quantized variants of the Llama 4 release by Meta.
- RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic
  Image-Text-to-Text • 109B • Updated • 10.8k • 29
- RedHatAI/Llama-4-Scout-17B-16E-Instruct-quantized.w4a16
  Image-Text-to-Text • 109B • Updated • 9.04k • 13
- RedHatAI/Llama-4-Maverick-17B-128E-Instruct-FP8
  Image-Text-to-Text • 402B • Updated • 5.46k • 2
- RedHatAI/Llama-4-Maverick-17B-128E-Instruct-quantized.w4a16
  Image-Text-to-Text • 405B • Updated • 3.44k • 1
Quantized variants of Mistral Small 3.1 (2503) Instruct.
- RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-FP8-dynamic
  Image-Text-to-Text • 24B • Updated • 80.3k • 9
- RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-quantized.w8a8
  Image-Text-to-Text • 24B • Updated • 1.19k • 5
- RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-quantized.w4a16
  Image-Text-to-Text • 24B • Updated • 1.66k • 10
Quantized variants of Meta Llama 3.3, a multilingual, instruction-tuned generative large language model (LLM) available in a 70B size (text in/text out).
Quantized Granite models from IBM Research.
- RedHatAI/granite-3.1-8b-instruct-quantized.w8a8
  Text Generation • 8B • Updated • 372 • 2
- RedHatAI/granite-3.1-2b-base-quantized.w8a8
  Text Generation • 3B • Updated • 13
- RedHatAI/granite-3.1-8b-instruct-quantized.w4a16
  Text Generation • 8B • Updated • 1.16k • 1
- RedHatAI/granite-3.1-2b-instruct-quantized.w8a8
  Text Generation • 3B • Updated • 20
May 2026 Collection of third-party generative AI models validated by Red Hat AI for use across the Red Hat AI Product Portfolio.
- RedHatAI/NVIDIA-Nemotron-3-Super-120B-A12B-BF16
  Text Generation • 124B • Updated • 3.33k
- RedHatAI/NVIDIA-Nemotron-3-Super-120B-A12B-FP8
  Text Generation • 124B • Updated • 2.02k • 1
- RedHatAI/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4
  Text Generation • 67B • Updated • 3.18k • 2
- RedHatAI/Qwen3.5-397B-A17B-FP8-dynamic
  Image-Text-to-Text • 397B • Updated • 979 • 5
February 2026 Collection of third-party generative AI models validated by Red Hat AI for use across the Red Hat AI Product Portfolio.
- RedHatAI/Phi-4-reasoning-FP8-dynamic
  Text Generation • 15B • Updated • 266 • 1
- RedHatAI/Qwen3-VL-235B-A22B-Instruct-NVFP4
  Text Generation • 133B • Updated • 1.39k • 15
- RedHatAI/Qwen3-Next-80B-A3B-Instruct-quantized.w4a16
  Text Generation • 12B • Updated • 864 • 3
- RedHatAI/granite-4.0-h-tiny-FP8-dynamic
  Text Generation • 7B • Updated • 718 • 3

- RedHatAI/embeddinggemma-300m
  Sentence Similarity • 0.3B • Updated • 777 • 1
- RedHatAI/Qwen3-Embedding-8B
  Feature Extraction • 8B • Updated • 180 • 1
- RedHatAI/nomic-embed-text-v1.5
  Sentence Similarity • 0.1B • Updated • 277
- RedHatAI/granite-embedding-english-r2
  Feature Extraction • 0.1B • Updated • 188
October 2025 Collection of third-party generative AI models validated by Red Hat AI for use across the Red Hat AI Product Portfolio.
- RedHatAI/gpt-oss-120b
  Text Generation • 120B • Updated • 899 • 5
- RedHatAI/gpt-oss-20b
  Text Generation • 22B • Updated • 17.2k • 6
- RedHatAI/Qwen3-Coder-480B-A35B-Instruct-FP8
  Text Generation • 480B • Updated • 225 • 3
- RedHatAI/whisper-large-v3-turbo-quantized.w4a16
  Automatic Speech Recognition • 0.9B • Updated • 1.72k • 8

- RedHatAI/Llama-3.1-8B-Instruct-speculator.eagle3
  Text Generation • 1.0B • Updated • 15.9k • 2
- RedHatAI/Llama-3.3-70B-Instruct-speculator.eagle3
  Text Generation • 2B • Updated • 1.76k • 1
- RedHatAI/Qwen3-8B-speculator.eagle3
  Text Generation • 1B • Updated • 58.9k • 29
- RedHatAI/Qwen3-14B-speculator.eagle3
  Text Generation • 1B • Updated • 183
Embedding models act as the bridge between raw, unstructured data and the numerical, vector-based format needed for efficient retrieval in GenAI apps.
- RedHatAI/embeddinggemma-300m
  Sentence Similarity • 0.3B • Updated • 777 • 1
- RedHatAI/Qwen3-Embedding-8B
  Feature Extraction • 8B • Updated • 180 • 1
- RedHatAI/nomic-embed-text-v1.5
  Sentence Similarity • 0.1B • Updated • 277
- RedHatAI/granite-embedding-english-r2
  Feature Extraction • 0.1B • Updated • 188
IBM and NASA have teamed up to create a family of AI foundation models for Earth called Prithvi.
Collection of quantized variants of the Whisper models created by OpenAI.
- RedHatAI/whisper-large-v3-turbo-quantized.w4a16
  Automatic Speech Recognition • 0.9B • Updated • 1.72k • 8
- RedHatAI/whisper-large-v3-turbo-quantized.w8a8
  Automatic Speech Recognition • 0.9B • Updated • 831 • 4
- RedHatAI/whisper-large-v3-turbo-FP8-dynamic
  Automatic Speech Recognition • 0.9B • Updated • 2.61k • 6
- RedHatAI/whisper-tiny-FP8-Dynamic
  Automatic Speech Recognition • 57.8M • Updated • 21
Collection of quantized Qwen 3 models from Alibaba Cloud.
- RedHatAI/Qwen3-4B-quantized.w4a16
  Text Generation • 4B • Updated • 14.4k • 4
- RedHatAI/Qwen3-32B-FP8-dynamic
  Text Generation • 33B • Updated • 4.4k • 15
- RedHatAI/Qwen3-0.6B-FP8-dynamic
  Text Generation • 0.8B • Updated • 615 • 1
- RedHatAI/Qwen3-8B-FP8-dynamic
  Text Generation • 8B • Updated • 46.7k • 12
Quantized variants of the Phi-4 family of small language and multi-modal models by Microsoft.
Quantized variants of Qwen 2.5 Instruct and Qwen VL models.
- RedHatAI/Qwen2.5-VL-7B-Instruct-quantized.w8a8
  Image-Text-to-Text • 8B • Updated • 3.42k • 9
- RedHatAI/Qwen2.5-VL-7B-Instruct-quantized.w4a16
  Image-Text-to-Text • 8B • Updated • 2.22k • 8
- RedHatAI/Qwen2.5-7B-quantized.w8a8
  Text Generation • 8B • Updated • 86 • 1
- RedHatAI/Qwen2.5-VL-72B-Instruct-FP8-dynamic
  Image-Text-to-Text • 73B • Updated • 3.93k • 15
Collection of kernels from vLLM built using https://github.com/huggingface/kernel-builder