---
license: apache-2.0
tags:
- image-text-to-text
- audio-text-to-text
- text-to-text
- llama_cpp
- any-to-any
- multimodal-ai
- video-text-to-text
- llama-model-switcher
---
| nzgnzg73 |
|
|
# llama_cpp_WebUI


GitHub: https://github.com/nzgnzg73/llama_cpp_WebUI
|
|
## Want to talk or ask something?
Just click the YouTube link below! You'll find my email there and can message me easily.


YouTube Channel: @nzg73
https://youtube.com/@NZG73
|
|
|
|
## Contact Email
E-mail: nzgnzg73@gmail.com
|
|
|
|
## llama.cpp builds (CPU/GPU)
Old version:
llama-b7200-bin-win-cpu-x64.zip
|
|
|
|
New versions:

- llama-b7541-bin-win-cpu-x64 (CPU)
- llama-b7243-bin-win-cuda-12.4-x64 (CUDA 12.4)
|
|
|
|
|
|
|  |
|
|
|
|
| ## Llama Model Switcher |
|
|
Run `model_switcher.py` from CMD. Install its dependencies first:

pip install flask psutil GPUtil

![img3](img3.png)

![img10](img10.png)
## Image-Text-to-Text Models

## Gemma-3

CPU: 20 GB RAM
or
GPU: 4 GB VRAM

![mm](mm.png)

1. gemma-3-12b-it-Q4_K_S.gguf
2. mmproj-model-f16-12B.gguf
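Once llama-server is running with the two Gemma-3 files above, images go in through its OpenAI-compatible chat endpoint as base64 data URLs. A minimal payload-building sketch (the function name and sample bytes are placeholders, not part of the repo):

```python
import base64
import json

def image_chat_payload(image_bytes: bytes, question: str) -> dict:
    """Build an OpenAI-style chat payload carrying one inline PNG image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

payload = image_chat_payload(b"\x89PNG...", "What is in this picture?")
print(json.dumps(payload)[:80])
```

POST the resulting JSON to the server's `/v1/chat/completions` endpoint on whatever port you started it with.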
## Text-to-Text Models

## GPT-OSS-20B

![g](g.png)

![f](f.png)
## Qwen3-VL

CPU: 25 GB RAM
or
GPU: 4 GB VRAM

1. Qwen3-VL-2B-Instruct-Q8_0.gguf
2. mmproj-Qwen3-VL-2B-Instruct-Q8_0.gguf

![q](q.png)
## Qwen2.5-Omni

CPU: 40 GB RAM
or
GPU: 8 GB VRAM

1. Qwen2.5-Omni-7B-BF16.gguf
2. mmproj-F16.gguf (2 GB)

![o](o.png)
| |
| |
| |
| |
| |
| |
| |
| |
## Audio-Text-to-Text

## Llama-3.2

CPU: 10 GB RAM

![a](a.png)

1. Llama-3.2-1B-Instruct-Q4_K_M.gguf
2. Llama-3.2-1B-Instruct-Q8_0.gguf
3. mmproj-ultravox-v0_5-llama-3_2-1b-f16.gguf
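With the ultravox projector loaded, audio can be sent in the OpenAI `input_audio` content format. Recent llama.cpp server builds accept this shape, but treat that as an assumption and verify against your build; the helper below only constructs the payload:

```python
import base64

def audio_chat_payload(wav_bytes: bytes, question: str) -> dict:
    """Build an OpenAI-style chat payload carrying one WAV clip."""
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "input_audio",
                 "input_audio": {
                     "data": base64.b64encode(wav_bytes).decode("ascii"),
                     "format": "wav",
                 }},
            ],
        }],
    }
```

As with images, the payload is POSTed to `/v1/chat/completions` on the running server.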
|
|
|
|
|
|
| ## run.bat |
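The repo ships its own `run.bat`; the sketch below is only an illustration of what such a launcher typically contains (the build folder and model paths are assumptions, adjust them to your downloads):

```shell
@echo off
REM Hypothetical launcher: start llama-server with a model and its projector.
.\llama-b7541-bin-win-cpu-x64\llama-server.exe ^
  -m ".\models\gemma\gemma-3-12b-it-Q4_K_S.gguf" ^
  --mmproj ".\models\gemma\mmproj-model-f16-12B.gguf" ^
  --host 127.0.0.1 --port 8080
pause
```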
|
|
|
|
|
|
|
|
Local server:

llama-server.exe --n-gpu-layers 2 --ctx-size 111192 -m ".\models\mistralai\mistralai_Voxtral-Mini-3B-2507-Q8_0.gguf" --mmproj ".\models\mistralai\mmproj-mistralai_Voxtral-Mini-3B-2507-bf16.gguf" --host 0.0.0.0 --port 8005

Public URL:

llama-server --n-gpu-layers 15 --ctx-size 8192 -m models/ollma/Llama-3.2-1B-Instruct-Q8_0.gguf --mmproj models/ollma/mmproj-ultravox-v0_5-llama-3_2-1b-f16.gguf --host 127.0.0.1 --port 8083
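With either server command running, any OpenAI-style client can talk to it. A dependency-free sketch using only Python's standard library (the port matches the second command above; function names are placeholders):

```python
import json
import urllib.request

def chat_body(prompt: str) -> bytes:
    """JSON request body for one user turn, in OpenAI chat format."""
    return json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }).encode("utf-8")

def ask(prompt: str, base_url: str = "http://127.0.0.1:8083") -> str:
    """POST the turn to a running llama-server and return its reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=chat_body(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

For example, `ask("Hello")` would return the model's reply once the Llama-3.2 server above is up.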