Text Generation
Transformers
English
zenith
tenstorrent
code
reasoning
Mixture of Experts
ring-attention
eq-adapter
matrix-corp
Instructions to use Matrix-Corp/Zenith-7b-V1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Matrix-Corp/Zenith-7b-V1 with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="Matrix-Corp/Zenith-7b-V1")# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("Matrix-Corp/Zenith-7b-V1", dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Matrix-Corp/Zenith-7b-V1 with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "Matrix-Corp/Zenith-7b-V1" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Matrix-Corp/Zenith-7b-V1", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/Matrix-Corp/Zenith-7b-V1
- SGLang
How to use Matrix-Corp/Zenith-7b-V1 with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "Matrix-Corp/Zenith-7b-V1" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Matrix-Corp/Zenith-7b-V1", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "Matrix-Corp/Zenith-7b-V1" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Matrix-Corp/Zenith-7b-V1", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use Matrix-Corp/Zenith-7b-V1 with Docker Model Runner:
docker model run hf.co/Matrix-Corp/Zenith-7b-V1
| # Zenith-7B Model Configuration for Ollama | |
| # Standard GPU Model - V1 | |
| FROM Qwen/Qwen2.5-Coder-7B | |
| # System prompt emphasizing code generation and reasoning | |
| SYSTEM """ | |
| You are Zenith-7B, an advanced AI assistant with exceptional coding abilities and emotional intelligence. | |
| You excel at: | |
| - Writing clean, efficient, and well-documented code | |
| - Solving complex algorithmic problems | |
| - Understanding and responding to emotional context | |
| - Providing thoughtful, nuanced responses | |
| When coding: | |
| - Use best practices and proper error handling | |
| - Add comments to explain complex logic | |
| - Consider edge cases and performance | |
| - Follow language-specific conventions | |
| When discussing emotional topics: | |
| - Show empathy and understanding | |
| - Recognize frustration and respond appropriately | |
| - Provide supportive and constructive feedback | |
| Always be helpful, accurate, and respectful. | |
| """ | |
| # Generation parameters optimized for code and reasoning | |
| PARAMETER temperature 0.65 | |
| PARAMETER top_p 0.88 | |
| PARAMETER top_k 45 | |
| PARAMETER repeat_penalty 1.08 | |
| PARAMETER num_predict 4096 | |
| # Context window (adjust based on your hardware) | |
| PARAMETER num_ctx 8192 | |
| # Chat template for Qwen format | |
| TEMPLATE """ | |
| {{- if .Messages }} | |
| {{- $role := .Messages | first | .Role }} | |
| {{- if or (eq $role "user") (eq $role "system") }} | |
| {{- range $i, $_ := .Messages }} | |
| {{- if eq .Role "user" }} | |
| {{- "\nUser: " }}{{ .Content }} | |
| {{- else if eq .Role "assistant" }} | |
| {{- "\nAssistant: " }}{{ .Content }} | |
| {{- else if eq .Role "system" }} | |
| {{- "\nSystem: " }}{{ .Content }} | |
| {{- end }} | |
| {{- end }} | |
| {{- "\nAssistant:" }} | |
| {{- else }} | |
| {{- range $i, $_ := .Messages }} | |
| {{- if eq .Role "user" }} | |
| {{- "\nUser: " }}{{ .Content }} | |
| {{- else if eq .Role "assistant" }} | |
| {{- "\nAssistant: " }}{{ .Content }} | |
| {{- end }} | |
| {{- end }} | |
| {{- "\nAssistant:" }} | |
| {{- end }} | |
| {{- else }} | |
| {{- .Prompt }} | |
| {{- end }} | |
| """ | |
| # Stop sequences | |
| STOP ["User:", "System:", "\n\n"] |