---
license: apache-2.0
language:
- en
tags:
- code
- assistant
- ollama
- openai-compatible
- streaming
- voice
pipeline_tag: text-generation
inference: false
---

# GeniusPro Coder v1

**GeniusPro Coder v1** is a coding-focused AI assistant model for intelligent code generation, code explanation, and general-purpose AI assistance.

## Highlights

- Code generation, debugging, and explanation across multiple languages
- Natural conversational ability for non-code tasks
- OpenAI-compatible API (drop-in replacement for existing tooling)
- Streaming support for real-time token delivery
- Voice mode with concise, spoken-friendly responses
- Runs locally on consumer hardware via [Ollama](https://ollama.com)

## Intended Use

GeniusPro Coder v1 is designed for:

- **Code assistance** — generating, reviewing, debugging, and explaining code
- **Chat** — general-purpose question answering and conversation
- **Voice interaction** — concise, natural-language responses optimized for text-to-speech

It powers the GeniusPro platform, which includes a web-based chat dashboard and a real-time voice assistant.

## API

### Supported Parameters

| Parameter | Description |
|-----------|-------------|
| `temperature` | Controls randomness (0.0 = deterministic, 1.0 = creative) |
| `top_p` | Nucleus sampling threshold |
| `max_tokens` | Maximum tokens to generate |
| `stop` | Stop sequences |
| `stream` | Enable streaming responses (SSE) |
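The parameters above map directly onto an OpenAI-style chat completions request body. A minimal sketch of building one in Python (the prompt, parameter values, and stop sequence are illustrative, not model defaults):

```python
import json

# Example request body using the supported sampling parameters.
# All values here are illustrative; tune them for your use case.
payload = {
    "model": "geniuspro-coder-v1",
    "messages": [
        {"role": "user", "content": "Write a function that reverses a string."}
    ],
    "temperature": 0.2,    # low randomness for code generation
    "top_p": 0.9,          # nucleus sampling threshold
    "max_tokens": 512,     # cap on generated tokens
    "stop": ["User:"],     # optional stop sequences (illustrative)
    "stream": False,       # set True for SSE streaming
}

body = json.dumps(payload)  # serialized request body to POST
```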

### Available Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/v1/models` | GET | List available models |
| `/v1/chat/completions` | POST | Chat completions (streaming + non-streaming) |
| `/v1/voice` | WebSocket | Real-time voice interaction |
| `/health` | GET | Health check (no auth required) |
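The chat completions endpoint can be called with any OpenAI-compatible client or plain HTTP. A minimal stdlib sketch of constructing an authenticated request (the base URL, Bearer-token auth scheme, and API key placeholder are assumptions; check your deployment's gateway configuration):

```python
import json
import urllib.request

def build_chat_request(base_url, api_key, payload):
    """Build an authenticated POST request for /v1/chat/completions.

    The Bearer-token scheme is an assumption based on the
    OpenAI-compatible API convention; verify against your gateway.
    """
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_chat_request(
    "http://localhost:11434",
    "YOUR_API_KEY",  # placeholder, not a real key
    {"model": "geniuspro-coder-v1",
     "messages": [{"role": "user", "content": "Hi"}]},
)
# To send: urllib.request.urlopen(req)
```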

## Running Locally with Ollama

```bash
# Pull the model
ollama pull geniuspro-coder-v1

# Run interactively
ollama run geniuspro-coder-v1

# Serve via API
ollama serve
```

Once running, the model is available at `http://localhost:11434` with the same OpenAI-compatible API format.
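With `"stream": true`, responses arrive as Server-Sent Events, each `data:` line carrying a JSON chunk. A minimal sketch of extracting the text deltas (the chunk shape follows the OpenAI streaming convention; verify against actual responses from your deployment):

```python
import json

def parse_sse_stream(lines):
    """Yield content deltas from OpenAI-style SSE 'data:' lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            yield delta

# Simulated stream; real lines come from /v1/chat/completions with "stream": true
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
print("".join(parse_sse_stream(sample)))  # → Hello
```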

## Infrastructure

GeniusPro Coder v1 runs on dedicated hardware for low-latency inference:

- **GPU**: NVIDIA RTX 5090 (32 GB VRAM)
- **Runtime**: [Ollama](https://ollama.com) for model serving
- **Gateway**: FastAPI reverse proxy with auth, rate limiting, and usage tracking
- **Deployment**: Ubuntu Server behind Nginx + Cloudflare Tunnel

## Limitations

- Optimized for English. Other languages may work but are not officially supported.
- Code generation quality varies by language — strongest in Python, JavaScript/TypeScript, and common web technologies.
- Not suitable for safety-critical applications without human review.
- Context window and output length are bounded by the underlying architecture.

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).