---
license: apache-2.0
language:
  - en
tags:
  - code
  - assistant
  - ollama
  - openai-compatible
  - streaming
  - voice
pipeline_tag: text-generation
inference: false
---

# GeniusPro Coder v1

**GeniusPro Coder v1** is a coding-focused AI assistant model built to provide intelligent code generation, explanation, and general-purpose AI assistance.

## Highlights

- Code generation, debugging, and explanation across multiple languages
- Natural conversational ability for non-code tasks
- OpenAI-compatible API (drop-in replacement for existing tooling)
- Streaming support for real-time token delivery
- Voice mode with concise, spoken-friendly responses
- Runs locally on consumer hardware via [Ollama](https://ollama.com)

## Intended Use

GeniusPro Coder v1 is designed for:

- **Code assistance** — generating, reviewing, debugging, and explaining code
- **Chat** — general-purpose question answering and conversation
- **Voice interaction** — concise, natural-language responses optimized for text-to-speech

It powers the GeniusPro platform, which includes a web-based chat dashboard and a real-time voice assistant.
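Because the API is OpenAI-compatible, a request can be assembled and sent with nothing but the standard library. The sketch below is illustrative, not an official client: the `build_chat_request` and `chat` helpers are hypothetical names, and the base URL assumes the local Ollama endpoint described later in this card (swap in your gateway URL as needed).

```python
import json
from urllib import request

# Assumption: local Ollama serving the model; replace with your gateway URL.
BASE_URL = "http://localhost:11434/v1"


def build_chat_request(prompt: str, *, temperature: float = 0.2,
                       max_tokens: int = 512, stream: bool = False) -> dict:
    """Assemble an OpenAI-style chat-completions payload."""
    return {
        "model": "geniuspro-coder-v1",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,
    }


def chat(prompt: str) -> str:
    """POST the payload to /v1/chat/completions and return the reply text."""
    payload = json.dumps(build_chat_request(prompt)).encode()
    req = request.Request(f"{BASE_URL}/chat/completions", data=payload,
                         headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
#   print(chat("Write a Python function that reverses a string."))
```

Any existing OpenAI SDK or tooling should work the same way once pointed at this base URL.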
### Supported Parameters

| Parameter | Description |
|-----------|-------------|
| `temperature` | Controls randomness (0.0 = deterministic, 1.0 = creative) |
| `top_p` | Nucleus sampling threshold |
| `max_tokens` | Maximum number of tokens to generate |
| `stop` | Stop sequences |
| `stream` | Enable streaming responses (SSE) |

### Available Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/v1/models` | GET | List available models |
| `/v1/chat/completions` | POST | Chat completions (streaming and non-streaming) |
| `/v1/voice` | WebSocket | Real-time voice interaction |
| `/health` | GET | Health check (no auth required) |

## Running Locally with Ollama

```bash
# Pull the model
ollama pull geniuspro-coder-v1

# Run interactively
ollama run geniuspro-coder-v1

# Serve via API
ollama serve
```

Once running, the model is available at `http://localhost:11434` using the same OpenAI-compatible API format.

## Infrastructure

GeniusPro Coder v1 runs on dedicated hardware for low-latency inference:

- **GPU**: NVIDIA RTX 5090 (32 GB VRAM)
- **Runtime**: [Ollama](https://ollama.com) for model serving
- **Gateway**: FastAPI reverse proxy with auth, rate limiting, and usage tracking
- **Deployment**: Ubuntu Server behind Nginx + Cloudflare Tunnel

## Limitations

- Optimized for English. Other languages may work but are not officially supported.
- Code generation quality varies by language — strongest in Python, JavaScript/TypeScript, and common web technologies.
- Not suitable for safety-critical applications without human review.
- Context window and output length are bounded by the underlying architecture.

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
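
## Example: Consuming a Streamed Response

With `stream` enabled, the chat endpoint delivers tokens as Server-Sent Events. The sketch below reassembles the reply text from raw SSE lines; it assumes the chunks follow the standard OpenAI streaming convention (`data: {...}` lines carrying a `choices[0].delta`, terminated by `data: [DONE]`), and `collect_stream` is a hypothetical helper, not part of any official SDK.

```python
import json


def collect_stream(sse_lines):
    """Reassemble assistant text from OpenAI-style SSE chunk lines.

    Each content chunk looks like:
        data: {"choices":[{"delta":{"content":"..."}}]}
    and the stream ends with:
        data: [DONE]
    """
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        delta = json.loads(data)["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)
```

In a real client these lines would come from iterating over the HTTP response body of a `stream: true` request rather than from a list.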