---
license: apache-2.0
language:
  - en
tags:
  - code
  - assistant
  - ollama
  - openai-compatible
  - streaming
  - voice
pipeline_tag: text-generation
inference: false
---

# GeniusPro Coder v1

**GeniusPro Coder v1** is a coding-focused AI assistant model for intelligent code generation, code explanation, and general-purpose AI assistance.

## Highlights

- Code generation, debugging, and explanation across multiple languages
- Natural conversational ability for non-code tasks
- OpenAI-compatible API (drop-in replacement for existing tooling)
- Streaming support for real-time token delivery
- Voice mode with concise, spoken-friendly responses
- Runs locally on consumer hardware via [Ollama](https://ollama.com)

## Intended Use

GeniusPro Coder v1 is designed for:

- **Code assistance** — generating, reviewing, debugging, and explaining code
- **Chat** — general-purpose question answering and conversation
- **Voice interaction** — concise, natural-language responses optimized for text-to-speech

It powers the GeniusPro platform, which includes a web-based chat dashboard and a real-time voice assistant.


## API

GeniusPro Coder v1 exposes an OpenAI-compatible HTTP API.

### Supported Parameters

| Parameter | Description |
|-----------|-------------|
| `temperature` | Controls randomness (0.0 = deterministic, 1.0 = creative) |
| `top_p` | Nucleus sampling threshold |
| `max_tokens` | Maximum tokens to generate |
| `stop` | Stop sequences |
| `stream` | Enable streaming responses (SSE) |
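
A request exercising these parameters might look like the following sketch. The model tag mirrors the Ollama name used later in this card; if your deployment lists a different name under `/v1/models`, use that instead.

```python
import json

def build_chat_request(messages, temperature=0.2, top_p=0.9,
                       max_tokens=512, stop=None, stream=False):
    """Assemble an OpenAI-compatible chat-completions payload.

    The model tag "geniuspro-coder-v1" is assumed here; check
    /v1/models for the name your deployment actually exposes.
    """
    payload = {
        "model": "geniuspro-coder-v1",
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
        "stream": stream,
    }
    if stop:
        payload["stop"] = stop
    return json.dumps(payload)

body = build_chat_request(
    [{"role": "user", "content": "Write a Python hello world."}],
    temperature=0.0,
)
# POST this body to /v1/chat/completions with
# Content-Type: application/json (plus an API key, if the
# gateway requires one).
```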

### Available Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/v1/models` | GET | List available models |
| `/v1/chat/completions` | POST | Chat completions (streaming + non-streaming) |
| `/v1/voice` | WebSocket | Real-time voice interaction |
| `/health` | GET | Health check (no auth required) |
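
With `"stream": true`, `/v1/chat/completions` returns server-sent events. A minimal parser for the `data:` lines might look like this sketch; the chunk layout follows the usual OpenAI streaming format (`choices[0].delta.content`, terminated by `data: [DONE]`), which is an assumption based on the API's advertised compatibility.

```python
import json

def parse_sse_chunk(line: str):
    """Extract the content delta from one SSE data line, if any.

    Returns None for blank lines, non-data lines, and the final
    "data: [DONE]" sentinel used by OpenAI-style streams.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None
    chunk = json.loads(data)
    # Partial text arrives in choices[0].delta.content; the field
    # can be absent on role or stop chunks, hence the .get chain.
    return chunk["choices"][0].get("delta", {}).get("content")

sample = 'data: {"choices": [{"delta": {"content": "Hel"}}]}'
print(parse_sse_chunk(sample))  # -> Hel
```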

## Running Locally with Ollama

```bash
# Pull the model
ollama pull geniuspro-coder-v1

# Run interactively
ollama run geniuspro-coder-v1

# Serve via API
ollama serve
```

Once running, the model is available at `http://localhost:11434` with the same OpenAI-compatible API format.
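
Because the format matches, existing OpenAI tooling can simply point at the local port. As a dependency-free sketch using only the standard library (the model tag is the one pulled above; no API key is needed locally):

```python
import json
import urllib.request

def local_chat_request(prompt: str) -> urllib.request.Request:
    """Build a chat-completions request against the local Ollama server.

    Assumes the default Ollama port (11434) and the model tag
    used in the pull command above.
    """
    body = json.dumps({
        "model": "geniuspro-coder-v1",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        "http://localhost:11434/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = local_chat_request("Explain list comprehensions.")
# urllib.request.urlopen(req) sends it once `ollama serve` is running.
```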

## Infrastructure

GeniusPro Coder v1 runs on dedicated hardware for low-latency inference:

- **GPU**: NVIDIA RTX 5090 (32 GB VRAM)
- **Runtime**: [Ollama](https://ollama.com) for model serving
- **Gateway**: FastAPI reverse proxy with auth, rate limiting, and usage tracking
- **Deployment**: Ubuntu Server behind Nginx + Cloudflare Tunnel

## Limitations

- Optimized for English. Other languages may work but are not officially supported.
- Code generation quality varies by language — strongest in Python, JavaScript/TypeScript, and common web technologies.
- Not suitable for safety-critical applications without human review.
- Context window and output length are bounded by the underlying architecture.

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).