---
license: apache-2.0
language:
- en
tags:
- code
- assistant
- ollama
- openai-compatible
- streaming
- voice
pipeline_tag: text-generation
inference: false
---

# GeniusPro Coder v1

**GeniusPro Coder v1** is a coding-focused AI assistant model for intelligent code generation, code explanation, and general-purpose AI assistance.

## Highlights

- Code generation, debugging, and explanation across multiple languages
- Natural conversational ability for non-code tasks
- OpenAI-compatible API (drop-in replacement for existing tooling)
- Streaming support for real-time token delivery
- Voice mode with concise, spoken-friendly responses
- Runs locally on consumer hardware via [Ollama](https://ollama.com)

## Intended Use

GeniusPro Coder v1 is designed for:

- **Code assistance** — generating, reviewing, debugging, and explaining code
- **Chat** — general-purpose question answering and conversation
- **Voice interaction** — concise, natural-language responses optimized for text-to-speech

It powers the GeniusPro platform, which includes a web-based chat dashboard and a real-time voice assistant.

## API

### Supported Parameters

| Parameter | Description |
|-----------|-------------|
| `temperature` | Controls randomness (0.0 = deterministic, 1.0 = creative) |
| `top_p` | Nucleus sampling threshold |
| `max_tokens` | Maximum tokens to generate |
| `stop` | Stop sequences |
| `stream` | Enable streaming responses (SSE) |
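The parameters above map directly onto an OpenAI-style chat completions request body. A minimal sketch of building one in Python (the prompt, parameter values, and stop sequence are illustrative, not model defaults):

```python
import json

# Example request body using the supported sampling parameters.
# All values here are illustrative; tune them for your use case.
payload = {
    "model": "geniuspro-coder-v1",
    "messages": [
        {"role": "user", "content": "Write a function that reverses a string."}
    ],
    "temperature": 0.2,    # low randomness for code generation
    "top_p": 0.9,          # nucleus sampling threshold
    "max_tokens": 512,     # cap on generated tokens
    "stop": ["User:"],     # optional stop sequences (illustrative)
    "stream": False,       # set True for SSE streaming
}

body = json.dumps(payload)  # serialized request body to POST
```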

### Available Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/v1/models` | GET | List available models |
| `/v1/chat/completions` | POST | Chat completions (streaming + non-streaming) |
| `/v1/voice` | WebSocket | Real-time voice interaction |
| `/health` | GET | Health check (no auth required) |
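The chat completions endpoint can be called with any OpenAI-compatible client or plain HTTP. A minimal stdlib sketch of constructing an authenticated request (the base URL, Bearer-token auth scheme, and API key placeholder are assumptions; check your deployment's gateway configuration):

```python
import json
import urllib.request

def build_chat_request(base_url, api_key, payload):
    """Build an authenticated POST request for /v1/chat/completions.

    The Bearer-token scheme is an assumption based on the
    OpenAI-compatible API convention; verify against your gateway.
    """
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_chat_request(
    "http://localhost:11434",
    "YOUR_API_KEY",  # placeholder, not a real key
    {"model": "geniuspro-coder-v1",
     "messages": [{"role": "user", "content": "Hi"}]},
)
# To send: urllib.request.urlopen(req)
```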

## Running Locally with Ollama

```bash
# Pull the model
ollama pull geniuspro-coder-v1

# Run interactively
ollama run geniuspro-coder-v1

# Serve via API
ollama serve
```

Once running, the model is available at `http://localhost:11434` with the same OpenAI-compatible API format.
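With `"stream": true`, responses arrive as Server-Sent Events, each `data:` line carrying a JSON chunk. A minimal sketch of extracting the text deltas (the chunk shape follows the OpenAI streaming convention; verify against actual responses from your deployment):

```python
import json

def parse_sse_stream(lines):
    """Yield content deltas from OpenAI-style SSE 'data:' lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            yield delta

# Simulated stream; real lines come from /v1/chat/completions with "stream": true
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
print("".join(parse_sse_stream(sample)))  # → Hello
```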

## Infrastructure

GeniusPro Coder v1 runs on dedicated hardware for low-latency inference:

- **GPU**: NVIDIA RTX 5090 (32 GB VRAM)
- **Runtime**: [Ollama](https://ollama.com) for model serving
- **Gateway**: FastAPI reverse proxy with auth, rate limiting, and usage tracking
- **Deployment**: Ubuntu Server behind Nginx + Cloudflare Tunnel

## Limitations

- Optimized for English. Other languages may work but are not officially supported.
- Code generation quality varies by language — strongest in Python, JavaScript/TypeScript, and common web technologies.
- Not suitable for safety-critical applications without human review.
- Context window and output length are bounded by the underlying architecture.

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).