# Omni-Nexus Alpha 7B
A high-density reasoning engine built on Qwen2.5-Coder-7B. Stack-3.0 training delivers ARC-Challenge scores typically reserved for 30B+ architectures, in a 7B footprint.
## Full Benchmark Results
| Benchmark | Score | Method | Notes |
|---|---|---|---|
| HumanEval | 85.37% (140/164) | 0-shot | Python code generation |
| ARC-Challenge | 83.28% (976/1172) | 0-shot | Science reasoning |
| MBPP | 79.80% (399/500) | 3-shot | Python problem solving |
| MMLU | 59.89% (8410/14042) | 5-shot | Multitask language understanding |
| HellaSwag | 59.61% (5986/10042) | 0-shot | Commonsense reasoning |
| GSM8K | 52.39% (691/1319) | 8-shot | Grade school math |
| Winogrande | 52.01% (659/1267) | 0-shot | Pronoun resolution (fill-in-the-blank) |
| TruthfulQA | 45.04% (368/817) | 0-shot | Truthfulness against misconceptions |
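Each percentage in the table follows directly from its correct/total count; a quick arithmetic sanity check (all numbers copied from the table above) can be run in Python:

```python
# Sanity-check the reported percentages against the raw correct/total counts.
# All (correct, total, reported %) triples are copied from the benchmark table.
results = {
    "HumanEval": (140, 164, 85.37),
    "ARC-Challenge": (976, 1172, 83.28),
    "MBPP": (399, 500, 79.80),
    "MMLU": (8410, 14042, 59.89),
    "HellaSwag": (5986, 10042, 59.61),
    "GSM8K": (691, 1319, 52.39),
    "Winogrande": (659, 1267, 52.01),
    "TruthfulQA": (368, 817, 45.04),
}

for name, (correct, total, reported) in results.items():
    computed = round(100 * correct / total, 2)
    assert computed == reported, f"{name}: {computed} != {reported}"
```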
## Quick Start — Ollama
```shell
# Pull directly
ollama pull my-ai-stack.com/omni-nexus-alpha-q8

# Or create a Modelfile
cat << 'EOF' > Modelfile
FROM ./Omni-Nexus-Alpha-Q8_0.gguf
TEMPLATE """{{ if .System }}<|system|>{{ .System }}{{ end }}{{ if .Prompt }}<|user|>{{ .Prompt }}{{ end }}<|assistant|>{{ .Response }}"""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
EOF

ollama create omni-nexus-alpha -f Modelfile
ollama run omni-nexus-alpha
```
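Once the model is running, it can also be queried programmatically through Ollama's local HTTP API. A minimal sketch, assuming the model name `omni-nexus-alpha` from the `ollama create` step and Ollama's default port 11434:

```python
import json
import urllib.request

def build_payload(prompt, model="omni-nexus-alpha"):
    # Non-streaming request body for Ollama's /api/generate endpoint.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt, host="http://localhost:11434"):
    """Send one generation request to a local Ollama server and return the text."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, `generate("Write a Python function that reverses a string.")` returns the model's completion as a plain string.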
All benchmarks were evaluated on a Google Cloud Tesla V100 (16 GB) using the Ollama inference engine. GSM8K, HumanEval, MBPP, TruthfulQA, MMLU, and ARC were run via lm-evaluation-harness. HellaSwag and Winogrande used a custom chat-API approach, since Ollama does not expose per-token logprobs.
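Because the API returns no per-token logprobs, likelihood-based multiple-choice scoring is unavailable. One workaround in the spirit of the chat-API approach described above (a sketch, not the exact harness used for these numbers) is to present lettered options and parse the letter the model replies with:

```python
import re
import string

def format_mc_prompt(question, choices):
    """Render a multiple-choice item as a lettered prompt for a chat model."""
    lines = [question]
    for letter, choice in zip(string.ascii_uppercase, choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer with a single letter.")
    return "\n".join(lines)

def parse_choice(reply, num_choices):
    """Extract the first standalone choice letter from the model's reply, or None."""
    valid = string.ascii_uppercase[:num_choices]
    match = re.search(rf"\b([{valid}])\b", reply)
    return match.group(1) if match else None
```

Accuracy is then the fraction of items where the parsed letter matches the gold answer; replies that yield `None` count as incorrect.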
Raw results are available in the benchmarks/ folder.
## Evaluation results (Open LLM Leaderboard)
- HumanEval (pass_at_1): 85.37%
- ARC-Challenge (acc_norm): 83.28%
- MBPP (pass_at_1): 79.80%
- MMLU (acc_norm): 59.89%
- HellaSwag (acc_norm): 59.61%
- GSM8K (exact_match): 52.39%
- Winogrande (acc): 52.01%
- TruthfulQA (mc2): 45.04%