---
datasets:
- nvidia/Nemotron-CC-v2
- nvidia/Nemotron-Post-Training-Dataset-v2
- nvidia/Nemotron-Instruction-Following-Chat-v1
- nvidia/Nemotron-Science-v1
- nvidia/Nemotron-Agentic-v1
- nvidia/Nemotron-Competitive-Programming-v1
- nvidia/Nemotron-Math-Proofs-v1
- nvidia/Nemotron-RL-Agentic-Conversational-Tool-Use-Pivot-v1
- nvidia/Nemotron-RL-instruction_following
- nvidia/Nemotron-RL-agent-calendar_scheduling
- nvidia/Nemotron-RL-instruction_following-structured_outputs
---
Nvidia.Agentic.Coder-4B-GGUF
|
|
📌 Model Overview


Model Name: WithinUsAI/Nvidia.Agentic.Coder-4B-GGUF
Organization: Within Us AI
Model Type: Code LLM (agentic, instruction-following)
Parameter Size: 4B
Format: GGUF (quantized for local inference)
Primary Use: Agentic coding, tool-using workflows, software-engineering reasoning


This model is part of the Within Us AI ecosystem, which focuses on building agentic, reasoning-driven coding systems designed to think, act, and verify like real engineers.


⸻
|
|
| 🧬 Architecture & Lineage |
|
|
| * Base Family: NVIDIA Nemotron-style 4B class models (inferred lineage from naming + ecosystem alignment) |
| * Format Conversion: GGUF quantization for efficient local inference |
| * Training Approach: |
| * Instruction-tuned for coding tasks |
| * Agentic workflow emphasis (multi-step reasoning, tool usage) |
| * Likely merged / fine-tuned using Within Us AI proprietary pipelines |
|
|
| Related ecosystem models include: |
|
|
| * NVIDIA-Nemotron-3-Nano-4B |
| * Other 4B agentic coders and merges in the same class  |
|
|
| ⸻ |
|
|
⚙️ Key Capabilities


🧑‍💻 Code Intelligence


* Multi-language code generation
* Bug fixing and refactoring
* Structured output generation


🤖 Agentic Behavior


* Step-by-step reasoning
* Task decomposition
* Tool-calling alignment (a design goal)


🧠 Reasoning Focus


* Instruction following with logical chaining
* Designed for evaluation-style datasets (a tests-as-truth philosophy)
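The structured-output capability above is usually consumed by parsing the model's reply as JSON. A defensive parse sketch (the reply string is invented for illustration; local models often wrap JSON in a code fence, so the helper tolerates that):

```python
import json
import re

def extract_json(reply: str) -> dict:
    """Parse a model reply as JSON, tolerating a ```json ... ``` fence."""
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", reply, re.DOTALL)
    text = match.group(1) if match else reply
    return json.loads(text)

reply = '```json\n{"file": "app.py", "action": "refactor"}\n```'
print(extract_json(reply))  # {'file': 'app.py', 'action': 'refactor'}
```

In practice you would also catch json.JSONDecodeError and re-prompt the model when the reply is malformed.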
|
|
⸻
|
|
📦 GGUF Quantization


GGUF enables efficient local inference with tools such as:


* llama.cpp
* LM Studio
* Ollama (GGUF-compatible builds)


Typical quantizations for 4B GGUF models include:


* Q2_K (~1.8 GB)
* Q3_K (~2.0–2.3 GB)
* Q4_K (~2.5 GB, recommended balance)
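The sizes above follow from parameter count × average bits per weight. A rough estimator (the bits-per-weight values are illustrative averages for k-quant tensor mixes, not exact figures; real files also carry metadata overhead):

```python
# Illustrative average bits per weight for common k-quant mixes.
BITS_PER_WEIGHT = {"Q2_K": 3.35, "Q3_K": 3.91, "Q4_K": 4.85}

def approx_size_gb(n_params: float, quant: str) -> float:
    """Rough on-disk size in GB: parameters * bits-per-weight / 8."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9

for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{approx_size_gb(4e9, quant):.1f} GB")
```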

⸻

🚀 Intended Use

✅ Ideal Use Cases

* Local AI coding assistants
* Autonomous coding agents
* SWE-bench-style evaluation
* Tool-augmented workflows
* Offline developer copilots

⚠️ Limitations

* The 4B parameter count limits deep reasoning compared with larger models
* Performance depends heavily on prompt structure
* Tool use requires external orchestration (it is not built into the runtime)
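Because tool use needs external orchestration, the host application owns the loop: send the prompt, detect a tool request in the reply, run the tool, and feed the result back. A minimal sketch with a stubbed model (`fake_model` stands in for any GGUF runtime call, and the `TOOL:` reply convention is hypothetical, not a protocol this model guarantees):

```python
import json

def run_tool(name: str, args: dict) -> str:
    """Dispatch a tool call requested by the model (stubbed)."""
    tools = {"read_file": lambda a: f"contents of {a['path']}"}
    return tools[name](args)

def fake_model(messages: list) -> str:
    """Stand-in for a real GGUF runtime call."""
    if not any(m["role"] == "tool" for m in messages):
        return 'TOOL: {"name": "read_file", "args": {"path": "app.py"}}'
    return "Done: the bug is on line 3."

def agent_loop(prompt: str, max_steps: int = 4) -> str:
    """Run the model until it answers without requesting a tool."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_steps):
        reply = fake_model(messages)
        if reply.startswith("TOOL:"):
            call = json.loads(reply[len("TOOL:"):])
            result = run_tool(call["name"], call["args"])
            messages.append({"role": "tool", "content": result})
        else:
            return reply
    return "step limit reached"

print(agent_loop("Find the bug in app.py"))
```

The step limit guards against a model that keeps requesting tools forever, which is a common failure mode for small agentic models.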

⸻

🛠️ Usage Example (llama.cpp)

```bash
./main -m Nvidia.Agentic.Coder-4B.Q4_K.gguf \
  -p "Write a Python function to parse JSON logs and extract errors." \
  -n 512
```

Note: recent llama.cpp builds ship the CLI as llama-cli; substitute it for ./main if your build no longer includes the legacy binary.
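For Ollama (listed among the compatible tools above), the GGUF file can be wrapped in a Modelfile; the local path and parameter values below are illustrative:

```
FROM ./Nvidia.Agentic.Coder-4B.Q4_K.gguf
PARAMETER num_ctx 4096
PARAMETER temperature 0.2
```

Then register and run it with `ollama create agentic-coder-4b -f Modelfile` followed by `ollama run agentic-coder-4b`.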
|
|
⸻
|
|
🧪 Training Philosophy (Within Us AI)


Within Us AI focuses on:


* Agentic AI systems
* Test-driven training (tests-as-truth)
* Diff-first patching workflows
* Secure and auditable code generation
* Evaluation-first development pipelines
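A diff-first workflow means the model proposes patches rather than rewriting whole files, and the harness applies and tests them. Producing such a patch can be sketched with the stdlib difflib (the file name and bug are invented for the example):

```python
import difflib

# A buggy file and the model's proposed fix, as lists of lines.
before = ["def add(a, b):\n", "    return a - b\n"]
after = ["def add(a, b):\n", "    return a + b\n"]

patch = "".join(difflib.unified_diff(before, after,
                                     fromfile="a/calc.py", tofile="b/calc.py"))
print(patch)
```

The resulting unified diff can be applied with standard tooling (e.g. `git apply`), which keeps every model edit reviewable and auditable.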
|
|
⸻
|
|
📊 Evaluation


No formal benchmark results have been published yet.


Expected strengths:


* Strong instruction adherence
* Lightweight agentic reasoning
* Efficient local deployment


⸻
|
|
📚 Datasets & Training Sources


This model follows the Within Us AI methodology:


* Proprietary datasets created by Within Us AI
* Possibly third-party datasets used in training (no ownership claimed)
* Emphasis on:
  * Code reasoning traces
  * Agentic workflows
  * Evaluation-driven samples


⸻
|
|
📜 License


License Type: Custom / Other (Within Us AI License)


Terms:


* Within Us AI created the fine-tuning, merging, and training methodology
* The base model architecture originates from third-party LLM ecosystems (e.g., the NVIDIA Nemotron class)
* Third-party datasets may have been used in training; no ownership over them is claimed
* Full credit and acknowledgment belong to the original dataset and base-model creators


⸻
|
|
🙏 Acknowledgements


Special thanks to:


* NVIDIA Nemotron ecosystem contributors
* The open-source GGUF tooling community
* Dataset creators across Hugging Face
* The broader open-source AI research community


⸻
|
|
🔗 Links


* Model: https://huggingface.co/WithinUsAI/Nvidia.Agentic.Coder-4B-GGUF
* Organization: https://huggingface.co/WithinUsAI