---
license: cc-by-nc-4.0
datasets:
- flytech/python-codes-25k
language:
- en
- py
tags:
- gpt2
- gguf
- python
- code-generation
pipeline_tag: text-generation
ollama:
  template: |
    {{- if .Prompt }}## Instruction:
    {{ .Prompt }}

    ### Input:
    Processing your request...

    {{ end -}}
    ### Response:
    {{- if .Response }}{{ .Response }}{{ end -}}
  params:
    temperature: 0.1
    top_p: 0.85
    repeat_penalty: 2.0
    repeat_last_n: 32
    num_ctx: 512
    stop:
    - "## Instruction:"
    - "### Input:"
    - "### Response:"
    - "<|endoftext|>"
---

# 🚀 Indenta-13M-Python (GGUF)

An optimized from-scratch model made with a custom tokenizer and GPT-2 architecture. 
This model is built specifically for lightning-fast Python code completions and basic code generation. At ~13M parameters, it runs with near-zero latency on absolutely any hardware!

---

## 🛠️ Web UI & Multi-Turn Stability Note

This model features a hyper-lightweight 13M parameter footprint optimized for single-task completions based directly on structural templates.

Because it lacks the large capacity required for conversational context processing, it can drop formatting structure if a graphical Web UI forces a massive, multi-message chat stream into its context layers.

### For Best Performance in Web UIs:
1. **Use New Chat Threads:** Click the "New Chat" or "Clear" button in your user interface between coding tasks. This keeps the model's focus squarely on your active prompt.
2. **Synchronized Template:** This model card includes an aligned layout format (`## Instruction:` and `### Input:`) matching the training data to stop token bleeding across chat iterations.