Indenta-13M-Python / README.md
metadata
license: cc-by-nc-4.0
datasets:
  - flytech/python-codes-25k
language:
  - en
  - py
tags:
  - gpt2
  - gguf
  - python
  - code-generation
pipeline_tag: text-generation
ollama:
  template: |
    {{- if .Prompt }}## Instruction:
    {{ .Prompt }}

    ### Input:
    Processing your request...

    {{ end -}}
    ### Response:
    {{- if .Response }}{{ .Response }}{{ end -}}
  params:
    temperature: 0.1
    top_p: 0.85
    repeat_penalty: 2
    repeat_last_n: 32
    num_ctx: 512
    stop:
      - '## Instruction:'
      - '### Input:'
      - '### Response:'
      - <|endoftext|>

🚀 Indenta-13M-Python (GGUF)

A model trained from scratch with a custom tokenizer and the GPT-2 architecture. It is built for fast Python code completion and basic code generation. At ~13M parameters, it is small enough to run with very low latency on most hardware.


🛠️ Web UI & Multi-Turn Stability Note

This model has a very small 13M-parameter footprint and is optimized for single-task completions that follow its prompt template.

Because it lacks the capacity needed to track conversational context, it can lose the expected output formatting when a graphical Web UI feeds a long, multi-message chat history into its context window (num_ctx is set to 512 in this card's parameters).

For Best Performance in Web UIs:

  1. Use New Chat Threads: Click the "New Chat" or "Clear" button in your user interface between coding tasks. This keeps earlier messages out of the context, so the model sees only your active prompt.
  2. Use the Bundled Template: This model card ships a prompt template (the ## Instruction: and ### Input: headers) that matches the training data format, together with matching stop sequences, so generation ends before the model starts emitting the headers of a new turn.
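
For clients that do not apply the bundled template automatically, the two tips above can be sketched in Python. This is a minimal illustration, not an official client: `PROMPT_LAYOUT` is one reading of the Ollama template in this card's metadata, and `build_prompt` and `truncate_at_stop` are hypothetical helper names introduced here. Only the header strings and stop sequences come from the card itself.

```python
# Prompt layout as read from the Ollama template in this card's metadata.
# This is an interpretation of that template, not an official API.
PROMPT_LAYOUT = (
    "## Instruction:\n"
    "{instruction}\n"
    "\n"
    "### Input:\n"
    "Processing your request...\n"
    "\n"
    "### Response:"
)

# Stop sequences copied from the `stop` list in the metadata.
STOP_SEQUENCES = ("## Instruction:", "### Input:", "### Response:", "<|endoftext|>")


def build_prompt(instruction: str) -> str:
    """Render a single-turn prompt; earlier chat turns are deliberately left out."""
    return PROMPT_LAYOUT.format(instruction=instruction.strip())


def truncate_at_stop(text: str, stops=STOP_SEQUENCES) -> str:
    """Cut generated text at the earliest stop sequence, if one appears."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


if __name__ == "__main__":
    # Each coding task gets its own fresh prompt, mirroring the "New Chat" advice.
    print(build_prompt("Write a function that adds two numbers"))
    # A completion that ran past its answer is trimmed at the first header.
    raw = "def add(a, b):\n    return a + b\n## Instruction:\nnext task"
    print(truncate_at_stop(raw))
```

Building each prompt from the current instruction alone plays the same role as starting a new chat thread, and trimming at the stop sequences approximates what the `stop` parameter does when the runtime enforces it.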