Spaces:

build-small-hackathon
/

CodeFlow

Running

App Files Files Community

Rishi-Jain-27 commited on 3 days ago

Commit

dc88fe3

1 Parent(s): f7edb7f

Updated README

Browse files

Files changed (1) hide show

README.md +24 -10

README.md CHANGED Viewed

@@ -17,6 +17,7 @@ tags:
 - achievement:offbrand
 - achievement:llama
 - achievement:fieldnotes
 - build-small-hackathon
 - backyard-ai
 - llama-cpp
@@ -39,13 +40,14 @@ tags:
 ### 🔗 Links
-[🚀 **Live Space**][space] · [▶️ **Demo Video**][video] · [🐦 **Social Post**][social] · [📓 **Field Notes (blog)**][blog] · [🔍 **Agent Traces**][traces]
 [space]:  https://huggingface.co/spaces/build-small-hackathon/CodeFlow  "Hugging Face Space"
 [video]:  https://youtu.be/R5GbpN9FVxo  "Demo video"
 [social]: https://www.linkedin.com/feed/update/urn:li:share:7471327684539785217/  "Social post"
 [blog]:   https://huggingface.co/blog/build-small-hackathon/codeflow-field-notes  "Field notes / blog post"
 [traces]: https://huggingface.co/datasets/build-small-hackathon/codeflow-agent-traces  "Agent traces dataset"
 ---
@@ -62,7 +64,7 @@ Reading unfamiliar code means simulating its control flow in your head — chasi
                                     │
                     number the source lines + structured system prompt
                                     │
-                     Qwen3-Coder-30B-A3B   (llama.cpp · CPU)
                                     │
                  <thinking> …reasoning… </thinking>
                  graph TD … nodes & edges …
@@ -76,19 +78,30 @@ Reading unfamiliar code means simulating its control flow in your head — chasi
 ```
 1. You paste code (or pick a pre-rendered example) into the **CodeMirror** editor and hit **Generate**.
-2. The backend numbers the source lines and sends them with a strict system prompt to **Qwen3-Coder** running on **llama.cpp**.
 3. The model returns hidden `<thinking>`, the Mermaid `graph`, and a `<linemap>` mapping every node to its source line(s).
 4. The server strips the reasoning, **validates** the line-map against the source, sanitizes labels for Mermaid, and returns `{ mermaid, linemap }`.
 5. The frontend renders the diagram with a **trace-the-path reveal** that flows out of a persistent Start node while the canvas scrolls along in real time.
 6. **Node ↔ code linking:** hover a node to highlight its source lines, click a node to jump-and-edit them, or move your cursor over a line to light up the matching node.
 7. Every generation is captured as a structured **agent trace** (`/traces`).
 ## 🧰 Tech Stack
 | Layer | What it is | Used for |
 |---|---|---|
-| **Model** | [Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen) (Mixture-of-Experts) | Code → Mermaid + line-map generation |
-| **Quantization** | [Unsloth](https://huggingface.co/unsloth) Dynamic **UD-Q3_K_XL** GGUF (~3-bit) | Shrinks the 30B model to run on CPU |
 | **Inference** | [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) (llama.cpp) | Local CPU inference (`n_ctx=4096`) |
 | **Model fetch** | `huggingface_hub` | Downloads the GGUF on first run |
 | **Server** | [Gradio](https://www.gradio.app/) `gr.Server` + FastAPI | `/generate_flowchart` API, `/` UI, `/traces` |
@@ -104,14 +117,14 @@ Reading unfamiliar code means simulating its control flow in your head — chasi
 ## 🔢 Total Parameters
-CodeFlow is driven by **Qwen3-Coder-30B-A3B-Instruct** — a **Mixture-of-Experts** model with:
-- **≈ 30.5 billion total parameters**
 - **≈ 3.3 billion active parameters per token** (128 experts, 8 activated)
-It's served as an **Unsloth Dynamic ~3-bit (UD-Q3_K_XL) GGUF**, which compresses those 30B weights to a CPU-runnable footprint (~13 GB on disk) — letting a 30B-class model generate diagrams **off the grid**, with no GPU and no external API.
-## 🏅 Badges (5 / 6)
 These map to the Space tags above.
@@ -122,6 +135,7 @@ These map to the Space tags above.
 | 📓 **Field Notes** | See the [blog post][blog]. |
 | 🤝 **Sharing is Caring** | Open-source under **MIT**, a public Space, plus a [social post][social] sharing the process and learnings. |
 | 🤖 **Agentic** | Every model generation is captured as a structured agent trace (input code, the model's reasoning, output, token usage, latency), downloadable at [`/traces`][traces]. |
 ## 🎥 Demo
@@ -187,7 +201,7 @@ CodeFlow/
 ## 🙏 Credits
-- **Model:** [Qwen3-Coder](https://huggingface.co/Qwen) (Qwen Team, Alibaba) — GGUF quant by [Unsloth](https://huggingface.co/unsloth).
 - **Inference:** [llama.cpp](https://github.com/ggml-org/llama.cpp) via [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) (Andrei Betlen).
 - **App framework:** [Gradio](https://www.gradio.app/) (Hugging Face).
 - **Diagrams:** [Mermaid.js](https://mermaid.js.org/) · **Editor:** [CodeMirror](https://codemirror.net/).

 - achievement:offbrand
 - achievement:llama
 - achievement:fieldnotes
+- achievement:welltuned
 - build-small-hackathon
 - backyard-ai
 - llama-cpp
 ### 🔗 Links
+[🚀 **Live Space**][space] · [▶️ **Demo Video**][video] · [🐦 **Social Post**][social] · [📓 **Field Notes (blog)**][blog] · [🔍 **Agent Traces**][traces] · [🎛️ **Fine-Tuned Model**][model]
 [space]:  https://huggingface.co/spaces/build-small-hackathon/CodeFlow  "Hugging Face Space"
 [video]:  https://youtu.be/R5GbpN9FVxo  "Demo video"
 [social]: https://www.linkedin.com/feed/update/urn:li:share:7471327684539785217/  "Social post"
 [blog]:   https://huggingface.co/blog/build-small-hackathon/codeflow-field-notes  "Field notes / blog post"
 [traces]: https://huggingface.co/datasets/build-small-hackathon/codeflow-agent-traces  "Agent traces dataset"
+[model]:  https://huggingface.co/build-small-hackathon/codeflow-qwen-3-finetuning  "Fine-tuned model"
 ---
                                     │
                     number the source lines + structured system prompt
                                     │
+          CodeFlow fine-tune of Qwen3-Coder-30B-A3B  (llama.cpp · CPU)
                                     │
                  <thinking> …reasoning… </thinking>
                  graph TD … nodes & edges …
 ```
 1. You paste code (or pick a pre-rendered example) into the **CodeMirror** editor and hit **Generate**.
+2. The backend numbers the source lines and sends them with a strict system prompt to the **CodeFlow fine-tune of Qwen3-Coder** running on **llama.cpp**.
 3. The model returns hidden `<thinking>`, the Mermaid `graph`, and a `<linemap>` mapping every node to its source line(s).
 4. The server strips the reasoning, **validates** the line-map against the source, sanitizes labels for Mermaid, and returns `{ mermaid, linemap }`.
 5. The frontend renders the diagram with a **trace-the-path reveal** that flows out of a persistent Start node while the canvas scrolls along in real time.
 6. **Node ↔ code linking:** hover a node to highlight its source lines, click a node to jump-and-edit them, or move your cursor over a line to light up the matching node.
 7. Every generation is captured as a structured **agent trace** (`/traces`).
+## 🎛️ Fine-Tuning
+CodeFlow runs a [**LoRA fine-tune**][model] of **Qwen3-Coder-30B-A3B-Instruct** (≈30.5B params), specialized for the code → Mermaid + `<linemap>` task rather than relying on the base model's general coding ability.
+- **Data:** **2,400 synthetic examples** (2,208 train / 192 val — 8% holdout), built from **22 control-flow templates** across **Python, JavaScript, C++, and C**.
+- **Method:** LoRA `r=16, α=32` on the attention + MLP projections, **bf16**, cosine schedule — then merged and exported to a **Q3_K_L GGUF** for CPU inference.
+- **Validation:** the holdout is **hard-validated** — generated outputs are syntax-checked / compiled, not just eyeballed.
+See the [model card][model] for the full data engine, `finetune.py` options, and dataset preview.
 ## 🧰 Tech Stack
 | Layer | What it is | Used for |
 |---|---|---|
+| **Model** | [**CodeFlow fine-tune**][model] of [Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen) (Mixture-of-Experts) | Code → Mermaid + line-map generation |
+| **Fine-tuning** | LoRA SFT (`r=16, α=32`) on attention + MLP projections, merged to GGUF | Specializes the base model for the code → Mermaid + line-map task |
+| **Quantization** | **Q3_K_L** GGUF (~3-bit) | Shrinks the 30B model to run on CPU |
 | **Inference** | [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) (llama.cpp) | Local CPU inference (`n_ctx=4096`) |
 | **Model fetch** | `huggingface_hub` | Downloads the GGUF on first run |
 | **Server** | [Gradio](https://www.gradio.app/) `gr.Server` + FastAPI | `/generate_flowchart` API, `/` UI, `/traces` |
 ## 🔢 Total Parameters
+CodeFlow is driven by a [**LoRA fine-tune**][model] of **Qwen3-Coder-30B-A3B-Instruct** — a **Mixture-of-Experts** model with:
+- **≈ 30.5 billion total parameters** (well under the 32B cap)
 - **≈ 3.3 billion active parameters per token** (128 experts, 8 activated)
+It's served as a **~3-bit (Q3_K_L) GGUF**, which compresses those 30B weights to a CPU-runnable footprint (~13 GB on disk) — letting a 30B-class model generate diagrams **off the grid**, with no GPU and no external API.
+## 🏅 Badges (6 / 6)
 These map to the Space tags above.
 | 📓 **Field Notes** | See the [blog post][blog]. |
 | 🤝 **Sharing is Caring** | Open-source under **MIT**, a public Space, plus a [social post][social] sharing the process and learnings. |
 | 🤖 **Agentic** | Every model generation is captured as a structured agent trace (input code, the model's reasoning, output, token usage, latency), downloadable at [`/traces`][traces]. |
+| 🎛️ **Well-Tuned** | A [**LoRA fine-tune**][model] of Qwen3-Coder-30B-A3B-Instruct (**≈30.5B params — under the 32B cap**), specialized for the code → Mermaid + `<linemap>` task and shipped as the GGUF the Space actually runs. |
 ## 🎥 Demo
 ## 🙏 Credits
+- **Model:** [CodeFlow fine-tune][model] of [Qwen3-Coder](https://huggingface.co/Qwen) (Qwen Team, Alibaba), built with [Unsloth](https://huggingface.co/unsloth).
 - **Inference:** [llama.cpp](https://github.com/ggml-org/llama.cpp) via [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) (Andrei Betlen).
 - **App framework:** [Gradio](https://www.gradio.app/) (Hugging Face).
 - **Diagrams:** [Mermaid.js](https://mermaid.js.org/) · **Editor:** [CodeMirror](https://codemirror.net/).