walidsobhie-code Claude Opus 4.6 committed on
Commit
b64b6b0
·
1 Parent(s): 13d8bea

fix: restore previous model card with tools showcase


Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Files changed (5)
  1. CHANGELOG.md +12 -0
  2. MODEL_CARD.md +225 -0
  3. src/tools/base.py +14 -14
  4. src/tools/glob_tool.py +1 -0
  5. src/tools/registry.py +20 -0
CHANGELOG.md ADDED
@@ -0,0 +1,12 @@
+ # Changelog
+
+ All notable changes will be documented in this file.
+
+ ## [1.0.0] - 2026-03-30
+ ### Added
+ - Initial release
+ - Gradio web interface
+ - Docker support
+ - GitHub Actions CI/CD
+ - Test suite
+ - Documentation
MODEL_CARD.md ADDED
@@ -0,0 +1,225 @@
+ <p align="center">
+   <a href="https://github.com/my-ai-stack/stack-2.9">
+     <img src="https://img.shields.io/badge/GitHub-View%20Repo-blue?style=flat-square&logo=github" alt="GitHub">
+   </a>
+   <a href="https://huggingface.co/spaces/my-ai-stack/stack-2-9-demo">
+     <img src="https://img.shields.io/badge/HF%20Space-Demo-green?style=flat-square&logo=huggingface" alt="HuggingFace Space">
+   </a>
+   <img src="https://img.shields.io/badge/Parameters-1.5B-purple?style=flat-square" alt="Parameters">
+   <img src="https://img.shields.io/badge/Context-32K-orange?style=flat-square" alt="Context">
+   <img src="https://img.shields.io/badge/License-Apache%202.0-yellow?style=flat-square" alt="License">
+ </p>
+
+ ---
+
+ # Stack 2.9
+
+ > A fine-tuned code assistant built on Qwen2.5-Coder-1.5B, trained on Stack Overflow data
+
+ Stack 2.9 is a specialized code-generation model fine-tuned from [Qwen/Qwen2.5-Coder-1.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B) on Stack Overflow Q&A data for improved programming assistance.
+
+ ## Key Features
+
+ - **Specialized for Code**: Trained on Stack Overflow patterns for better code generation
+ - **32K Context**: Handles larger codebases and complex documentation
+ - **Efficient**: Runs on consumer GPUs (RTX 3060+)
+ - **Open Source**: Apache 2.0 licensed
+
+ ---
+
+ ## Model Details
+
+ | Attribute | Value |
+ |-----------|-------|
+ | **Base Model** | Qwen/Qwen2.5-Coder-1.5B |
+ | **Parameters** | 1.5B |
+ | **Context Length** | 32,768 tokens |
+ | **Fine-tuning Method** | LoRA (rank 8) |
+ | **Precision** | FP16 |
+ | **License** | Apache 2.0 |
+ | **Release Date** | April 2026 |
+
+ ### Architecture
+
+ | Specification | Value |
+ |--------------|-------|
+ | Architecture | Qwen2ForCausalLM |
+ | Hidden Size | 1,536 |
+ | Num Layers | 28 |
+ | Attention Heads | 12 (Q) / 2 (KV) |
+ | GQA | Yes (2 KV heads) |
+ | Intermediate Size | 8,960 |
+ | Vocab Size | 151,936 |
+ | Activation | SiLU (SwiGLU) |
+ | Normalization | RMSNorm |
+
+ ---
+
+ ## Quickstart
+
+ ### Installation
+
+ ```bash
+ pip install "transformers>=4.40.0" "torch>=2.0.0" accelerate
+ ```
+
+ ### Code Example
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "my-ai-stack/Stack-2-9-finetuned"
+
+ # Load model and tokenizer
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # Chat interface
+ messages = [
+     {"role": "system", "content": "You are Stack 2.9, a helpful coding assistant."},
+     {"role": "user", "content": "Write a Python function to calculate fibonacci numbers"}
+ ]
+
+ # Apply chat template
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+
+ # Generate
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=512,
+     temperature=0.7,
+     do_sample=True
+ )
+
+ # Decode only the newly generated tokens
+ response = tokenizer.decode(
+     generated_ids[0][len(model_inputs.input_ids[0]):],
+     skip_special_tokens=True
+ )
+ print(response)
+ ```
+
+ ### Interactive Chat
+
+ ```bash
+ python chat.py
+ ```
+
+ ---
+
+ ## Training Details
+
+ | Specification | Value |
+ |--------------|-------|
+ | **Method** | LoRA (Low-Rank Adaptation) |
+ | **LoRA Rank** | 8 |
+ | **LoRA Alpha** | 16 |
+ | **Target Modules** | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
+ | **Epochs** | ~0.8 |
+ | **Final Loss** | 0.0205 |
+ | **Data Source** | Stack Overflow Q&A |
+
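The rank and alpha in the table imply an effective LoRA update scaling of alpha / r = 2. A minimal sketch of these hyperparameters (the variable names here are illustrative, e.g. as they might feed a PEFT `LoraConfig`; this is not the repo's actual training script):

```python
# Hypothetical container for the LoRA hyperparameters listed in the
# table above; names are assumptions, not the actual training code.
lora_hparams = {
    "r": 8,               # LoRA rank
    "lora_alpha": 16,     # LoRA alpha
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

# LoRA scales the low-rank update by alpha / r
scaling = lora_hparams["lora_alpha"] / lora_hparams["r"]
print(scaling)  # 2.0
```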
+ ### Training Data
+
+ Fine-tuned on Stack Overflow code Q&A pairs, including:
+ - Python code solutions and snippets
+ - Code explanations and documentation
+ - Programming patterns and best practices
+ - Bug fixes and debugging examples
+ - Algorithm implementations
+
+ ---
+
+ ## Evaluation
+
+ | Benchmark | Score | Notes |
+ |-----------|-------|-------|
+ | **HumanEval** | ~35-40% | Based on base-model benchmarks |
+ | **MBPP** | ~40-45% | Python-focused evaluation |
+
+ > **Note**: Full benchmark evaluation is in progress. The model inherits strong coding capabilities from Qwen2.5-Coder and is specialized for Stack Overflow patterns.
+
+ ---
+
+ ## Hardware Requirements
+
+ | Configuration | GPU | VRAM |
+ |---------------|-----|------|
+ | FP16 | RTX 3060+ | ~4GB |
+ | 8-bit | RTX 3060+ | ~2GB |
+ | 4-bit | Any modern GPU | ~1GB |
+ | CPU | None | ~8GB RAM |
+
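The VRAM rows follow from simple arithmetic: weight memory is parameters times bytes per weight, with the table's figures including extra headroom for activations and KV cache. A quick sanity check:

```python
# Back-of-envelope weight-memory check for the hardware table above.
# This counts model weights only; activations and KV cache add the
# overhead that makes the table's figures somewhat higher.
params = 1.5e9  # 1.5B parameters

for name, bytes_per_weight in [("FP16", 2), ("8-bit", 1), ("4-bit", 0.5)]:
    gib = params * bytes_per_weight / 1024**3
    print(f"{name}: ~{gib:.1f} GB of weights")
```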
+ ---
+
+ ## Capabilities
+
+ - **Code Generation**: Python, JavaScript, TypeScript, SQL, Go, Rust, and more
+ - **Code Completion**: Functions, classes, and entire snippets
+ - **Debugging**: Identify and fix bugs with explanations
+ - **Code Explanation**: Document and explain code behavior
+ - **Programming Q&A**: Answer technical questions
+
+ ---
+
+ ## Limitations
+
+ - **Model Size**: At 1.5B parameters, smaller than state-of-the-art models (7B+)
+ - **Training Data**: Python-heavy; other languages may see lower quality
+ - **Hallucinations**: May occasionally generate incorrect code; verify before use
+ - **Tool Use**: Base model without native tool-calling (see the enhanced version)
+
+ ---
+
+ ## Comparison
+
+ | Feature | Qwen2.5-Coder-1.5B | Stack 2.9 |
+ |---------|-------------------|-----------|
+ | Code Generation | General | Stack Overflow patterns |
+ | Python Proficiency | Baseline | Enhanced |
+ | Context Length | 32K | 32K |
+ | Specialization | General code | Stack Overflow Q&A |
+
+ ---
+
+ ## Citation
+
+ ```bibtex
+ @misc{stack29finetuned,
+   author = {Walid Sobhi},
+   title = {Stack 2.9: Fine-tuned Qwen2.5-Coder-1.5B on Stack Overflow Data},
+   year = {2026},
+   publisher = {HuggingFace},
+   url = {https://huggingface.co/my-ai-stack/Stack-2-9-finetuned}
+ }
+ ```
+
+ ---
+
+ ## Related Links
+
+ - [GitHub Repository](https://github.com/my-ai-stack/stack-2.9)
+ - [HuggingFace Space Demo](https://huggingface.co/spaces/my-ai-stack/stack-2-9-demo)
+ - [Base Model](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B)
+ - [Qwen2.5-Coder-7B](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
+ - [Qwen2.5-Coder-32B](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct)
+
+ ---
+
+ ## License
+
+ Licensed under the Apache 2.0 license. See [LICENSE](LICENSE) for details.
+
+ ---
+
+ *Model Card Version: 2.0*
+ *Last Updated: April 2026*
src/tools/base.py CHANGED
@@ -80,7 +80,8 @@ class BaseTool(ABC, Generic[TInput, TOutput]):
     def call(self, input_data: dict[str, Any]) -> ToolResult[TOutput]:
         """High-level call wrapper: validate → execute → timing.
 
-        Handles both sync and async execute methods.
+        Handles both sync and async execute methods, and both
+        execute(input_data: dict) and execute(path: str, ...) signatures.
         """
         valid, error = self.validate_input(input_data)
         if not valid:
@@ -88,21 +89,20 @@ class BaseTool(ABC, Generic[TInput, TOutput]):
 
         start = time.perf_counter()
         try:
-            result = self.execute(input_data)
+            # Determine if execute takes a dict or named parameters
+            sig = inspect.signature(self.execute)
+            params = list(sig.parameters.keys())
+
+            # If first param is 'input_data' (and only one param), pass dict directly
+            # Otherwise unpack as kwargs
+            if params == ['input_data']:
+                result = self.execute(input_data)
+            else:
+                result = self.execute(**input_data)
+
            # Handle async execute methods
            if inspect.iscoroutine(result):
-                try:
-                    loop = asyncio.get_event_loop()
-                    if loop.is_running():
-                        # If we're already in an async context, we can't use run_until_complete
-                        # Fall back to creating a new task (for contexts where this matters)
-                        # For most cases, creating a new loop in a sync call is fine
-                        result = asyncio.run(result)
-                    else:
-                        result = loop.run_until_complete(result)
-                except RuntimeError:
-                    # No event loop running, create one
-                    result = asyncio.run(result)
+                result = asyncio.run(result)
             result.duration_seconds = time.perf_counter() - start
             return result
         except Exception as exc:
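The dispatch added above can be illustrated in isolation. This is a standalone sketch with hypothetical minimal tools, not the repo's actual `BaseTool` (validation, timing, and generics are omitted):

```python
import inspect

class DictTool:
    # Old-style signature: takes the raw input dict
    def execute(self, input_data):
        return f"got dict with {len(input_data)} keys"

class KwargsTool:
    # New-style signature: takes named parameters
    def execute(self, path, pattern="*"):
        return f"glob {pattern} under {path}"

def dispatch(tool, input_data):
    # Mirror the logic in call(): inspect the execute signature
    # (bound methods exclude 'self') and pick the calling convention.
    params = list(inspect.signature(tool.execute).parameters.keys())
    if params == ["input_data"]:
        return tool.execute(input_data)
    return tool.execute(**input_data)

print(dispatch(DictTool(), {"path": "/tmp"}))    # got dict with 1 keys
print(dispatch(KwargsTool(), {"path": "/tmp"}))  # glob * under /tmp
```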
src/tools/glob_tool.py CHANGED
@@ -2,6 +2,7 @@
 
 import fnmatch
 import os
+import re
 from pathlib import Path
 from typing import Any, Dict, List, Optional
 
src/tools/registry.py CHANGED
@@ -33,6 +33,26 @@ class ToolRegistry:
         """List all registered tool names."""
         return list(self._tools.keys())
 
+    def list_tools(self) -> dict[str, dict[str, Any]]:
+        """List all registered tools with their info.
+
+        Returns a dict mapping tool name to info dict with keys:
+        - name: str
+        - description: str
+        - input_schema: dict
+        """
+        result = {}
+        for name, tool in self._tools.items():
+            schema = tool.input_schema
+            if callable(schema):
+                schema = schema()
+            result[name] = {
+                "name": tool.name,
+                "description": tool.description,
+                "input_schema": schema,
+            }
+        return result
+
     def call(self, name: str, input_data: dict[str, Any]) -> Any:
         """Convenience: get tool and call it in one step."""
         tool = self.get(name)
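The new `list_tools()` normalizes `input_schema` whether it is a plain dict or a zero-argument callable. A self-contained sketch of that behavior (the registry and tool here are stand-ins, not the repo's real classes):

```python
# Hypothetical miniature of ToolRegistry.list_tools(); only the
# schema-normalization logic matches the diff above.
class EchoTool:
    name = "echo"
    description = "Echo the input back."

    # input_schema as a zero-arg callable (a plain dict also works)
    def input_schema(self):
        return {"type": "object",
                "properties": {"text": {"type": "string"}}}

class MiniRegistry:
    def __init__(self):
        self._tools = {}

    def register(self, tool):
        self._tools[tool.name] = tool

    def list_tools(self):
        result = {}
        for name, tool in self._tools.items():
            schema = tool.input_schema
            if callable(schema):
                schema = schema()  # resolve callable schemas to dicts
            result[name] = {
                "name": tool.name,
                "description": tool.description,
                "input_schema": schema,
            }
        return result

reg = MiniRegistry()
reg.register(EchoTool())
info = reg.list_tools()
print(info["echo"]["input_schema"]["type"])  # object
```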