---
license: mit
---
# Qwen2-0.5B-Python-SFT (LoRA)

## Overview

This model is a supervised fine-tuned (SFT) version of **Qwen/Qwen2-0.5B**, adapted for Python instruction-following tasks.

Fine-tuning was performed with QLoRA (4-bit quantization plus LoRA adapters) on a curated Python instruction dataset to improve structured code generation and instruction alignment.

This repository contains **LoRA adapter weights only**, not the full base model.
|
## Base Model

* Base: `Qwen/Qwen2-0.5B`
* Architecture: decoder-only Transformer
* Parameters: 0.5B
* License: refer to the original Qwen license

The base model must be loaded separately; the adapters in this repository are applied on top of it.
|
## Training Dataset

* Dataset: `iamtarun/python_code_instructions_18k_alpaca`
* Size: ~18,000 instruction-output pairs
* Format: Alpaca-style instruction → response
* Domain: Python programming tasks

Each training sample followed this template:

```
Below is an instruction that describes a task.
Write a response that appropriately completes the request.

### Instruction:
...

### Response:
...
```
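At inference time the prompt must match this training template exactly. It can be assembled with a small helper; `build_prompt` is an illustrative name, not part of this repository:

```python
def build_prompt(instruction: str) -> str:
    """Build the Alpaca-style prompt used during training."""
    return (
        "Below is an instruction that describes a task.\n"
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

print(build_prompt("Write a Python function to reverse a string."))
```

Leaving the prompt open after `### Response:` lets the model fill in the answer.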
|
## Training Details

* Method: QLoRA (4-bit)
* Quantization: NF4
* Compute dtype: FP16
* Optimizer: `paged_adamw_8bit`
* Sequence length: 384–512
* Epochs: 1
* Final training loss: ~0.2–0.3
* Hardware: Tesla P100 (16 GB)
* Frameworks:
  * transformers
  * peft
  * trl
  * bitsandbytes
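The quantization settings above can be sketched as follows. The LoRA rank, alpha, dropout, and target modules shown are illustrative assumptions, since this card does not list them:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with FP16 compute, as described above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA hyperparameters below are assumptions (not stated in this card).
lora_config = LoraConfig(
    r=16,                 # assumed rank
    lora_alpha=32,        # assumed scaling factor
    lora_dropout=0.05,    # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```

These objects are passed to `AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)` and `get_peft_model` respectively.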
|
## Intended Use

This model is designed for:

* Python code generation
* Simple algorithm implementation
* Educational coding tasks
* Instruction-following code responses

It performs best when prompted in the Alpaca-style format:

```
Below is an instruction that describes a task.

### Instruction:
Write a Python function to reverse a linked list.

### Response:
```
|
## How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-0.5B",
    torch_dtype=torch.float16,  # matches the FP16 compute dtype used in training
)
tokenizer = AutoTokenizer.from_pretrained("NNEngine/qwen2-0.5b-python-lora")

# Apply the LoRA adapters from this repository on top of the base model.
model = PeftModel.from_pretrained(base_model, "NNEngine/qwen2-0.5b-python-lora")
model.eval()
```
|
Example generation:

```python
prompt = """Below is an instruction that describes a task.

### Instruction:
Write a Python function to check if a number is prime.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
|
## Observed Behavior

The model demonstrates:

* Improved Python code structuring
* Better adherence to instruction-response formatting
* Faster convergence on common programming tasks during training

Limitations:

* Small model size (0.5B) limits reasoning depth
* May hallucinate under high-temperature decoding
* Works best with an explicit language specification ("Write a Python function ...")
|
## Limitations

* Not suitable for production-critical systems
* Limited mathematical and multi-step reasoning capability
* Sensitive to prompt formatting
* Performance depends heavily on decoding strategy
|
## Future Improvements

Potential enhancements:

* Mask instruction tokens during SFT
* Increase model size (1.5B+)
* Train on more diverse programming datasets
* Evaluate with pass@k benchmarks
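The first item above, masking instruction tokens so that loss is computed only on the response, can be done with trl's `DataCollatorForCompletionOnlyLM`. A minimal sketch, assuming the same `### Response:` marker as the training template:

```python
from transformers import AutoTokenizer
from trl import DataCollatorForCompletionOnlyLM

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B")

# Labels for all tokens up to and including this marker are set to -100,
# so the loss covers only the response tokens.
collator = DataCollatorForCompletionOnlyLM(
    response_template="### Response:",
    tokenizer=tokenizer,
)
# Pass `data_collator=collator` to SFTTrainer to enable instruction masking.
```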
|
## Acknowledgements

* Base model by the Qwen team
* Dataset by `iamtarun`