---
license: mit
---

# Qwen2-0.5B-Python-SFT (LoRA)

## Overview

This model is a supervised fine-tuned (SFT) version of **Qwen/Qwen2-0.5B**, adapted for Python instruction-following tasks. Fine-tuning was performed with QLoRA (4-bit quantization plus LoRA adapters) on a curated Python instruction dataset to improve structured code generation and instruction alignment.

This repository contains **LoRA adapter weights only**, not the full base model.

## Base Model

* Base: `Qwen/Qwen2-0.5B`
* Architecture: decoder-only Transformer
* Parameters: 0.5B
* License: refer to the original Qwen license

The base model must be loaded separately.

## Training Dataset

* Dataset: `iamtarun/python_code_instructions_18k_alpaca`
* Size: ~18,000 instruction-output pairs
* Format: Alpaca-style instruction → response
* Domain: Python programming tasks

Each training sample followed this template:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
...

### Response:
...
```

## Training Details

* Method: QLoRA (4-bit)
* Quantization: NF4
* Compute dtype: FP16
* Optimizer: `paged_adamw_8bit`
* Sequence length: 384–512 tokens
* Epochs: 1
* Final training loss: ~0.2–0.3
* Hardware: Tesla P100 (16 GB)
* Frameworks:
  * `transformers`
  * `peft`
  * `trl`
  * `bitsandbytes`

## Intended Use

This model is designed for:

* Python code generation
* Simple algorithm implementation
* Educational coding tasks
* Instruction-following code responses

It performs best when prompted in the Alpaca-style format:

```
Below is an instruction that describes a task.

### Instruction:
Write a Python function to reverse a linked list.
### Response:
```

## How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the LoRA adapter weights on top of it.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-0.5B", torch_dtype=torch.float16
)
tokenizer = AutoTokenizer.from_pretrained("NNEngine/qwen2-0.5b-python-lora")
model = PeftModel.from_pretrained(base_model, "NNEngine/qwen2-0.5b-python-lora")
model.eval()
```

Example generation:

```python
prompt = """Below is an instruction that describes a task.

### Instruction:
Write a Python function to check if a number is prime.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

## Observed Behavior

The model demonstrates:

* Improved Python code structuring
* Better adherence to the instruction-response format
* Faster convergence on common programming tasks

Observed limitations:

* The small model size (0.5B) limits reasoning depth
* It may hallucinate under high-temperature decoding
* It works best with an explicit language specification ("Write a Python function")

## Limitations

* Not suitable for production-critical systems
* Limited mathematical and multi-step reasoning capability
* Sensitive to prompt formatting
* Performance depends heavily on the decoding strategy

## Future Improvements

Potential enhancements:

* Mask instruction tokens during SFT so the loss is computed only on responses
* Scale to a larger base model (1.5B+)
* Train on more diverse programming datasets
* Evaluate with pass@k benchmarks

## Acknowledgements

* Base model by the Qwen team
* Dataset by `iamtarun`
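
## Appendix: Prompt Formatting

For reference, a minimal sketch of how each dataset row can be rendered into the Alpaca training template described above. The helper name `format_alpaca` and the exact whitespace are assumptions for illustration, not part of the released training code:

```python
def format_alpaca(instruction: str, response: str) -> str:
    # Render one instruction-output pair in the Alpaca template
    # used during supervised fine-tuning (name/whitespace assumed).
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}"
    )

sample = format_alpaca(
    "Write a Python function to add two numbers.",
    "def add(a, b):\n    return a + b",
)
print(sample)
```

At inference time, the same template is used with the `### Response:` section left empty, as in the generation example above.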