---
library_name: peft
base_model: codellama/CodeLlama-7b-Instruct-hf
tags:
- instruction-tuning
- qlora
- code-llama
- text-generation
language:
- en
datasets:
- mingyue0101/prompt_code_parquet
- mingyue0101/prompts_modi
license: apache-2.0
---
# Model Card for codellama-7b-matplotlib-assistant
This model is a fine-tuned version of `codellama/CodeLlama-7b-Instruct-hf` designed to enhance instruction-following capabilities. It was developed as part of a Master's thesis project.
## Model Details
### Model Description
The `codellama-7b-matplotlib-assistant` model is a large language model fine-tuned using the QLoRA (4-bit Quantization + LoRA) technique. The goal of this model was to adapt the base CodeLlama model to better follow user instructions while maintaining its coding and reasoning capabilities.
- **Developed by:** mingyue0101
- **Model type:** Causal Language Model (Fine-tuned with PEFT/LoRA)
- **Language(s) (NLP):** English, Chinese
- **License:** Apache-2.0 (inherited from CodeLlama)
- **Finetuned from model:** codellama/CodeLlama-7b-Instruct-hf
### Model Sources
- **Repository:** https://huggingface.co/mingyue0101/codellama-7b-matplotlib-assistant
- **Dataset:** https://huggingface.co/datasets/mingyue0101/prompt_code_parquet
## Uses
### Direct Use
The model can be used for text generation, code assistance, and general-purpose instruction following. It is particularly suited for tasks where a balance of technical coding knowledge and conversational instruction following is required.
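Given the model's name, a typical request would ask it to produce matplotlib plotting code from a natural-language description. The prompt below is a hypothetical illustration (not taken from the training data); it can be passed to the generation code shown under "How to Get Started with the Model", and the `[INST]` wrapping follows the CodeLlama-Instruct chat convention.
```python
# Hypothetical example prompt for the assistant; feed it to the generation
# code shown under "How to Get Started with the Model".
prompt = (
    "[INST] Using matplotlib, write Python code that plots a sine wave "
    "from 0 to 2*pi with labeled axes and a title. [/INST]"
)
```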
### Out-of-Scope Use
The model should not be used for high-stakes decision-making, generating malicious code, or any application that violates the safety guidelines of the base CodeLlama model.
## Bias, Risks, and Limitations
This model may inherit biases present in the training data or the base model. Because it was fine-tuned on a single, domain-specific dataset (`mingyue0101/prompt_code_parquet`), it may underperform on domains outside its training distribution, and users should expect occasional hallucinations in complex reasoning tasks.
### Recommendations
Users are encouraged to use safety filters when deploying this model in production and to perform domain-specific evaluation before use.
## How to Get Started with the Model
Use the code below to load the model in 4-bit precision:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
model_id = "codellama/CodeLlama-7b-Instruct-hf"
peft_model_id = "mingyue0101/codellama-7b-matplotlib-assistant"
# Load 4-bit configuration
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.float16,
)
# Load base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
base_model = AutoModelForCausalLM.from_pretrained(
model_id,
quantization_config=bnb_config,
device_map="auto"
)
# Load the fine-tuned adapter
model = PeftModel.from_pretrained(base_model, peft_model_id)
# Inference (CodeLlama-Instruct expects the Llama-2 style [INST] prompt format)
prompt = "[INST] Write a Python function to sort a list. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
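For deployment without the `peft` dependency, the adapter can optionally be merged into the base weights. This is a minimal sketch (not part of the original workflow) that reuses the `model_id`, `peft_model_id`, and `tokenizer` variables from the snippet above; merging requires reloading the base model in full precision rather than 4-bit, and the output directory name is only an example.
```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Reload the base model in fp16 (merging 4-bit weights directly is not
# recommended), attach the adapter, and fold the LoRA deltas into the base.
base = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base, peft_model_id).merge_and_unload()

# Save a standalone checkpoint that loads without the peft library.
merged.save_pretrained("codellama-7b-matplotlib-assistant-merged")
tokenizer.save_pretrained("codellama-7b-matplotlib-assistant-merged")
```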
## Training Details
### Training Data
The model was trained on the `mingyue0101/prompt_code_parquet` dataset, which contains instruction-response pairs formatted for supervised fine-tuning (SFT).
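A quick way to inspect the dataset is shown below; this is a minimal sketch using the `datasets` library, and the split name is an assumption since the card does not state the dataset schema.
```python
from datasets import load_dataset

# Load the SFT dataset referenced above (split name assumed to be "train").
dataset = load_dataset("mingyue0101/prompt_code_parquet", split="train")

# Print the actual schema and one instruction-response example.
print(dataset.column_names)
print(dataset[0])
```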
### Training Procedure
**Training Hyperparameters** (a sketch mapping these values onto an SFTTrainer run follows the list)
- Training regime: QLoRA with 4-bit NF4 quantization and fp16 mixed precision
- Learning rate: 2e-4
- Optimizer: paged_adamw_32bit
- Batch size: 4
- Epochs: 1
- LoRA Rank (r): 64
- LoRA Alpha: 16
- LoRA Dropout: 0.1
- LR Scheduler: constant
- Warmup Ratio: 0.03
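The following is a hedged sketch of how these hyperparameters map onto a QLoRA run with TRL's `SFTTrainer`. It is not the exact training script: the LoRA target modules, maximum sequence length, and text field name are assumptions not stated in the card, and the `SFTTrainer` signature differs between TRL versions.
```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from trl import SFTTrainer

model_id = "codellama/CodeLlama-7b-Instruct-hf"

# QLoRA: load the base model in 4-bit NF4 with fp16 compute, as listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# LoRA settings from the list above; target_modules is an assumption.
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Optimizer and schedule settings from the list above.
training_args = TrainingArguments(
    output_dir="./sft-output",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    learning_rate=2e-4,
    optim="paged_adamw_32bit",
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    fp16=True,
    logging_steps=10,
)

dataset = load_dataset("mingyue0101/prompt_code_parquet", split="train")

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",  # assumption: a single pre-formatted text column
    max_seq_length=1024,        # assumption: not stated in the card
)
trainer.train()
```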
## Technical Specifications
### Model Architecture and Objective
Based on the Llama 2 architecture, this model is a decoder-only Transformer with rotary positional embeddings (RoPE), fine-tuned with a causal language modeling objective. This repository contains the LoRA adapter weights, which are applied on top of the unchanged base model.
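If architectural details matter for deployment, they can be read directly from the base model's configuration; a minimal sketch:
```python
from transformers import AutoConfig

# Inspect the base model's attention layout and RoPE settings.
config = AutoConfig.from_pretrained("codellama/CodeLlama-7b-Instruct-hf")
print(config.num_attention_heads, config.num_key_value_heads)  # attention heads
print(config.rope_theta, config.max_position_embeddings)       # RoPE / context
```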
### Compute Infrastructure
#### Software
- PEFT 0.10.0
- Transformers
- bitsandbytes
- TRL (SFTTrainer)