---
library_name: peft
base_model: codellama/CodeLlama-7b-Instruct-hf
tags:
- instruction-tuning
- qlora
- code-llama
- text-generation
language:
- en
datasets:
- mingyue0101/prompt_code_parquet
- mingyue0101/prompts_modi
license: apache-2.0
---

# Model Card for codellama-7b-matplotlib-assistant

This model is a fine-tuned version of `codellama/CodeLlama-7b-Instruct-hf` designed to enhance instruction-following capabilities. It was developed as part of a Master's thesis project.

## Model Details

### Model Description

The `codellama-7b-matplotlib-assistant` model is a large language model fine-tuned using the QLoRA (4-bit quantization + LoRA) technique. The goal of this fine-tune is to adapt the base CodeLlama model to follow user instructions more reliably while preserving its coding and reasoning capabilities.

- **Developed by:** mingyue0101
- **Model type:** Causal Language Model (fine-tuned with PEFT/LoRA)
- **Language(s) (NLP):** English, Chinese
- **License:** Apache-2.0 (inherited from CodeLlama)
- **Finetuned from model:** codellama/CodeLlama-7b-Instruct-hf

### Model Sources

- **Repository:** https://huggingface.co/mingyue0101/codellama-7b-matplotlib-assistant
- **Dataset:** https://huggingface.co/datasets/mingyue0101/prompt_code_parquet

## Uses

### Direct Use

The model can be used for text generation, code assistance, and general-purpose instruction following. It is particularly suited to tasks that require both technical coding knowledge and conversational instruction following.

### Out-of-Scope Use

The model should not be used for high-stakes decision-making, generating malicious code, or any application that violates the safety guidelines of the base CodeLlama model.

## Bias, Risks, and Limitations

This model may inherit biases present in the training data or the base model. Because it was fine-tuned on a single dataset (`mingyue0101/prompt_code_parquet`), it may perform poorly on domains outside its training distribution. Users should expect potential hallucinations in complex reasoning tasks.

### Recommendations

Users are encouraged to apply safety filters when deploying this model in production and to perform domain-specific evaluation before use.

## How to Get Started with the Model

Use the code below to load the model in 4-bit precision:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

model_id = "codellama/CodeLlama-7b-Instruct-hf"
peft_model_id = "mingyue0101/codellama-7b-matplotlib-assistant"

# 4-bit NF4 quantization configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Load the fine-tuned adapter on top of the quantized base model
model = PeftModel.from_pretrained(base_model, peft_model_id)

# Inference. For best results with an Instruct model, consider wrapping
# the prompt in CodeLlama's [INST] ... [/INST] tags.
prompt = "Write a Python function to sort a list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

The model was trained on the `mingyue0101/prompt_code_parquet` dataset, which contains instruction-response pairs formatted for supervised fine-tuning (SFT).
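The card does not document the dataset's schema. The sketch below shows one plausible way to load the data and render it into CodeLlama's instruction format; the `prompt` and `completion` field names and the `train` split are assumptions, so adjust them to the actual schema.

```python
from datasets import load_dataset

# Load the SFT dataset from the Hub ("train" split assumed).
dataset = load_dataset("mingyue0101/prompt_code_parquet", split="train")

def format_example(example):
    # Hypothetical field names ("prompt"/"completion"); adjust to the
    # dataset's real columns. CodeLlama-Instruct expects [INST] tags.
    return {
        "text": f"<s>[INST] {example['prompt']} [/INST] {example['completion']}</s>"
    }

formatted = dataset.map(format_example)
print(formatted[0]["text"][:200])
```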
### Training Procedure

**Training Hyperparameters**

- Training regime: QLoRA 4-bit (NF4) with fp16 mixed-precision compute
- Learning rate: 2e-4
- Optimizer: paged_adamw_32bit
- Batch size: 4
- Epochs: 1
- LoRA rank (r): 64
- LoRA alpha: 16
- LoRA dropout: 0.1
- LR scheduler: constant
- Warmup ratio: 0.03

## Technical Specifications

### Model Architecture and Objective

Based on the Llama 2 architecture, this model uses rotary positional embeddings (RoPE) and was fine-tuned with a causal language modeling objective.

### Compute Infrastructure

### Software

- PEFT 0.10.0
- Transformers
- bitsandbytes
- TRL (SFTTrainer)

A hedged sketch combining this stack with the hyperparameters above follows.
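The card does not include the training script itself. The sketch below reconstructs a plausible run from the listed hyperparameters and software stack; it assumes the pre-0.9 `trl` `SFTTrainer` API (newer versions move these arguments into `SFTConfig`), a `text` column in the dataset, and a maximum sequence length of 1024, none of which are documented here.

```python
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from peft import LoraConfig
from trl import SFTTrainer

model_id = "codellama/CodeLlama-7b-Instruct-hf"

# 4-bit NF4 quantization, matching the training regime above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# LoRA settings from the hyperparameters above; target modules are not
# documented, so PEFT's defaults for Llama-type models apply.
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)

# Trainer arguments from the hyperparameters above
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    learning_rate=2e-4,
    optim="paged_adamw_32bit",
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    fp16=True,
)

dataset = load_dataset("mingyue0101/prompt_code_parquet", split="train")

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumed column name; adjust to the schema
    max_seq_length=1024,        # assumed; not documented in this card
    tokenizer=tokenizer,
)
trainer.train()
```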