---
library_name: peft
base_model: codellama/CodeLlama-7b-Instruct-hf
tags:
- instruction-tuning
- qlora
- code-llama
- text-generation
language:
- en
datasets:
- mingyue0101/prompt_code_parquet
- mingyue0101/prompts_modi
license: apache-2.0
---

# Model Card for codellama-7b-matplotlib-assistant

This model is a fine-tuned version of `codellama/CodeLlama-7b-Instruct-hf` designed to improve instruction following, with a focus on Matplotlib-oriented coding assistance. It was developed as part of a Master's thesis project.

## Model Details

### Model Description

The `codellama-7b-matplotlib-assistant` model is a large language model fine-tuned with QLoRA (4-bit quantization + LoRA). The goal was to adapt the base CodeLlama model to follow user instructions more reliably while preserving its coding and reasoning capabilities.

- **Developed by:** mingyue0101
- **Model type:** Causal language model (fine-tuned with PEFT/LoRA)
- **Language(s) (NLP):** English, Chinese
- **License:** Apache-2.0 (adapter weights); the base CodeLlama model is distributed under Meta's Llama 2 Community License
- **Finetuned from model:** codellama/CodeLlama-7b-Instruct-hf


### Model Sources

- **Repository:** https://huggingface.co/mingyue0101/codellama-7b-matplotlib-assistant
- **Dataset:** https://huggingface.co/datasets/mingyue0101/prompt_code_parquet

## Uses

### Direct Use

The model can be used for text generation, code assistance, and general-purpose instruction following. It is particularly suited to tasks that require both technical coding knowledge and conversational instruction following, such as generating Matplotlib plotting code from natural-language requests.

### Out-of-Scope Use

The model should not be used for high-stakes decision-making, generating malicious code, or any application that violates the safety guidelines of the base CodeLlama model.


## Bias, Risks, and Limitations

This model may inherit biases present in the training data or the base model. Because it was fine-tuned on a narrow, code-focused dataset (`mingyue0101/prompt_code_parquet`), it may perform poorly on domains outside its training distribution, and users should expect potential hallucinations in complex reasoning tasks.

### Recommendations

Users are encouraged to apply safety filters when deploying this model in production and to perform domain-specific evaluation before use.

## How to Get Started with the Model

Use the code below to load the model in 4-bit precision:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

model_id = "codellama/CodeLlama-7b-Instruct-hf"
peft_model_id = "mingyue0101/codellama-7b-matplotlib-assistant"

# Load 4-bit configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Load the fine-tuned adapter
model = PeftModel.from_pretrained(base_model, peft_model_id)

# Inference
prompt = "Write a Python function to sort a list."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
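
CodeLlama-Instruct models expect the Llama 2 `[INST] ... [/INST]` chat format, so wrapping requests with the tokenizer's chat template usually gives better instruction following than a raw string. The snippet below is a minimal sketch of that usage, assuming the adapter keeps the base model's prompt format and reusing the `tokenizer` and `model` objects from above:

```python
# Format the request with the base model's chat template (assumes the
# adapter was trained on the same [INST] ... [/INST] instruction format).
messages = [
    {"role": "user", "content": "Plot a sine wave with matplotlib and label both axes."}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
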
## Training Details

### Training Data

The model was trained on the `mingyue0101/prompt_code_parquet` dataset, which contains instruction-response pairs formatted for supervised fine-tuning (SFT).
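
The exact preprocessing script is not included in this card. As an illustration only, a common way to serialize an instruction-response pair into a single SFT training string for a CodeLlama-Instruct model is the Llama 2 `[INST]` format (the field names below are assumptions, not the dataset's actual column names):

```python
# Illustrative serialization only; the real dataset columns and template may differ.
def format_example(example: dict) -> str:
    instruction = example["instruction"]  # assumed column name
    response = example["response"]        # assumed column name
    return f"<s>[INST] {instruction} [/INST] {response} </s>"

print(format_example({
    "instruction": "Draw a bar chart of monthly sales with matplotlib.",
    "response": "import matplotlib.pyplot as plt\n# ...plotting code...",
}))
```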

### Training Procedure

**Training Hyperparameters**

- Training regime: QLoRA 4-bit (NF4) quantization with fp16 mixed precision
- Learning rate: 2e-4
- Optimizer: paged_adamw_32bit
- Batch size: 4
- Epochs: 1
- LoRA rank (r): 64
- LoRA alpha: 16
- LoRA dropout: 0.1
- LR scheduler: constant
- Warmup ratio: 0.03
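
These hyperparameters can be wired together with `peft`, `bitsandbytes`, and TRL's `SFTTrainer` (the libraries listed under Software below). The following is only a sketch of such a setup for a TRL version contemporary with PEFT 0.10.0; the LoRA target modules, dataset column name, and sequence length are assumptions rather than values taken from the actual training script:

```python
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from peft import LoraConfig
from trl import SFTTrainer

model_id = "codellama/CodeLlama-7b-Instruct-hf"

# 4-bit NF4 base model, as in the QLoRA recipe above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# LoRA settings from the list above; target_modules is an assumption
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

args = TrainingArguments(
    output_dir="codellama-7b-matplotlib-assistant",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    learning_rate=2e-4,
    optim="paged_adamw_32bit",
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    fp16=True,
)

dataset = load_dataset("mingyue0101/prompt_code_parquet", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumed column name
    max_seq_length=2048,        # assumed
)
trainer.train()
```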

## Technical Specifications

### Model Architecture and Objective

Based on the Llama 2 architecture, this model uses rotary positional embeddings (RoPE) with the enlarged base frequency that CodeLlama employs for long-context handling, and it was fine-tuned with a causal language modeling objective.
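
Because the LoRA adapter leaves the backbone unchanged, these architectural details can be checked directly against the base model's configuration; the short snippet below only downloads the config file, not the weights:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("codellama/CodeLlama-7b-Instruct-hf")
# Llama-2-style architecture fields, including the RoPE base frequency
# that CodeLlama enlarges for long-context handling.
print(config.num_hidden_layers, config.hidden_size)
print(config.num_attention_heads, config.num_key_value_heads)
print(config.rope_theta, config.max_position_embeddings)
```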

### Compute Infrastructure

### Software

- PEFT 0.10.0
- Transformers
- bitsandbytes
- TRL (SFTTrainer)