---
base_model: meta-llama/CodeLlama-13b-hf
library_name: peft
pipeline_tag: text-generation
language:
- en
license: apache-2.0
datasets:
- Younis2003/secure_dataset_cvefixes
tags:
- cybersecurity
- vulnerability-detection
- secure-code
- codellama
- lora
- peft
- qlora
- code
---

# CodeLlama_for_code_security

<img src="https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fres.cloudinary.com%2Fmomentum-media-group-pty-ltd%2Fimage%2Fupload%2Fv1693207063%2FCyber%2520Security%2FCode_Llama_csc_mgxscf.jpg&f=1&nofb=1&ipt=3dd5f67c5a258d603fe9b43ceb3ccb4a9fba0e35d17193c1a50dc97e0f3df10c" width="1000"/>

# Overview

CodeLlama_for_code_security is a **LoRA fine-tuned adapter** designed for vulnerability detection and secure code remediation.

The model analyzes vulnerable source code and generates a secure fixed version together with structured vulnerability explanations including CVE and CWE metadata.

The adapter is trained on top of **CodeLlama-13B**.

---

# Model Details

**Developed by:** Younis Alshibli  
**Model type:** LoRA Adapter (PEFT)  
**Base Model:** CodeLlama-13B  
**Language:** English  
**License:** Apache 2.0  

---

# Intended Use

The model is designed for:

- Vulnerability detection
- Secure code remediation
- Security analysis of source code
- Automated security review
- AI-assisted cybersecurity research

Example applications:

- Secure code assistants
- AI vulnerability scanners
- Cybersecurity research tools

---

# Evaluation

The model was evaluated by measuring the **semantic similarity between generated fixes and ground-truth secure fixes**.

| Metric | Score |
|------|------|
| Embedding Similarity | **0.9643** |

That is, the embeddings of the generated outputs are approximately **96% similar** to those of the expected secure code fixes.
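
The card does not say which embedding model or similarity function produced this score. As a minimal sketch, assuming cosine similarity averaged over (generated, reference) embedding pairs, the computation could look like the following. The function names are illustrative and not part of this repository.

```python
import math
from typing import Iterable, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def mean_similarity(pairs: Iterable[Tuple[Vector, Vector]]) -> float:
    """Average cosine similarity over (generated, reference) embedding pairs."""
    scores = [cosine_similarity(generated, reference) for generated, reference in pairs]
    if not scores:
        raise ValueError("at least one pair is required")
    return sum(scores) / len(scores)
```

The embeddings themselves would come from whatever sentence or code embedding model is used for evaluation; that choice is left open here because the card does not specify it.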

---

# Prompt Format

The model expects prompts in the following structured format:

```
### System:
You are a security expert. Be concise. Analyze and fix.

### Task:
1. IDENTIFY: One sentence naming the CWE.
2. STRATEGY: One sentence fix strategy.
3. REMEDIATE: Provide the code under '### Fixed Code:'.

### Programming Language:
{language}

### Vulnerable Code:
{vuln_code}

### Analysis:
1. CWE Identification:
```
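
The template above can be filled programmatically. The helper below is a small sketch: `build_prompt` is a hypothetical name, and whether the model is sensitive to trailing whitespace after `1. CWE Identification:` is not stated in this card, so the sketch ends the prompt exactly at that line.

```python
PROMPT_TEMPLATE = """### System:
You are a security expert. Be concise. Analyze and fix.

### Task:
1. IDENTIFY: One sentence naming the CWE.
2. STRATEGY: One sentence fix strategy.
3. REMEDIATE: Provide the code under '### Fixed Code:'.

### Programming Language:
{language}

### Vulnerable Code:
{vuln_code}

### Analysis:
1. CWE Identification:"""


def build_prompt(language: str, vuln_code: str) -> str:
    """Fill the prompt template with a language name and a code snippet."""
    # Only the two placeholders in the template are substituted; braces that
    # appear inside the vulnerable code itself are inserted unchanged.
    return PROMPT_TEMPLATE.format(
        language=language.strip(),
        vuln_code=vuln_code.strip("\n"),
    )


# Example: a C snippet with an unbounded copy.
example_prompt = build_prompt("C", "char buf[8];\nstrcpy(buf, user_input);")
print(example_prompt)
```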

The model will generate:

* Fixed Code
* Vulnerability Explanation
* CVE metadata
* CWE metadata
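
Because the task instructions ask for the fix under `### Fixed Code:`, the remediated code can be pulled out of the generated text by searching for that marker. This is a sketch under assumptions: the exact layout of the output after the marker is not documented here, so the helper (a hypothetical `extract_fixed_code`) simply stops at the next line that starts with `### `, if any. The `### Explanation:` header in the usage comment is invented for illustration.

```python
import re
from typing import Optional

FIXED_CODE_MARKER = "### Fixed Code:"


def extract_fixed_code(generated: str) -> Optional[str]:
    """Return the text that follows '### Fixed Code:', or None if the marker is absent."""
    _, found, tail = generated.partition(FIXED_CODE_MARKER)
    if not found:
        return None
    # Cut at the next section header, if the model emitted one. Code lines that
    # themselves begin with '### ' would be truncated by this heuristic.
    next_header = re.search(r"^### ", tail, flags=re.MULTILINE)
    if next_header:
        tail = tail[: next_header.start()]
    return tail.strip()


# Usage with a made-up output string (not real model output):
# extract_fixed_code("...\n### Fixed Code:\nstrncpy(dst, src, n);\n### Explanation:\n...")
```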

---

# How to Use

This model is a **LoRA adapter** and requires the base model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "meta-llama/CodeLlama-13b-hf"
adapter = "Younis2003/CodeLlama_for_code_security"

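# Load the tokenizer from the base model.
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model weights; device_map="auto" places them on the available devices.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto"
)

# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, adapter)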
```
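
Once `model` and `tokenizer` are loaded as above, generation can be sketched as follows. The helper name `generate_analysis`, the greedy decoding setting, and the `max_new_tokens` default are assumptions for illustration; the card does not specify recommended generation parameters.

```python
def generate_analysis(model, tokenizer, prompt: str, max_new_tokens: int = 512) -> str:
    """Run the model on a formatted prompt and return only the newly generated text."""
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_length = inputs["input_ids"].shape[1]

    # Greedy decoding (do_sample=False) keeps the output deterministic.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )

    # Drop the prompt tokens so that only the model's continuation is decoded.
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Usage, with `prompt` being a string in the structure shown under "Prompt Format":
# analysis = generate_analysis(model, tokenizer, prompt)
# print(analysis)
```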

---

# Training Data

The model was trained using the dataset:

**secure_dataset_cvefixes**

Dataset source:

```
Younis2003/secure_dataset_cvefixes
```

The dataset contains:

* vulnerable code
* fixed code
* CVE descriptions
* CWE classifications
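
To inspect the data, the dataset can be pulled from the Hugging Face Hub with the `datasets` library. The sketch below only loads and returns it; the split names and column names are not listed in this card, so they are left to be read from the printed result instead of being assumed. `load_security_dataset` is a hypothetical helper name.

```python
DATASET_ID = "Younis2003/secure_dataset_cvefixes"


def load_security_dataset():
    """Download the dataset from the Hugging Face Hub and return it."""
    # Imported inside the function so this module can be imported even when
    # the `datasets` package is not installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID)


# Usage (requires network access and `pip install datasets`):
# dataset = load_security_dataset()
# print(dataset)  # shows the available splits, column names, and row counts
```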

---

# Limitations

* The model may not detect all vulnerabilities.
* Results should always be reviewed by security experts.
* Complex security flaws may require manual analysis.

---

# Ethical Considerations

This model is intended for **defensive cybersecurity research and secure software development**.

It should not be used for malicious activities.

---

# Author

Developed by **Younis Alshibli** as part of an AI research project on:

* AI vulnerability detection
* automated secure code remediation