---
base_model: meta-llama/CodeLlama-13b-hf
library_name: peft
pipeline_tag: text-generation
language:
- en
license: apache-2.0
datasets:
- Younis2003/secure_dataset_cvefixes
tags:
- cybersecurity
- vulnerability-detection
- secure-code
- codellama
- lora
- peft
- qlora
- code
---
# CodeLlama_for_code_security
<img src="https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fres.cloudinary.com%2Fmomentum-media-group-pty-ltd%2Fimage%2Fupload%2Fv1693207063%2FCyber%2520Security%2FCode_Llama_csc_mgxscf.jpg&f=1&nofb=1&ipt=3dd5f67c5a258d603fe9b43ceb3ccb4a9fba0e35d17193c1a50dc97e0f3df10c" width="1000"/>
# Overview
CodeLlama_for_code_security is a **LoRA adapter** fine-tuned for vulnerability detection and secure code remediation.
Given vulnerable source code, the model generates a fixed version together with a structured explanation of the vulnerability, including CVE and CWE metadata.
The adapter is trained on top of **CodeLlama-13B**.
---
# Model Details
- **Developed by:** Younis Alshibli
- **Model type:** LoRA Adapter (PEFT)
- **Base Model:** CodeLlama-13B
- **Language:** English
- **License:** Apache 2.0
---
# Intended Use
The model is designed for:
- Vulnerability detection
- Secure code remediation
- Security analysis of source code
- Automated security review
- AI-assisted cybersecurity research
Example applications:
- Secure code assistants
- AI vulnerability scanners
- Cybersecurity research tools
---
# Evaluation
The model was evaluated by measuring the **semantic similarity between generated fixes and ground-truth secure fixes**.
| Metric | Score |
|------|------|
| Embedding Similarity | **0.9643** |
This corresponds to approximately **96% semantic similarity** between predicted outputs and expected secure code fixes.
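
To illustrate what an embedding similarity score measures, here is a minimal sketch. It assumes the metric is the cosine similarity between an embedding of each generated fix and an embedding of its reference fix, averaged over the evaluation set. The card does not state which embedding model or similarity function was used, so both the cosine choice and the helper names `cosine_similarity` and `mean_embedding_similarity` are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embedding vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def mean_embedding_similarity(pred_embeddings, ref_embeddings):
    """Average pairwise similarity over (generated fix, reference fix) pairs.

    Assumption: the embeddings were produced elsewhere by some sentence or
    code embedding model; this function only aggregates the scores.
    """
    if len(pred_embeddings) != len(ref_embeddings) or not pred_embeddings:
        raise ValueError("need the same non-zero number of predictions and references")
    scores = [cosine_similarity(p, r) for p, r in zip(pred_embeddings, ref_embeddings)]
    return sum(scores) / len(scores)
```

A high score under this kind of metric says that generated and reference fixes are close in embedding space; it does not by itself confirm that a generated fix compiles or removes the vulnerability.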
---
# Prompt Format
The model expects prompts in the following structured format:
```
### System:
You are a security expert. Be concise. Analyze and fix.
### Task:
1. IDENTIFY: One sentence naming the CWE.
2. STRATEGY: One sentence fix strategy.
3. REMEDIATE: Provide the code under '### Fixed Code:'.
### Programming Language:
{language}
### Vulnerable Code:
{vuln_code}
### Analysis:
1. CWE Identification:
```
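
The template above can be filled in programmatically. The following is a small sketch; the helper name `build_prompt` is hypothetical, and the exact line breaks between sections are assumed to match the template as printed here.

```python
# Prompt template copied from the model card; line breaks between the
# sections are an assumption based on how the template is displayed.
PROMPT_TEMPLATE = (
    "### System:\n"
    "You are a security expert. Be concise. Analyze and fix.\n"
    "### Task:\n"
    "1. IDENTIFY: One sentence naming the CWE.\n"
    "2. STRATEGY: One sentence fix strategy.\n"
    "3. REMEDIATE: Provide the code under '### Fixed Code:'.\n"
    "### Programming Language:\n"
    "{language}\n"
    "### Vulnerable Code:\n"
    "{vuln_code}\n"
    "### Analysis:\n"
    "1. CWE Identification:"
)


def build_prompt(language: str, vuln_code: str) -> str:
    """Fill the template with a language name and the code to analyze."""
    # str.format only interprets braces in the template itself, so braces
    # inside the submitted code are inserted unchanged.
    return PROMPT_TEMPLATE.format(language=language, vuln_code=vuln_code.strip())
```

For example, `build_prompt("C", "strcpy(buf, input);")` returns a prompt that ends with `1. CWE Identification:`, leaving the model to continue the analysis from that point.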
The model will generate:
* Fixed Code
* Vulnerability Explanation
* CVE metadata
* CWE metadata
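
Because the prompt asks for the remediated code under `### Fixed Code:`, the generated text can be split on that marker. This is a hedged sketch: the helper name `split_response` is hypothetical, the exact layout of the model's output is not documented in this card, and the model may not always emit the marker, so the function returns `None` for the fix in that case.

```python
from typing import Optional, Tuple


def split_response(generated: str, marker: str = "### Fixed Code:") -> Tuple[str, Optional[str]]:
    """Split generated text into (analysis, fixed_code).

    Everything before the first occurrence of `marker` is treated as the
    analysis; everything after it is treated as the fixed code. If the
    marker is missing, the whole text is returned as analysis and the fix
    is None.
    """
    analysis, sep, fixed = generated.partition(marker)
    if not sep:
        return analysis.strip(), None
    return analysis.strip(), fixed.strip()
```

Any fix extracted this way still needs review before it is applied to real code.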
---
# How to Use
This repository contains only the **LoRA adapter** weights, so the base model must be loaded first and the adapter applied on top of it.
```python
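from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "meta-llama/CodeLlama-13b-hf"
adapter = "Younis2003/CodeLlama_for_code_security"

# The tokenizer is taken from the base model.
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model and let device_map="auto" place the weights
# on the available devices.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
)

# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, adapter)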
```
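
Once the adapter is attached, a completion can be generated with the standard `generate` API. The sketch below is one possible way to do it, not the author's documented inference code: the helper name `generate_fix`, the greedy decoding setting (`do_sample=False`), and the `max_new_tokens` default are all assumptions.

```python
def generate_fix(model, tokenizer, prompt, max_new_tokens=512):
    """Run the model on `prompt` and return only the newly generated text."""
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_len = inputs["input_ids"].shape[1]

    # Greedy decoding (assumed here) keeps the output deterministic.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )

    # Drop the prompt tokens so that only the continuation is decoded.
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

With `model` and `tokenizer` loaded as shown above and a prompt in the documented format, `generate_fix(model, tokenizer, prompt)` returns the text the model produced after `1. CWE Identification:`.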
---
# Training Data
The model was trained on the **secure_dataset_cvefixes** dataset.
Dataset source:
```
Younis2003/secure_dataset_cvefixes
```
The dataset contains:
* vulnerable code
* fixed code
* CVE descriptions
* CWE classifications
---
# Limitations
* The model may not detect all vulnerabilities.
* Results should always be reviewed by security experts.
* Complex security flaws may require manual analysis.
---
# Ethical Considerations
This model is intended for **defensive cybersecurity research and secure software development**.
It should not be used for malicious activities.
---
# Author
Developed by **Younis Alshibli** as part of an AI research project on:
* AI vulnerability detection
* automated secure code remediation