	---
	base_model: meta-llama/CodeLlama-13b-hf
	library_name: peft
	pipeline_tag: text-generation
	language:
	- en
	license: apache-2.0
	datasets:
	- Younis2003/secure_dataset_cvefixes
	tags:
	- cybersecurity
	- vulnerability-detection
	- secure-code
	- codellama
	- lora
	- peft
	- qlora
	- code
	---

	# CodeLlama_for_code_security

	<img src="https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fres.cloudinary.com%2Fmomentum-media-group-pty-ltd%2Fimage%2Fupload%2Fv1693207063%2FCyber%2520Security%2FCode_Llama_csc_mgxscf.jpg&f=1&nofb=1&ipt=3dd5f67c5a258d603fe9b43ceb3ccb4a9fba0e35d17193c1a50dc97e0f3df10c" width="1000"/>
	# Overview

	CodeLlama_for_code_security is a LoRA adapter fine-tuned for vulnerability detection and secure code remediation.

	Given vulnerable source code, the model generates a fixed version together with a structured explanation of the vulnerability, including CVE and CWE metadata.

	The adapter is trained on top of CodeLlama-13B (`meta-llama/CodeLlama-13b-hf`).

	---

	# Model Details

	- Developed by: Younis Alshibli
	- Model type: LoRA Adapter (PEFT)
	- Base Model: CodeLlama-13B
	- Language: English
	- License: Apache 2.0

	---

	# Intended Use

	The model is designed for:

	- Vulnerability detection
	- Secure code remediation
	- Security analysis of source code
	- Automated security review
	- AI-assisted cybersecurity research

	Example applications:

	- Secure code assistants
	- AI vulnerability scanners
	- Cybersecurity research tools

	---

	# Evaluation

	The model was evaluated by measuring the semantic similarity between generated fixes and ground-truth secure fixes.

	| Metric | Score |
	|------|------|
	| Embedding Similarity | 0.9643 |

	This corresponds to approximately 96% semantic similarity between predicted outputs and expected secure code fixes.
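As a rough illustration of how an embedding similarity score can be computed, the sketch below averages the cosine similarity over pairs of embedding vectors. This card does not name the embedding model or the exact similarity measure, so cosine similarity and the helper names `cosine_similarity` and `mean_similarity` are assumptions made for illustration only.

```python
import math
from typing import Iterable, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def mean_similarity(
    pairs: Iterable[Tuple[Sequence[float], Sequence[float]]]
) -> float:
    """Average cosine similarity over (predicted, reference) embedding pairs."""
    scores = [cosine_similarity(pred, ref) for pred, ref in pairs]
    if not scores:
        raise ValueError("at least one pair is required")
    return sum(scores) / len(scores)
```

In practice, each vector would come from embedding one generated fix and its reference fix with the same embedding model.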

	---

	# Prompt Format

	The model expects prompts in the following structured format:

	```
	### System:
	You are a security expert. Be concise. Analyze and fix.

	### Task:
	1. IDENTIFY: One sentence naming the CWE.
	2. STRATEGY: One sentence fix strategy.
	3. REMEDIATE: Provide the code under '### Fixed Code:'.

	### Programming Language:
	{language}

	### Vulnerable Code:
	{vuln_code}

	### Analysis:
	1. CWE Identification:
	```
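One way to fill in this template from Python is sketched below. The template text is copied from the format above; the helper name `build_prompt` and the choice to strip surrounding whitespace from the inputs are assumptions, not part of the original training code.

```python
# Prompt template copied from the format above; {language} and {vuln_code}
# are the only placeholders.
PROMPT_TEMPLATE = (
    "### System:\n"
    "You are a security expert. Be concise. Analyze and fix.\n\n"
    "### Task:\n"
    "1. IDENTIFY: One sentence naming the CWE.\n"
    "2. STRATEGY: One sentence fix strategy.\n"
    "3. REMEDIATE: Provide the code under '### Fixed Code:'.\n\n"
    "### Programming Language:\n"
    "{language}\n\n"
    "### Vulnerable Code:\n"
    "{vuln_code}\n\n"
    "### Analysis:\n"
    "1. CWE Identification:"
)


def build_prompt(language: str, vuln_code: str) -> str:
    """Fill the template with a language name and a vulnerable code snippet.

    Hypothetical helper: stripping whitespace is an assumption.
    """
    return PROMPT_TEMPLATE.format(
        language=language.strip(),
        vuln_code=vuln_code.strip(),
    )


if __name__ == "__main__":
    print(build_prompt("C", "char buf[8];\nstrcpy(buf, user_input);"))
```

Braces inside the vulnerable code are safe here, because `str.format` only interprets placeholders in the template, not in the substituted values.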

	The model will generate:

	* Fixed Code
	* Vulnerability Explanation
	* CVE metadata
	* CWE metadata
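Since the prompt asks for the fix under `### Fixed Code:`, the generated text can be split on that marker. The following is a minimal sketch; the exact output layout is an assumption (in particular, that any section after the code starts with a new `###` heading), and `split_response` is a hypothetical helper name.

```python
from typing import Optional, Tuple

FIXED_CODE_MARKER = "### Fixed Code:"


def split_response(generated: str) -> Tuple[str, Optional[str]]:
    """Split generated text into (analysis, fixed_code).

    Returns None for fixed_code when the marker is missing.
    """
    analysis, marker, rest = generated.partition(FIXED_CODE_MARKER)
    if not marker:
        return generated.strip(), None
    # Assumption: a later "###" heading, if present, ends the code section.
    fixed_code, _, _ = rest.partition("\n###")
    return analysis.strip(), fixed_code.strip()
```

Code that itself contains a line starting with `###` would be cut short by this heuristic, so the result is best treated as a starting point for manual review.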

	---

	# How to Use

	This model is a LoRA adapter and requires the base model.

	```python
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	base_model = "meta-llama/CodeLlama-13b-hf"
	adapter = "Younis2003/CodeLlama_for_code_security"

	# Load the tokenizer from the base model.
	tokenizer = AutoTokenizer.from_pretrained(base_model)

	# Load the base weights; device_map="auto" requires the accelerate package.
	model = AutoModelForCausalLM.from_pretrained(
	    base_model,
	    device_map="auto",
	)

	# Attach the LoRA adapter weights on top of the base model.
	model = PeftModel.from_pretrained(model, adapter)
	```
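Once the model and tokenizer are loaded, generation could look like the sketch below. The helper name `generate_analysis`, greedy decoding (`do_sample=False`), and the 512-token limit are assumptions chosen for illustration, not settings documented for this adapter.

```python
def generate_analysis(model, tokenizer, prompt: str, max_new_tokens: int = 512) -> str:
    """Generate a completion for a formatted prompt and return only the new text."""
    # Tokenize and move the tensors to the device the model lives on.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding (an assumption) keeps the output deterministic.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    # The output starts with the prompt tokens; keep only what follows them.
    prompt_length = len(inputs["input_ids"][0])
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Here `prompt` is a string in the structured format from the Prompt Format section, and `model` and `tokenizer` are the objects loaded above.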

	---

	# Training Data

	The model was trained on the `secure_dataset_cvefixes` dataset.

	Dataset source:

	```
	Younis2003/secure_dataset_cvefixes
	```

	The dataset contains:

	* vulnerable code
	* fixed code
	* CVE descriptions
	* CWE classifications

	---

	# Limitations

	* The model may not detect all vulnerabilities.
	* Results should always be reviewed by security experts.
	* Complex security flaws may require manual analysis.

	---

	# Ethical Considerations

	This model is intended for defensive cybersecurity research and secure software development.

	It should not be used for malicious activities.

	---

	# Author

	Developed by Younis Alshibli as part of an AI research project on:

	* AI vulnerability detection
	* automated secure code remediation