CodeLlama_for_code_security

Overview

CodeLlama_for_code_security is a LoRA fine-tuned adapter designed for vulnerability detection and secure code remediation.

The model analyzes vulnerable source code and generates a secure fixed version together with structured vulnerability explanations including CVE and CWE metadata.

The adapter is trained on top of CodeLlama-13B.


Model Details

Developed by: Younis Alshibli
Model type: LoRA Adapter (PEFT)
Base Model: CodeLlama-13B
Language: English
License: Apache 2.0


Intended Use

The model is designed for:

  • Vulnerability detection
  • Secure code remediation
  • Security analysis of source code
  • Automated security review
  • AI-assisted cybersecurity research

Example applications:

  • Secure code assistants
  • AI vulnerability scanners
  • Cybersecurity research tools

Evaluation

The model was evaluated using semantic similarity between generated fixes and ground truth secure fixes.

Metric                  Score
Embedding Similarity    0.9643

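In other words, generated fixes score about 0.96 in embedding similarity against the expected secure fixes. The metric measures how close an output is to the reference fix; it does not check that the generated code compiles or that the vulnerability is actually removed.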
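As an illustration of how a score of this kind can be computed, the sketch below averages the cosine similarity between paired embedding vectors. The use of cosine similarity, the averaging step, and the function names are assumptions; the model card does not state which embedding model or similarity function was used.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)


def mean_similarity(pred_embeddings, ref_embeddings):
    """Average similarity over (prediction, reference) embedding pairs.

    Assumption: both lists are aligned and non-empty, and each entry is the
    embedding of one generated fix or one ground-truth fix.
    """
    scores = [cosine_similarity(p, r)
              for p, r in zip(pred_embeddings, ref_embeddings)]
    return sum(scores) / len(scores)
```

The embeddings themselves would come from whatever sentence or code embedding model is chosen for the evaluation; that choice affects the absolute score, so numbers from different embedding models are not directly comparable.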


Prompt Format

The model expects prompts in the following structured format:

### System:
You are a security expert. Be concise. Analyze and fix.

### Task:
1. IDENTIFY: One sentence naming the CWE.
2. STRATEGY: One sentence fix strategy.
3. REMEDIATE: Provide the code under '### Fixed Code:'.

### Programming Language:
{language}

### Vulnerable Code:
{vuln_code}

### Analysis:
1. CWE Identification:
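
The template above can be filled in programmatically. The following is a minimal sketch; the helper name `build_prompt` is hypothetical, and the exact whitespace the adapter saw during training is an assumption based on the layout shown here.

```python
# Template text copied from the prompt format above.
PROMPT_TEMPLATE = (
    "### System:\n"
    "You are a security expert. Be concise. Analyze and fix.\n\n"
    "### Task:\n"
    "1. IDENTIFY: One sentence naming the CWE.\n"
    "2. STRATEGY: One sentence fix strategy.\n"
    "3. REMEDIATE: Provide the code under '### Fixed Code:'.\n\n"
    "### Programming Language:\n"
    "{language}\n\n"
    "### Vulnerable Code:\n"
    "{vuln_code}\n\n"
    "### Analysis:\n"
    "1. CWE Identification:"
)


def build_prompt(language, vuln_code):
    """Fill the template (hypothetical helper, not part of the model repo).

    str.replace is used instead of str.format so that braces inside the
    vulnerable code (C, Go, JSON, ...) are left untouched.
    """
    return (
        PROMPT_TEMPLATE
        .replace("{language}", language)
        .replace("{vuln_code}", vuln_code.strip("\n"))
    )
```

The prompt deliberately ends at "1. CWE Identification:" so that the model continues from that point.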

The model will generate:

  • Fixed Code
  • Vulnerability Explanation
  • CVE metadata
  • CWE metadata
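
Since the prompt asks for the fix under '### Fixed Code:', a response can be split on that marker. This is a sketch only: the function name `split_response` is hypothetical, and the assumption that any further section starts with a "###" heading is not confirmed by the model card.

```python
def split_response(text, marker="### Fixed Code:"):
    """Split generated text into (analysis, fixed_code).

    Returns fixed_code=None when the marker is missing, so callers can
    detect malformed outputs instead of silently using the wrong text.
    """
    analysis, found, rest = text.partition(marker)
    if not found:
        return analysis.strip(), None
    # Assumption: a following section, if any, begins with a "###" heading.
    code, _, _ = rest.partition("\n###")
    return analysis.strip(), code.strip()
```

Model output is free-form text, so a missing or repeated marker is possible; treating `None` as "needs manual review" is one reasonable way to handle it.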

How to Use

This model is a LoRA adapter, so the base model must be loaded first and the adapter applied on top of it.

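from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Base model and LoRA adapter repositories
base_model = "meta-llama/CodeLlama-13b-hf"
adapter = "Younis2003/CodeLlama_for_code_security"

# The tokenizer comes from the base model
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base weights; device_map="auto" places them on the available devices
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto"
)

# Attach the LoRA adapter weights to the base model
model = PeftModel.from_pretrained(model, adapter)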
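
Once the adapter is loaded, generation follows the usual transformers pattern. The sketch below assumes a `model` and `tokenizer` loaded as above and a prompt in the documented format; the function name `generate_fix`, the `max_new_tokens` default, and greedy decoding are illustrative choices, not settings documented for this model.

```python
def generate_fix(model, tokenizer, prompt, max_new_tokens=512):
    """Run the model on one prompt and return only the newly generated text."""
    # Tokenize and move the tensors to the device the model lives on.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Greedy decoding (do_sample=False) is an assumption made here for
    # reproducible output; adjust as needed.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )

    # generate() returns prompt tokens followed by new tokens; drop the prompt.
    prompt_len = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_len:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Returning only the continuation keeps the prompt's own "### Fixed Code:" mention (in the task instructions) out of the text that is later parsed.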

Training Data

The model was trained using the dataset:

secure_dataset_cvefixes

Dataset source:

Younis2003/secure_dataset_cvefixes

The dataset contains:

  • vulnerable code
  • fixed code
  • CVE descriptions
  • CWE classifications
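
To inspect the data, the dataset can be loaded with the `datasets` library. This is a minimal sketch that assumes the repository is publicly readable in a format `load_dataset` understands; the helper name is hypothetical, and the split and column names are not documented here, so the example prints them rather than guessing.

```python
def load_security_dataset(repo_id="Younis2003/secure_dataset_cvefixes"):
    """Download the training dataset (requires network access)."""
    # Imported lazily so the helper can be defined without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id)


if __name__ == "__main__":
    ds = load_security_dataset()
    # Printing the DatasetDict shows the available splits and column names.
    print(ds)
```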

Limitations

  • The model may not detect all vulnerabilities.
  • Results should always be reviewed by security experts.
  • Complex security flaws may require manual analysis.

Ethical Considerations

This model is intended for defensive cybersecurity research and secure software development.

It should not be used for malicious activities.


Author

Developed by Younis Alshibli as part of an AI research project on:

  • AI vulnerability detection
  • automated secure code remediation