---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- code
- code-review
- programming
- qlora
- unsloth
- qwen2.5
- bug-detection
datasets:
- sahil2801/CodeAlpaca-20k
language:
- en
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: CodeLens-7B
  results: []
---
# CodeLens-7B
A fine-tuned **Qwen2.5-7B-Instruct** model specialized for **code review, bug detection, and programming assistance**. It analyzes code snippets, identifies issues, suggests improvements, and writes clean solutions across multiple programming languages.
## Key Details
| | |
|---|---|
| **Base model** | Qwen/Qwen2.5-7B-Instruct |
| **Method** | QLoRA (4-bit NF4, rank 16, alpha 16) |
| **Library** | Unsloth + TRL SFTTrainer |
| **Dataset** | sahil2801/CodeAlpaca-20k (10K examples) |
| **Hardware** | NVIDIA RTX A5000 (24GB VRAM) on RunPod |
| **Training time** | ~2.65 hours (500 steps) |
| **Final loss** | 0.450 |
| **Parameters trained** | 40.4M of 7.66B (0.53%) |
| **Format** | ChatML |
| **Output** | Merged 16-bit safetensors |
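The trainable-parameter figure in the table can be roughly reproduced from the LoRA rank. The sketch below is a back-of-the-envelope check, not the training code: it assumes adapters were attached to all seven linear projections in every decoder layer (the card does not list the target modules) and uses the layer sizes from the published Qwen2.5-7B configuration.

```python
# Rough LoRA parameter count for Qwen2.5-7B at rank 16.
# Assumptions (not stated in this card): adapters on all seven linear
# projections of every decoder layer, and the sizes below, taken from
# the published Qwen2.5-7B configuration.
HIDDEN = 3584          # hidden_size
INTERMEDIATE = 18944   # intermediate_size of the MLP
LAYERS = 28            # num_hidden_layers
KV_DIM = 4 * 128       # num_key_value_heads * head_dim
RANK = 16              # LoRA rank from the Key Details table

# (in_features, out_features) for each adapted projection
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in) and B (out x rank) beside each frozen weight."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in PROJECTIONS.values())
total = per_layer * LAYERS

print(f"{total:,} trainable parameters")   # 40,370,176
print(f"{total / 7.66e9:.2%} of 7.66B")    # 0.53%
```

Under these assumptions the count comes to about 40.4M, which matches the table.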
## Dataset
Trained on 10,000 examples from [sahil2801/CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k), a code instruction-following dataset covering code generation, debugging, explanation, and review tasks across Python, JavaScript, Java, C, SQL, and more.
## Usage
### Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# device_map="auto" places the weights on an available GPU (requires the accelerate package)
model = AutoModelForCausalLM.from_pretrained("sriksven/CodeLens-7B", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("sriksven/CodeLens-7B")

messages = [
    {
        "role": "system",
        "content": "You are an expert code reviewer and programmer. Analyze code, find bugs, suggest improvements, and write clean efficient solutions.",
    },
    {
        "role": "user",
        "content": "Review this Python function for bugs and improvements:\n\ndef find_duplicates(lst):\n    seen = []\n    dupes = []\n    for i in lst:\n        if i in seen:\n            dupes.append(i)\n        seen.append(i)\n    return dupes",
    },
]

# return_dict=True yields input_ids plus attention_mask, so generate() can take **inputs
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))
```
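Key Details lists ChatML as the prompt format, and `apply_chat_template` is what renders the message list into it. To make that layout visible, here is a hand-rolled sketch of the ChatML framing. The `to_chatml` helper is hypothetical and for illustration only; the tokenizer's bundled template remains the authoritative source and may differ in details such as default system prompts.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render chat messages as ChatML text (illustrative, not the tokenizer's template)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # An open assistant turn tells the model where to start writing
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {"role": "system", "content": "You are an expert code reviewer and programmer."},
    {"role": "user", "content": "Review this function: def add(a, b): return a - b"},
])
print(prompt)
```

Each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers, and the prompt ends with an unterminated assistant turn for the model to complete.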
### Unsloth (faster inference)
```python
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="sriksven/CodeLens-7B",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)
```
## Capabilities
- **Code review** — analyze code for bugs, anti-patterns, and style issues
- **Bug detection** — identify logical errors, off-by-one mistakes, edge cases
- **Code generation** — write functions, classes, and scripts from descriptions
- **Code explanation** — explain what a piece of code does step by step
- **Refactoring suggestions** — propose cleaner, more efficient alternatives
- **Multi-language** — Python, JavaScript, Java, C/C++, SQL, HTML/CSS, and more
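To make the review and refactoring capabilities concrete, the sketch below takes the `find_duplicates` function from the Usage prompt and shows the kind of findings a reviewer could raise. It is hand-written for illustration and is not output from the model; the function names `find_duplicates_original` and `find_duplicates_reviewed` are made up here.

```python
def find_duplicates_original(lst):
    # Function from the Usage prompt, unchanged apart from the name
    seen = []
    dupes = []
    for i in lst:
        if i in seen:      # list membership is O(n), so the loop is O(n^2)
            dupes.append(i)
        seen.append(i)
    return dupes


def find_duplicates_reviewed(items):
    """Return each duplicated value once, in order of first repeat.

    Uses sets for O(1) average membership checks, so items must be hashable.
    """
    seen = set()
    reported = set()
    dupes = []
    for item in items:
        if item in seen and item not in reported:
            dupes.append(item)
            reported.add(item)
        seen.add(item)
    return dupes


data = [1, 2, 1, 1, 3, 2]
print(find_duplicates_original(data))  # [1, 1, 2]  (1 is reported twice)
print(find_duplicates_reviewed(data))  # [1, 2]
```

Whether repeated reporting counts as a bug depends on the caller's intent, which is the kind of ambiguity a review should surface rather than silently resolve.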
## Intended Use
- Local code review assistant
- Programming tutoring and education
- Code quality tooling in CI/CD pipelines
- Prototyping developer tools with local LLMs
## Limitations
- Trained on instruction-following code data, not real code review conversations from PRs
- May not catch security vulnerabilities that require deep context
- Code suggestions should be tested before use in production
- Best with shorter code snippets (functions/classes) rather than full files
- No execution or testing capability — suggestions are pattern-based
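Since the model works best on function- or class-sized snippets, one possible workaround for larger Python files is to split them into top-level units and review each one separately. The sketch below uses only the standard-library `ast` module; `split_into_units` is a hypothetical helper and not part of this repository.

```python
import ast


def split_into_units(source: str) -> list[tuple[str, str]]:
    """Return (name, source_text) for each top-level function or class."""
    tree = ast.parse(source)
    units = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment recovers the exact text span of the node
            units.append((node.name, ast.get_source_segment(source, node)))
    return units


SAMPLE = '''\
import os

def first(xs):
    return xs[0]

class Stack:
    def push(self, x):
        self.items.append(x)
'''

for name, text in split_into_units(SAMPLE):
    print(f"--- {name} ({len(text.splitlines())} lines)")
```

Each returned snippet can then be sent as its own user message. Module-level statements and cross-function context are dropped by this approach, so issues that span several units may still go unnoticed.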
## Training Metrics
Training loss fell from 2.17 at step 10 to 0.28 at step 500 (~13 epochs). These are training-set values; this card reports no held-out evaluation results.
| Step | Loss | Epoch |
|---|---|---|
| 10 | 2.168 | 0.26 |
| 100 | 0.503 | 2.05 |
| 250 | 0.430 | 6.41 |
| 400 | 0.310 | 10.26 |
| 500 | 0.278 | 12.83 |
## Training Infrastructure
| | |
|---|---|
| **GPU** | NVIDIA RTX A5000 24GB |
| **Cloud** | RunPod ($0.27/hr) |
| **Framework** | Unsloth 2026.5.2 + TRL + Transformers 5.5.0 |
| **Precision** | BF16 training, 4-bit NF4 base quantization |
| **Optimizer** | AdamW 8-bit |
| **Learning rate** | 2e-4, linear decay |
| **Batch size** | 16 effective (4 per device × 4 accumulation) |
| **Packing** | Enabled |
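The figures in these tables can be cross-checked with a little arithmetic. The sketch below is illustrative only: it copies the numbers from this card, assumes a single GPU, and models the learning-rate schedule as a plain linear decay to zero with no warmup, since the card does not state a warmup setting.

```python
# Figures copied from the tables in this card
PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 4
TOTAL_STEPS = 500
FINAL_EPOCH = 12.83      # epoch value logged at step 500
TRAIN_HOURS = 2.65
PRICE_PER_HOUR = 0.27    # USD, RunPod rate
PEAK_LR = 2e-4

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS    # single GPU assumed
sequences_seen = effective_batch * TOTAL_STEPS
steps_per_epoch = TOTAL_STEPS / FINAL_EPOCH
packed_per_epoch = steps_per_epoch * effective_batch
seconds_per_step = TRAIN_HOURS * 3600 / TOTAL_STEPS
cost_usd = TRAIN_HOURS * PRICE_PER_HOUR


def linear_decay_lr(step: int, peak: float = PEAK_LR, total: int = TOTAL_STEPS) -> float:
    """Linear decay from peak to zero; warmup omitted (an assumption)."""
    return peak * max(0.0, 1 - step / total)


print(f"effective batch size:       {effective_batch}")
print(f"sequences processed:        {sequences_seen}")
print(f"steps per epoch:            {steps_per_epoch:.1f}")
print(f"packed sequences per epoch: {packed_per_epoch:.0f}")
print(f"seconds per step:           {seconds_per_step:.2f}")
print(f"approximate GPU cost:       ${cost_usd:.2f}")
print(f"learning rate at step 250:  {linear_decay_lr(250):.1e}")
```

About 39 steps per epoch at batch size 16 implies roughly 620 packed sequences per epoch, far fewer than the 10,000 source examples. That is what packing would be expected to produce, since it concatenates several short examples into each training sequence.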
## Source Code
Training scripts: [github.com/sriksven/LLM-FineTune-Suite](https://github.com/sriksven/LLM-FineTune-Suite)
## License
Apache 2.0