---
title: CodeSentry
emoji: 🛡️
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860
---
# 🛡️ CodeSentry
> **CodeSentry** is an enterprise-grade, agentic AI security and performance copilot designed to seamlessly analyze codebases, identify critical vulnerabilities, and generate intelligent, ready-to-merge patches, with built-in CUDA → ROCm migration guidance for AMD hardware.
Built with a strict **Zero Data Retention (ZDR)** architecture, CodeSentry ensures that your proprietary code never leaves your secure environment or gets used for model training, making it perfect for highly sensitive, enterprise-scale environments.
---
## ✨ Key Features
- **🧠 Agentic Pipeline:** CodeSentry uses a multi-agent orchestration architecture:
  - **Security Agent:** Combines fast static analysis with deep semantic LLM reasoning to catch complex vulnerabilities (e.g., prompt injections, hardcoded secrets, unsafe deserialization).
  - **Performance Agent:** Specifically tailored to analyze ML/AI logic. It detects GPU memory bottlenecks and inefficient loop structures, and suggests hardware-native optimizations (like `bfloat16` for AMD MI300X).
  - **Fix Agent:** Automatically generates unified Git-style diffs and line-by-line patch recommendations for every finding.
  - **AMD Migration Advisor:** Scans for 10 categories of CUDA-specific patterns (nvidia-smi, CUDA_VISIBLE_DEVICES, BitsAndBytes, cuDNN, FP16 usage, etc.) and provides actionable ROCm/HIP migration guidance with a 0–100 AMD Compatibility Score.
- **⚡ AMD MI300X Live Metrics:** Real-time GPU performance monitoring (utilization, VRAM, temperature, power draw, inference speed) streamed to the dashboard during every scan via SSE. Uses `rocm-smi` on AMD hardware, with a simulated fallback for development environments.
- **🔒 Zero Data Retention (ZDR):** Every analysis session generates a unique cryptographic Privacy Certificate. The backend actively blocks outgoing network calls during the scan and wipes all session data from memory as soon as the scan completes.
- **⚡ Real-Time Streaming:** The analysis engine uses Server-Sent Events (SSE) to stream findings to the frontend as they are produced, creating a highly responsive "live" dashboard experience.
- **📋 One-Click Reporting:** Export full `SECURITY_REPORT.md` documents, structured JSON audit logs, copy-paste ready GitHub Pull Request descriptions, and `AMD_MIGRATION_GUIDE.md` reports.
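One way a signed Privacy Certificate like the one described above could work is an HMAC over the session metadata. The sketch below is only an illustration under stated assumptions: the payload fields (`session_id`, `files_scanned`, `completed_at`, `data_retained`), the function names, the fallback key, and the choice of HMAC-SHA256 are hypothetical and are not taken from the CodeSentry source. Only the `ZDR_SIGNING_KEY` variable name comes from this README.

```python
import hashlib
import hmac
import json
import os

# Assumption: the signing key comes from ZDR_SIGNING_KEY; the fallback is a placeholder.
SIGNING_KEY = os.environ.get("ZDR_SIGNING_KEY", "dev-only-key").encode()


def _canonical(payload: dict) -> bytes:
    # Canonical JSON so the same payload always yields the same signature.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def issue_certificate(session_id: str, files_scanned: int, completed_at: str) -> dict:
    """Build a certificate payload and sign it (hypothetical format)."""
    payload = {
        "session_id": session_id,
        "files_scanned": files_scanned,
        "completed_at": completed_at,
        "data_retained": False,
    }
    signature = hmac.new(SIGNING_KEY, _canonical(payload), hashlib.sha256).hexdigest()
    return {**payload, "signature": signature}


def verify_certificate(cert: dict) -> bool:
    """Recompute the signature and compare it in constant time."""
    payload = {k: v for k, v in cert.items() if k != "signature"}
    expected = hmac.new(SIGNING_KEY, _canonical(payload), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, cert.get("signature", ""))
```

Any change to a signed field invalidates the certificate, which is why the payload is serialized canonically before signing.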
---
## 🏗️ System Architecture
```
┌──────────────────────────────────────────────────────────────────┐
│                       CODESENTRY FRONTEND                        │
│           React + Vite | Cyberpunk Terminal Aesthetic            │
│     LandingPage → AnalysisView (SSE Live Feed) → ReportView      │
│      ┌───────────────────┐      ┌────────────────────────┐       │
│      │ AMD MI300X Live   │      │ AMD Migration Advisor  │       │
│      │ Metrics Card      │      │ Panel + Score Circle   │       │
│      └───────────────────┘      └────────────────────────┘       │
└─────────────────────────────┬────────────────────────────────────┘
                              │  SSE (Server-Sent Events) + REST
┌─────────────────────────────▼────────────────────────────────────┐
│                        CODESENTRY BACKEND                        │
│                         FastAPI / Python                         │
│                                                                  │
│  ┌─────────────┐  ┌──────────────────┐  ┌────────────────────┐   │
│  │  Security   │  │   Performance    │  │     Fix Agent      │   │
│  │    Agent    │  │      Agent       │  │ (patches + diffs)  │   │
│  └──────┬──────┘  └────────┬─────────┘  └────────┬───────────┘   │
│         │          ┌───────▼────────┐            │               │
│         │          │ AMD Migration  │            │               │
│         │          │ Advisor (10    │            │               │
│         │          │ CUDA patterns) │            │               │
│         │          └───────┬────────┘            │               │
│         └─────────────────►│◄────────────────────┘               │
│                     ┌──────▼──────┐                              │
│                     │ Orchestrator│                              │
│                     └──────┬──────┘                              │
│                            │                                     │
│  ┌─────────────────────────▼─────────────────────────────────┐   │
│  │ Privacy Guard │ Session Store │ AMD Metrics │ Code Parser │   │
│  └─────────────────────────┬─────────────────────────────────┘   │
│                            │                                     │
│                     ┌──────▼──────┐                              │
│                     │ vLLM Server │  (Qwen2.5-Coder-32B)         │
│                     └─────────────┘                              │
└──────────────────────────────────────────────────────────────────┘
```
The project is divided into two main components:
### 1. The Backend (`/codesentry-backend`)
A high-performance **FastAPI** server that acts as the orchestrator.
- Ingests code via GitHub URLs, Hugging Face Spaces URLs, Zip files, or raw code snippets.
- Manages the stateful analysis session and memory lifecycle.
- Runs **AMD MI300X live metrics polling** via `rocm-smi` (with simulated fallback for dev environments).
- Runs the **AMD Migration Advisor** to detect CUDA-specific patterns and calculate an AMD Compatibility Score.
- Connects to an LLM endpoint (optimized for local deployment via `vLLM` on AMD hardware, using Qwen2.5-Coder-32B) to power the intelligent agents.
### 2. The Frontend (`/codesentry-frontend`)
A modern **React + Vite** dashboard built with a premium, cyberpunk-inspired terminal aesthetic.
- Connects to the backend via SSE for live streaming.
- Features the **AMD MI300X Live Performance Card** in the Analysis View: 6 GPU metrics updated every 2 seconds.
- Features the **AMD ROCm Migration Advisor Panel** in the Report View: animated score circle, collapsible findings, and one-click `AMD_MIGRATION_GUIDE.md` export.
- Dynamic data visualization, animated severity charts, and side-by-side Before/After code diffing for AI-generated fixes.
---
## 🔴 AMD-Specific Features
### Live Hardware Metrics (Analysis View)
During every scan, CodeSentry polls the AMD MI300X GPU via `rocm-smi` and streams live metrics to the dashboard:
| Metric | Description |
|--------|-------------|
| GPU Utilization | Current compute load (%) |
| VRAM Used | GB used / 192 GB total with visual bar |
| Memory Bandwidth | TB/s data throughput |
| Temperature | GPU edge temperature (°C) |
| Power Draw | Current wattage consumption (W) |
| Inference Speed | LLM tokens per second |
> On development machines without AMD hardware, the card displays realistic simulated values.
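The polling-with-fallback behavior described above can be sketched roughly as follows. This is a minimal illustration, not CodeSentry's implementation: the function names, the simulated value ranges, and the returned dictionary shape are assumptions, and the `rocm-smi` flag set used here (`--showuse --showtemp --showpower --json`) should be checked against the ROCm version you have installed.

```python
import json
import random
import shutil
import subprocess


def simulated_metrics() -> dict:
    """Plausible placeholder values for machines without an AMD GPU (made-up ranges)."""
    return {
        "source": "simulated",
        "gpu_utilization_pct": round(random.uniform(40, 95), 1),
        "vram_used_gb": round(random.uniform(20, 120), 1),
        "vram_total_gb": 192,
        "temperature_c": round(random.uniform(45, 75), 1),
        "power_draw_w": round(random.uniform(300, 700), 1),
    }


def read_gpu_metrics(timeout_s: float = 2.0) -> dict:
    """Query rocm-smi when it is installed; otherwise fall back to simulation."""
    if shutil.which("rocm-smi") is None:
        return simulated_metrics()
    try:
        result = subprocess.run(
            ["rocm-smi", "--showuse", "--showtemp", "--showpower", "--json"],
            capture_output=True, text=True, timeout=timeout_s, check=True,
        )
        # JSON key names differ between ROCm releases, so the raw per-card
        # dictionary is returned here instead of being parsed into fields.
        return {"source": "rocm-smi", "raw": json.loads(result.stdout)}
    except (subprocess.SubprocessError, OSError, json.JSONDecodeError):
        return simulated_metrics()
```

Calling `read_gpu_metrics()` on a timer (for example every 2 seconds) and forwarding the result as an `amd_metrics` event would match the flow described in this section.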
### CUDA → ROCm Migration Advisor (Report View)
The Migration Advisor scans code for 10 categories of CUDA-specific patterns:
| ID | Severity | What It Detects |
|----|----------|-----------------|
| AMD_M01 | Low | `torch.cuda.is_available()`: CUDA device check |
| AMD_M02 | **Critical** | `nvidia-smi`: NVIDIA-only CLI tool |
| AMD_M03 | High | `CUDA_VISIBLE_DEVICES`: CUDA env variable |
| AMD_M04 | High | `torch.cuda.amp.autocast/GradScaler`: Legacy CUDA AMP |
| AMD_M05 | Medium | `.half()` / `torch.float16`: FP16 suboptimal on MI300X |
| AMD_M06 | Medium | `torch.backends.cudnn.*`: cuDNN configuration |
| AMD_M07 | High | `import flash_attn`: CUDA-only Flash Attention |
| AMD_M08 | Low | `torch.cuda.memory_allocated()`: CUDA memory profiling |
| AMD_M09 | Low | `device = 'cuda'`: Hardcoded device string |
| AMD_M10 | **Critical** | `BitsAndBytesConfig`: CUDA-only quantization |
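A pattern scan of this kind can be approximated with line-by-line regular expressions. The sketch below covers only five of the ten rule IDs, and the regular expressions, the `RULES` list, and the `scan_source` function are illustrative guesses rather than the patterns CodeSentry actually ships.

```python
import re

# Subset of the rules in the table above; the regexes are assumptions.
RULES = [
    ("AMD_M02", "critical", re.compile(r"\bnvidia-smi\b")),
    ("AMD_M03", "high", re.compile(r"\bCUDA_VISIBLE_DEVICES\b")),
    ("AMD_M07", "high", re.compile(r"^\s*(?:import|from)\s+flash_attn\b")),
    ("AMD_M09", "low", re.compile(r"device\s*=\s*['\"]cuda['\"]")),
    ("AMD_M10", "critical", re.compile(r"\bBitsAndBytesConfig\b")),
]


def scan_source(source: str) -> list[dict]:
    """Return one finding per (line, rule) match, in line order."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for rule_id, severity, pattern in RULES:
            if pattern.search(line):
                findings.append({"id": rule_id, "severity": severity, "line": line_no})
    return findings
```

A regex pass like this is cheap but shallow: it will also match inside comments and strings, which is one reason a real advisor might combine it with parsing or LLM review.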
**Compatibility Scoring:**
```
≥ 90% → "Fully ROCm Ready" (green)
≥ 70% → "Mostly Compatible" (yellow)
≥ 50% → "Needs Migration Work" (orange)
< 50% → "CUDA-Specific Codebase" (red)
```
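The thresholds above map directly onto a small lookup function. This README does not say how the score itself is computed from the findings, so the sketch covers only the score-to-label step; the function name `compatibility_label` is hypothetical.

```python
def compatibility_label(score: float) -> tuple[str, str]:
    """Map a 0-100 AMD Compatibility Score to its label and badge color."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return ("Fully ROCm Ready", "green")
    if score >= 70:
        return ("Mostly Compatible", "yellow")
    if score >= 50:
        return ("Needs Migration Work", "orange")
    return ("CUDA-Specific Codebase", "red")
```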
---
## 💡 How It Works (An Example Workflow)
To understand CodeSentry, imagine you have a Python scraping script that takes user input and feeds it into an LLM.
1. **Initiate Scan:** You paste the GitHub or Hugging Face Space URL of the script into the CodeSentry dashboard.
2. **Live GPU Monitoring:** The AMD MI300X Live Performance card immediately starts showing real-time GPU utilization, VRAM usage, temperature, and inference speed.
3. **Security Sweep:** The Security Agent immediately flags `cli.py:61` for a **Prompt Injection** (CWE-74) vulnerability because it detects raw user input being passed to the model without sanitization.
4. **Performance Sweep:** The Performance Agent notices the code is loading a large transformer model inside a loop. It flags this and estimates you are wasting significant inference time.
5. **AMD Migration Scan:** The Migration Advisor detects `nvidia-smi` calls and `CUDA_VISIBLE_DEVICES` usage, calculating an AMD Compatibility Score and suggesting `rocm-smi` and `HIP_VISIBLE_DEVICES` replacements.
6. **Fix Generation:** The Fix Agent takes these findings and writes a patch. It refactors the prompt injection to use a parameterized template and hoists the model initialization outside the loop.
7. **Review:** You view the dashboard. The findings are categorized by severity. You click on the Prompt Injection finding, and an AI-Generated Fix panel opens showing exactly what lines to change. The AMD Migration Panel shows your compatibility score with collapsible fix guidance.
8. **Export:** You click "Copy PR Description" and paste a perfectly formatted summary of the fixes directly into your GitHub Pull Request. You also export the `AMD_MIGRATION_GUIDE.md` for your DevOps team.
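To make steps 4 and 6 concrete, here is a small before/after sketch of the "model loaded inside a loop" finding and the hoisting fix. Everything in it is a stand-in: `load_model`, the `LOADS` counter, and the two `summarize_*` functions are hypothetical and only simulate an expensive load so the difference can be counted.

```python
# Counts how many times the (simulated) expensive load runs.
LOADS = {"count": 0}


def load_model(name: str):
    """Stand-in for an expensive model load (hypothetical helper)."""
    LOADS["count"] += 1
    return lambda prompt: f"[{name}] {prompt}"


def summarize_slow(pages: list[str]) -> list[str]:
    # Before: the model is re-loaded on every iteration.
    results = []
    for page in pages:
        model = load_model("demo-model")
        results.append(model(page))
    return results


def summarize_fast(pages: list[str]) -> list[str]:
    # After: the load is hoisted out of the loop and the model is reused.
    model = load_model("demo-model")
    return [model(page) for page in pages]
```

Both functions return the same results; the hoisted version performs one load regardless of how many pages are processed.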
---
## 🚀 Installation & Setup
### Prerequisites
- Node.js (v20.19+ or v22.12+)
- Python (v3.10+)
- An API Key for your LLM provider (e.g., Groq) if not running a completely local vLLM instance.
### 1. Backend Setup
Open a terminal and navigate to the backend directory:
```bash
cd codesentry-backend

# Create and activate a virtual environment
python -m venv venv
# On Windows, run this instead of the `source` line below:
#   venv\Scripts\activate
# On Mac/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure environment variables:
# create a .env file from the example, then add your LLM_API_KEY
cp .env.example .env

# Run the backend server
uvicorn main:app --reload --port 8000
```
*The backend will now be running on `http://127.0.0.1:8000`.*
### 2. Frontend Setup
Open a second terminal and navigate to the frontend directory:
```bash
cd codesentry-frontend
# Install dependencies
npm install
# Ensure VITE_MOCK_MODE is set to false to connect to the live backend
echo "VITE_MOCK_MODE=false" > .env
# Run the development server
npm run dev
```
*The dashboard will be available at `http://127.0.0.1:5173`.*
---
## ⚙️ Environment Variables
| Variable | Default | Description |
|---|---|---|
| `VLLM_BASE_URL` | `http://localhost:8080/v1` | vLLM OpenAI-compatible endpoint |
| `MODEL_NAME` | `Qwen/Qwen2.5-Coder-32B-Instruct` | Model served by vLLM |
| `USE_LLM` | `true` | Set `false` for static-only mode (CI) |
| `PORT` | `8000` | CodeSentry API port |
| `CORS_ORIGINS` | `*` | Allowed frontend origins |
| `ZDR_SIGNING_KEY` | (dev default) | HMAC key for certificates; **change in production** |
| `GROQ_API_KEY` | (none) | Groq cloud API key (alternative to local vLLM) |
| `VITE_MOCK_MODE` | `false` | Frontend: use mock data instead of live backend |
| `VITE_API_URL` | `http://localhost:8000` | Frontend: backend base URL |
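As a rough illustration of how the backend variables in this table could be read with their documented defaults, consider the sketch below. The `Settings` class and `load_settings` function are hypothetical, and treating `CORS_ORIGINS` as a comma-separated list is an assumption; the variable names and default values are the ones listed above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    vllm_base_url: str
    model_name: str
    use_llm: bool
    port: int
    cors_origins: list[str]


def load_settings(env=os.environ) -> Settings:
    """Read backend settings, falling back to the documented defaults."""
    origins = env.get("CORS_ORIGINS", "*")
    return Settings(
        vllm_base_url=env.get("VLLM_BASE_URL", "http://localhost:8080/v1"),
        model_name=env.get("MODEL_NAME", "Qwen/Qwen2.5-Coder-32B-Instruct"),
        use_llm=env.get("USE_LLM", "true").strip().lower() == "true",
        port=int(env.get("PORT", "8000")),
        # Assumption: multiple origins are given as a comma-separated list.
        cors_origins=[o.strip() for o in origins.split(",") if o.strip()],
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"USE_LLM": "false"})` for static-only mode.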
---
## 📊 SSE Event Types
| Event | Description |
|-------|-------------|
| `scan_started` | Scan session created, ID returned |
| `agent_start` | An agent begins (security / performance / fix) |
| `finding` | A security vulnerability or performance issue was found |
| `fix_ready` | A fix patch generated for a specific finding |
| `amd_metrics` | Live AMD MI300X GPU metrics snapshot (every 2s) |
| `amd_migration_finding` | A CUDA → ROCm migration issue detected |
| `amd_migration_summary` | Compatibility score and summary |
| `complete` | Full analysis finished with summary + certificates |
| `error` | An error occurred during analysis |
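A client consuming these events needs to split the stream into named events. The sketch below is a minimal parser for the standard SSE wire format (`event:` and `data:` lines, with a blank line ending each event). It assumes each event's data is a JSON object; the payload fields used in the usage note are made up for illustration, since this README does not document the payload schema.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, payload) pairs from a stream of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates one event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))
```

Feeding it the lines `event: scan_started`, `data: {"session_id": "abc"}`, and an empty line yields a single `("scan_started", {"session_id": "abc"})` pair, which a client could then dispatch on the event name.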
---
## 📦 Export Formats
| Format | Description |
|--------|-------------|
| 📄 **JSON Report** | Machine-readable full report with all findings and fixes |
| 📝 **SECURITY_REPORT.md** | Human-readable markdown security report |
| 📋 **Copy PR Description** | GitHub Pull Request description copied to clipboard |
| 🔴 **AMD_MIGRATION_GUIDE.md** | AMD ROCm migration guide with score, findings, and fixes |
---
## Built for the AMD Hackathon
CodeSentry was specifically designed to showcase the power of **Agentic AI** running on high-performance AMD MI300X compute hardware. By combining a suite of specialized agents with real-time GPU monitoring and CUDA → ROCm migration guidance, we shift the paradigm of static code analysis from "reporting problems" to "actively writing solutions."
**Zero Data Retention. 100% Agentic. AMD-Optimized. Enterprise Ready.**