Spaces:
Running
Running
| title: CodeSentry | |
| emoji: π‘οΈ | |
| colorFrom: indigo | |
| colorTo: purple | |
| sdk: docker | |
| pinned: false | |
| license: mit | |
| app_port: 7860 | |
| # π‘οΈ CodeSentry | |
| > **CodeSentry** is an enterprise-grade, agentic AI security and performance copilot designed to seamlessly analyze codebases, identify critical vulnerabilities, and generate intelligent, ready-to-merge patches β with built-in CUDA β ROCm migration guidance for AMD hardware. | |
| Built with a strict **Zero Data Retention (ZDR)** architecture, CodeSentry ensures that your proprietary code never leaves your secure environment or gets used for model training, making it perfect for highly sensitive, enterprise-scale environments. | |
| --- | |
| ## β¨ Key Features | |
| - **π§ Agentic Pipeline:** CodeSentry uses a multi-agent orchestration architecture: | |
| - **Security Agent:** Combines lightning-fast static analysis with deep semantic LLM reasoning to catch complex vulnerabilities (e.g., prompt injections, hardcoded secrets, unsafe deserialization). | |
| - **Performance Agent:** Specifically tailored to analyze ML/AI logic. It detects GPU memory bottlenecks, inefficient loop structures, and suggests hardware-native optimizations (like `bfloat16` for AMD MI300X). | |
| - **Fix Agent:** Automatically generates unified Git-style diffs and line-by-line patch recommendations for every finding. | |
| - **AMD Migration Advisor:** Scans for 10 categories of CUDA-specific patterns (nvidia-smi, CUDA_VISIBLE_DEVICES, BitsAndBytes, cuDNN, FP16 usage, etc.) and provides actionable ROCm/HIP migration guidance with a 0β100 AMD Compatibility Score. | |
| - **β‘ AMD MI300X Live Metrics:** Real-time GPU performance monitoring (utilization, VRAM, temperature, power draw, inference speed) streamed to the dashboard during every scan via SSE. Uses `rocm-smi` on AMD hardware, with simulated fallback for development environments. | |
| - **π Zero Data Retention (ZDR):** Every analysis session generates a unique cryptographic Privacy Certificate. The backend actively blocks outgoing network calls during the scan and wipes all data from memory the millisecond the scan completes. | |
| - **β‘ Real-Time Streaming:** The analysis engine uses Server-Sent Events (SSE) to stream findings to the frontend instantaneously, creating a highly responsive "live" dashboard experience. | |
| - **π One-Click Reporting:** Export full `SECURITY_REPORT.md` documents, structured JSON audit logs, copy-paste ready GitHub Pull Request descriptions, and `AMD_MIGRATION_GUIDE.md` reports. | |
| --- | |
| ## ποΈ System Architecture | |
| ``` | |
| ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | |
| β CODESENTRY FRONTEND β | |
| β React + Vite | Cyberpunk Terminal Aesthetic β | |
| β LandingPage β AnalysisView (SSE Live Feed) β ReportView β | |
| β βββββββββββββββββββββ ββββββββββββββββββββββββββ β | |
| β β AMD MI300X Live β β AMD Migration Advisor β β | |
| β β Metrics Card β β Panel + Score Circle β β | |
| β βββββββββββββββββββββ ββββββββββββββββββββββββββ β | |
| βββββββββββββββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββ | |
| β SSE (Server-Sent Events) + REST | |
| βββββββββββββββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββ | |
| β CODESENTRY BACKEND β | |
| β FastAPI / Python β | |
| β β | |
| β βββββββββββββββ ββββββββββββββββββββ ββββββββββββββββββββββ β | |
| β β Security β β Performance β β Fix Agent β β | |
| β β Agent β β Agent β β (patches + diffs) β β | |
| β ββββββββ¬βββββββ ββββββββββ¬ββββββββββ ββββββββββ¬ββββββββββββ β | |
| β β βββββββββΌβββββββββ β β | |
| β β β AMD Migration β β β | |
| β β β Advisor (10 β β β | |
| β β β CUDA patterns) β β β | |
| β β βββββββββ¬βββββββββ β β | |
| β βββββββββββββββββββΊβββββββββββββββββββββββ β | |
| β ββββββββΌβββββββ β | |
| β β Orchestratorβ β | |
| β ββββββββ¬βββββββ β | |
| β β β | |
| β ββββββββββββββββββββββββββββΌββββββββββββββββββββββββββββββββ β | |
| β β Privacy Guard β Session Store β AMD Metrics β Code Parser β β | |
| β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β | |
| β β β | |
| β ββββββββΌβββββββ β | |
| β β vLLM Serverβ (Qwen2.5-Coder-32B) β | |
| β βββββββββββββββ β | |
| ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | |
| ``` | |
| The project is divided into two main components: | |
| ### 1. The Backend (`/codesentry-backend`) | |
| A high-performance **FastAPI** server that acts as the orchestrator. | |
| - Ingests code via GitHub URLs, Hugging Face Spaces URLs, Zip files, or raw code snippets. | |
| - Manages the stateful analysis session and memory lifecycle. | |
| - Runs **AMD MI300X live metrics polling** via `rocm-smi` (with simulated fallback for dev environments). | |
| - Runs the **AMD Migration Advisor** to detect CUDA-specific patterns and calculate an AMD Compatibility Score. | |
| - Connects to an LLM endpoint (optimized for local deployment via `vLLM` on AMD hardware, using Qwen2.5-Coder-32B) to power the intelligent agents. | |
| ### 2. The Frontend (`/codesentry-frontend`) | |
| A modern **React + Vite** dashboard built with a premium, cyberpunk-inspired terminal aesthetic. | |
| - Connects to the backend via SSE for live streaming. | |
| - Features the **AMD MI300X Live Performance Card** in the Analysis View β 6 GPU metrics updated every 2 seconds. | |
| - Features the **AMD ROCm Migration Advisor Panel** in the Report View β animated score circle, collapsible findings, and one-click `AMD_MIGRATION_GUIDE.md` export. | |
| - Dynamic data visualization, animated severity charts, and side-by-side Before/After code diffing for AI-generated fixes. | |
| --- | |
| ## π΄ AMD-Specific Features | |
| ### Live Hardware Metrics (Analysis View) | |
| During every scan, CodeSentry polls the AMD MI300X GPU via `rocm-smi` and streams live metrics to the dashboard: | |
| | Metric | Description | | |
| |--------|-------------| | |
| | GPU Utilization | Current compute load (%) | | |
| | VRAM Used | GB used / 192 GB total with visual bar | | |
| | Memory Bandwidth | TB/s data throughput | | |
| | Temperature | GPU edge temperature (Β°C) | | |
| | Power Draw | Current wattage consumption (W) | | |
| | Inference Speed | LLM tokens per second | | |
| > On development machines without AMD hardware, the card displays realistic simulated values. | |
| ### CUDA β ROCm Migration Advisor (Report View) | |
| The Migration Advisor scans code for 10 categories of CUDA-specific patterns: | |
| | ID | Severity | What It Detects | | |
| |----|----------|-----------------| | |
| | AMD_M01 | Low | `torch.cuda.is_available()` β CUDA device check | | |
| | AMD_M02 | **Critical** | `nvidia-smi` β NVIDIA-only CLI tool | | |
| | AMD_M03 | High | `CUDA_VISIBLE_DEVICES` β CUDA env variable | | |
| | AMD_M04 | High | `torch.cuda.amp.autocast/GradScaler` β Legacy CUDA AMP | | |
| | AMD_M05 | Medium | `.half()` / `torch.float16` β FP16 suboptimal on MI300X | | |
| | AMD_M06 | Medium | `torch.backends.cudnn.*` β cuDNN configuration | | |
| | AMD_M07 | High | `import flash_attn` β CUDA-only Flash Attention | | |
| | AMD_M08 | Low | `torch.cuda.memory_allocated()` β CUDA memory profiling | | |
| | AMD_M09 | Low | `device = 'cuda'` β Hardcoded device string | | |
| | AMD_M10 | **Critical** | `BitsAndBytesConfig` β CUDA-only quantization | | |
| **Compatibility Scoring:** | |
| ``` | |
| β₯ 90% β "Fully ROCm Ready" (green) | |
| β₯ 70% β "Mostly Compatible" (yellow) | |
| β₯ 50% β "Needs Migration Work" (orange) | |
| < 50% β "CUDA-Specific Codebase" (red) | |
| ``` | |
| --- | |
| ## π‘ How It Works (An Example Workflow) | |
| To understand CodeSentry, imagine you have a Python scraping script that takes user input and feeds it into an LLM. | |
| 1. **Initiate Scan:** You paste the GitHub or Hugging Face Space URL of the script into the CodeSentry dashboard. | |
| 2. **Live GPU Monitoring:** The AMD MI300X Live Performance card immediately starts showing real-time GPU utilization, VRAM usage, temperature, and inference speed. | |
| 3. **Security Sweep:** The Security Agent immediately flags `cli.py:61` for a **Prompt Injection** (CWE-74) vulnerability because it detects raw user input being passed to the model without sanitization. | |
| 4. **Performance Sweep:** The Performance Agent notices the code is loading a large transformer model inside a loop. It flags this and estimates you are wasting significant inference time. | |
| 5. **AMD Migration Scan:** The Migration Advisor detects `nvidia-smi` calls and `CUDA_VISIBLE_DEVICES` usage, calculating an AMD Compatibility Score and suggesting `rocm-smi` and `HIP_VISIBLE_DEVICES` replacements. | |
| 6. **Fix Generation:** The Fix Agent takes these findings and writes a patch. It refactors the prompt injection to use a parameterized template and hoists the model initialization outside the loop. | |
| 7. **Review:** You view the dashboard. The findings are categorized by severity. You click on the Prompt Injection finding, and an AI-Generated Fix panel opens showing exactly what lines to change. The AMD Migration Panel shows your compatibility score with collapsible fix guidance. | |
| 8. **Export:** You click "Copy PR Description" and paste a perfectly formatted summary of the fixes directly into your GitHub Pull Request. You also export the `AMD_MIGRATION_GUIDE.md` for your DevOps team. | |
| --- | |
| ## π Installation & Setup | |
| ### Prerequisites | |
| - Node.js (v20.19+ or v22.12+) | |
| - Python (v3.10+) | |
| - An API Key for your LLM provider (e.g., Groq) if not running a completely local vLLM instance. | |
| ### 1. Backend Setup | |
| Open a terminal and navigate to the backend directory: | |
| ```bash | |
| cd codesentry-backend | |
| # Create and activate a virtual environment | |
| python -m venv venv | |
| # On Windows: | |
| venv\Scripts\activate | |
| # On Mac/Linux: | |
| source venv/bin/activate | |
| # Install dependencies | |
| pip install -r requirements.txt | |
| # Configure Environment Variables | |
| # Create a .env file based on the example and add your LLM_API_KEY | |
| cp .env.example .env | |
| # Run the backend server | |
| uvicorn main:app --reload --port 8000 | |
| ``` | |
| *The backend will now be running on `http://127.0.0.1:8000`.* | |
| ### 2. Frontend Setup | |
| Open a second terminal and navigate to the frontend directory: | |
| ```bash | |
| cd codesentry-frontend | |
| # Install dependencies | |
| npm install | |
| # Ensure VITE_MOCK_MODE is set to false to connect to the live backend | |
| echo "VITE_MOCK_MODE=false" > .env | |
| # Run the development server | |
| npm run dev | |
| ``` | |
| *The dashboard will be available at `http://127.0.0.1:5173`.* | |
| --- | |
| ## βοΈ Environment Variables | |
| | Variable | Default | Description | | |
| |---|---|---| | |
| | `VLLM_BASE_URL` | `http://localhost:8080/v1` | vLLM OpenAI-compatible endpoint | | |
| | `MODEL_NAME` | `Qwen/Qwen2.5-Coder-32B-Instruct` | Model served by vLLM | | |
| | `USE_LLM` | `true` | Set `false` for static-only mode (CI) | | |
| | `PORT` | `8000` | CodeSentry API port | | |
| | `CORS_ORIGINS` | `*` | Allowed frontend origins | | |
| | `ZDR_SIGNING_KEY` | (dev default) | HMAC key for certificates β **change in production** | | |
| | `GROQ_API_KEY` | β | Groq cloud API key (alternative to local vLLM) | | |
| | `VITE_MOCK_MODE` | `false` | Frontend: use mock data instead of live backend | | |
| | `VITE_API_URL` | `http://localhost:8000` | Frontend: backend base URL | | |
| --- | |
| ## π SSE Event Types | |
| | Event | Description | | |
| |-------|-------------| | |
| | `scan_started` | Scan session created, ID returned | | |
| | `agent_start` | An agent begins (security / performance / fix) | | |
| | `finding` | A security or performance vulnerability found | | |
| | `fix_ready` | A fix patch generated for a specific finding | | |
| | `amd_metrics` | Live AMD MI300X GPU metrics snapshot (every 2s) | | |
| | `amd_migration_finding` | A CUDA β ROCm migration issue detected | | |
| | `amd_migration_summary` | Compatibility score and summary | | |
| | `complete` | Full analysis finished with summary + certificates | | |
| | `error` | An error occurred during analysis | | |
| --- | |
| ## π¦ Export Formats | |
| | Format | Description | | |
| |--------|-------------| | |
| | π **JSON Report** | Machine-readable full report with all findings and fixes | | |
| | π **SECURITY_REPORT.md** | Human-readable markdown security report | | |
| | π **Copy PR Description** | GitHub Pull Request description copied to clipboard | | |
| | π΄ **AMD_MIGRATION_GUIDE.md** | AMD ROCm migration guide with score, findings, and fixes | | |
| --- | |
| ## π Built for the AMD Hackathon | |
| CodeSentry was specifically designed to showcase the power of **Agentic AI** running on high-performance AMD MI300X compute hardware. By combining a suite of specialized agents with real-time GPU monitoring and CUDA β ROCm migration guidance, we shift the paradigm of static code analysis from "reporting problems" to "actively writing solutions." | |
| **Zero Data Retention. 100% Agentic. AMD-Optimized. Enterprise Ready.** | |