| title: Code Security Risk Analyzer | |
| emoji: π | |
| colorFrom: red | |
| colorTo: purple | |
| sdk: gradio | |
| sdk_version: 6.13.0 | |
| app_file: app.py | |
| pinned: true | |
| license: apache-2.0 | |
| tags: | |
| - security | |
| - vulnerability-detection | |
| - owasp | |
| - cwe | |
| - code-analysis | |
| - static-analysis | |
| short_description: AI-powered code vulnerability detection with OWASP mapping | |
| # Code Security Risk Analyzer v2 | |
| AI-powered multi-label vulnerability detection across **30 CWE categories** mapped to **OWASP Top 10 2021**. Supports Python, JavaScript, Java, C, C++, PHP, and Go. | |
| ## v2 Improvements | |
| - **Per-class threshold optimization**: each CWE has its own tuned detection threshold instead of a single global 0.3 | |
| - **Temperature-calibrated probabilities**: confidence scores are meaningful (a score of 0.8 means roughly 80% of such predictions are true positives) | |
| - **CWE-aware fix generation**: the fixer model knows *which* vulnerability to fix | |
| - **3.7x larger fixer model**: CodeT5+ 220M (previously flan-t5-small, 60M) | |
| - **Asymmetric Loss training**: handles the class imbalance (about 90% of samples are safe) | |
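As a rough illustration of how the first two improvements could fit together, the sketch below divides each logit by a temperature before the sigmoid and then compares each probability against its own per-CWE threshold. The labels, thresholds, logits, and temperature are made-up values, and `predict_labels` is a hypothetical helper, not the Space's actual code.

```python
import math


def calibrated_probs(logits, temperature):
    """Sigmoid of temperature-scaled logits (one independent score per CWE)."""
    return [1.0 / (1.0 + math.exp(-z / temperature)) for z in logits]


def predict_labels(logits, labels, thresholds, temperature=1.0):
    """Keep each label whose calibrated probability reaches its own threshold."""
    probs = calibrated_probs(logits, temperature)
    return [(label, p) for label, p in zip(labels, probs) if p >= thresholds[label]]


# Hypothetical values for illustration only; the real thresholds and
# temperature are learned on validation data and are not listed here.
labels = ["CWE-79", "CWE-89", "CWE-22"]
thresholds = {"CWE-79": 0.45, "CWE-89": 0.30, "CWE-22": 0.60}
logits = [0.4, -0.5, 0.2]

detected = predict_labels(logits, labels, thresholds, temperature=1.5)
for label, prob in detected:
    print(f"{label}: {prob:.2f}")
```

With these numbers, CWE-89 is reported even though its probability is below 0.5, while CWE-22 is suppressed despite being above 0.5. That is the kind of behavior a single global threshold cannot express.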
| ## Model Performance | |
| | Model | Metric | Score | | |
| |-------|--------|-------| | |
| | **Classifier** (GraphCodeBERT 125M) | Macro F1 | **0.476** (+311% vs baseline) | | |
| | | Weighted F1 | **0.945** | | |
| | | Safe Detection F1 | **0.982** | | |
| | **Fixer** (CodeT5+ 220M) | BLEU | **81.0** | | |
| | | ROUGE-L | **0.788** | | |
| | | Eval Loss | **0.175** (3.1x better than v1) | | |
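The gap between Macro F1 and Weighted F1 in the table is what one would expect under heavy class imbalance: macro averaging weights every class equally, while weighted averaging is dominated by the large safe class. The sketch below uses invented per-class counts (not the model's real confusion data) to show the effect.

```python
def f1(tp, fp, fn):
    """F1 score from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical (tp, fp, fn) counts per class, for illustration only.
counts = {
    "safe": (900, 20, 10),
    "CWE-79": (30, 20, 30),
    "CWE-89": (10, 15, 20),
}

per_class_f1 = {name: f1(*c) for name, c in counts.items()}
support = {name: tp + fn for name, (tp, fp, fn) in counts.items()}

# Macro F1: unweighted mean over classes, so rare CWEs count as much as "safe".
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Weighted F1: mean weighted by class support, so "safe" dominates.
weighted_f1 = sum(per_class_f1[n] * support[n] for n in counts) / sum(support.values())

print(f"macro F1 = {macro_f1:.3f}, weighted F1 = {weighted_f1:.3f}")
```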
| ## Features | |
| - **Detection Model:** [GraphCodeBERT classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier): 125M params, two-phase training with Asymmetric Loss (ASL) | |
| - **Fix Generator:** [CodeT5+ 220M](https://huggingface.co/ayshajavd/codet5p-vuln-fixer): CWE-aware input format, beam search generation | |
| - **Structured Reports:** CWE ID, OWASP category, severity score, exploit likelihood, plain English explanation | |
| - **Attack Chain Analysis:** Shows how multiple detected vulnerabilities could be chained together | |
| - **REST API:** JSON endpoint for integration into CI/CD pipelines | |
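For readers unfamiliar with the Asymmetric Loss used to train the detection model, here is a minimal pure-Python sketch of the usual formulation: positives use a plain (or lightly focused) log loss, while negatives are down-weighted by a focusing exponent and a probability margin so that the many easy safe samples contribute little. The hyperparameters `gamma_pos`, `gamma_neg`, and `margin` are illustrative defaults and are assumptions, not values confirmed for this project.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0, margin=0.05, eps=1e-8):
    """Sum of per-label asymmetric losses for predicted probabilities in [0, 1]."""
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive label: focal-style term, usually with little or no focusing.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative label: shift the probability down by the margin, so very
            # easy negatives (p <= margin) contribute exactly zero loss.
            p_shifted = max(p - margin, 0.0)
            total += -(p_shifted ** gamma_neg) * math.log(max(1.0 - p_shifted, eps))
    return total


easy_negative = asymmetric_loss([0.04], [0])  # below the margin
hard_negative = asymmetric_loss([0.90], [0])  # confident false alarm
positive = asymmetric_loss([0.50], [1])       # uncertain true vulnerability
print(easy_negative, hard_negative, positive)
```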
| ## API Usage | |
| ```python | |
| from gradio_client import Client | |
| client = Client("ayshajavd/code-security-analyzer") | |
| # Get markdown report | |
| report = client.predict(code="your code here", api_name="/analyze") | |
| # Get structured JSON report | |
| json_report = client.predict(code="your code here", api_name="/get_json_report") | |
| ``` | |
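One way the JSON endpoint might be used as a CI gate is sketched below. The report schema is not documented here, so the `vulnerabilities` list, the `severity` field, and its string levels are hypothetical placeholders; inspect an actual response and adjust the field names before relying on this. `fetch_report` needs network access and is not called in the example.

```python
import json

# Assumed severity levels, lowest to highest (hypothetical).
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def fetch_report(code):
    """Call the Space's JSON endpoint (requires network and gradio_client)."""
    from gradio_client import Client  # lazy import so the gate logic works offline

    client = Client("ayshajavd/code-security-analyzer")
    return client.predict(code=code, api_name="/get_json_report")


def should_fail_build(report, min_severity="high"):
    """Return True if any finding is at or above min_severity.

    Accepts a dict or a JSON string; the field names are assumptions.
    """
    if isinstance(report, str):
        report = json.loads(report)
    limit = SEVERITY_ORDER.index(min_severity)
    for finding in report.get("vulnerabilities", []):
        severity = str(finding.get("severity", "")).lower()
        if severity in SEVERITY_ORDER and SEVERITY_ORDER.index(severity) >= limit:
            return True
    return False


# Offline example with a hand-written report in the assumed shape.
sample = {"vulnerabilities": [{"cwe": "CWE-89", "severity": "high"}]}
print(should_fail_build(sample))
```

In a pipeline, a script could call `fetch_report` on each changed file and exit with a non-zero status when `should_fail_build` returns `True`.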
| ## Models & Dataset | |
| - [graphcodebert-vuln-classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier): Multi-label CWE detection | |
| - [codet5p-vuln-fixer](https://huggingface.co/ayshajavd/codet5p-vuln-fixer): Vulnerability fix generation | |
| - [code-security-vulnerability-dataset](https://huggingface.co/datasets/ayshajavd/code-security-vulnerability-dataset): 175K labeled samples | |
| ## Training Notebooks | |
| All training code: [vuln-classifier-training-notebooks](https://huggingface.co/ayshajavd/vuln-classifier-training-notebooks) | |