AetherPrior/CWE-Code_Vulnerability_Security_DPO

AetherPrior committed on Feb 16

Commit afbe1d6 (verified)

1 Parent(s): 3d98807

Add dataset card with CWE labeling details

Files changed (1)

README.md ADDED

+# CWE-Code_Vulnerability_Security_DPO
+This dataset adds a `cwe_ids` field to each row in
+`CyberNative/Code_Vulnerability_Security_DPO` by querying an LLM
+with the row's vulnerability description and rejected code sample.
+## How the CWE IDs were obtained
+- Model: gpt-5-mini via the OpenAI Responses API
+- Reasoning effort: low
+- Tooling: `web_search` limited to `cwe.mitre.org`
+- Prompt format:
+```
+Given the following description of a CWE and a code segment that replicates it:
+Description: {desc}
+Code: {vuln_code}
+What's the most likely CWE ID associated with this vulnerability? Answer in the following format:
+## Answer:
+CWE-#: CWE Name
+```
+- Parsing: lines in the model response that start with `CWE-` are extracted, and only the numeric ID is retained.
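
The prompt-filling and parsing steps described in the card could be sketched roughly as follows. This is a minimal illustration, not the author's actual script: the names `PROMPT_TEMPLATE`, `build_prompt`, and `extract_cwe_ids` are hypothetical, and stripping whitespace before matching and returning IDs as integers are both assumptions the card does not confirm. The model call itself is omitted.

```python
import re

# Prompt text copied from the dataset card; {desc} and {vuln_code} are filled per row.
PROMPT_TEMPLATE = (
    "Given the following description of a CWE and a code segment that replicates it:\n"
    "Description: {desc}\n"
    "Code: {vuln_code}\n"
    "What's the most likely CWE ID associated with this vulnerability? "
    "Answer in the following format:\n"
    "## Answer:\n"
    "CWE-#: CWE Name\n"
)

# Matches a line that begins with "CWE-" followed by digits and captures the digits.
CWE_LINE = re.compile(r"^CWE-(\d+)")


def build_prompt(desc: str, vuln_code: str) -> str:
    """Fill the template with one row's description and rejected code (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(desc=desc, vuln_code=vuln_code)


def extract_cwe_ids(response_text: str) -> list[int]:
    """Collect the numeric ID from every response line that starts with 'CWE-'.

    Assumption: leading/trailing whitespace is ignored and IDs are stored as ints.
    """
    ids = []
    for line in response_text.splitlines():
        match = CWE_LINE.match(line.strip())
        if match:
            ids.append(int(match.group(1)))
    return ids
```

For a response such as `"## Answer:\nCWE-89: SQL Injection"`, `extract_cwe_ids` would return a single-element list containing 89; a response with no line starting with `CWE-` followed by digits yields an empty list.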