naufalso committed · verified · Commit a1aaeff · Parent: 95b1739

Update README.md

Files changed (1): README.md (+71 −9)
@@ -10,19 +10,81 @@ datasets:
  - naufalso/cybersec-topic-classification-dataset-filtered
  ---

- # Model Trained Using AutoTrain
-
- - Problem type: Text Classification
-
- ## Validation Metrics
- loss: 0.07552551478147507
-
- f1: 0.9142125480153649
-
- precision: 0.9272727272727272
-
- recall: 0.9015151515151515
-
- auc: 0.9906997543262077
-
- accuracy: 0.9727642276422764
+ # Model Card: Cybersecurity Text Classifier (ModernBERT-base)
+
+ <p align="center">
+ <b> "RedSage: A Cybersecurity Generalist LLM" (ICLR 2026) </b>
+ <br>
+ <b>Authors:</b> Naufal Suryanto<sup>1*</sup>, Muzammal Naseer<sup>1</sup>, Pengfei Li<sup>1</sup>, Syed Talal Wasim<sup>2</sup>, Jinhui Yi<sup>2</sup>, Juergen Gall<sup>2</sup>, Paolo Ceravolo<sup>3</sup>, Ernesto Damiani<sup>3</sup>
+ <br>
+ <sup>1</sup>Khalifa University, <sup>2</sup>University of Bonn, <sup>3</sup>University of Milan
+ <br>
+ <sup>*</sup>Project Lead
+ <br>
+ <br>
+ <a href="https://openreview.net/forum?id=W4FAenIrQ2"><img src="https://img.shields.io/badge/Paper-OpenReview-B31B1B.svg"></a>
+ <a href="https://huggingface.co/RISys-Lab"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-RISys--Lab-orange"></a>
+ </p>
+
+ ---
+
+ ## Model Details
+
+ * **Model Type**: Binary text classification model developed for domain-specific content filtering.
+ * **Architecture**: Based on **ModernBERT-base**, a bidirectional transformer encoder optimized for efficiency and long-context performance.
+ * **Domain**: Cybersecurity vs. non-cybersecurity.
+ * **Developers**: Researchers affiliated with Khalifa University, the University of Bonn, the Lamarr Institute, and the University of Milan.
+ * **License**: Released as part of the open-source RedSage project resources.
+
+ ## Intended Use
+
+ * **Primary Use Case**: Identifying cybersecurity-relevant documents within large-scale, unstructured web corpora such as FineWeb.
+ * **Application**: Filtering approximately 17.2 trillion tokens from Common Crawl subsets (2013–2024) to curate the 11.7B-token CyberFineWeb corpus.
+ * **Intended Users**: Researchers and developers focused on domain continual pretraining for cybersecurity LLMs.
+
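The filtering workflow described above can be sketched in a few lines. This is a minimal, illustrative sketch only: `score_cyber` is a hypothetical keyword stub standing in for the actual ModernBERT-base classifier, and the 0.5 decision threshold and 200-character minimum (mirroring the card's removal of very short texts) are assumptions, not values stated in the card.

```python
MIN_CHARS = 200   # assumed cutoff for "very short" texts (illustrative)
THRESHOLD = 0.5   # assumed decision threshold on P(cybersecurity)

def score_cyber(text: str) -> float:
    """Hypothetical stand-in for the classifier's P(cybersecurity) score."""
    keywords = ("malware", "exploit", "vulnerability", "phishing")
    return 1.0 if any(k in text.lower() for k in keywords) else 0.0

def filter_corpus(docs: list[str]) -> list[str]:
    """Keep documents that are long enough and score at/above threshold."""
    return [
        d for d in docs
        if len(d) >= MIN_CHARS and score_cyber(d) >= THRESHOLD
    ]
```

In the real pipeline the stub would be replaced by batched inference with the released checkpoint; the control flow (length gate, then score gate) stays the same.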
+ ## Training Data
+
+ * **Source Dataset**: Cybersecurity Topic Classification dataset.
+ * **Data Origin**: Labeled samples collected from Reddit, StackExchange, and arXiv, alongside web articles.
+ * **Dataset Size**:
+   * **Before filtering**: 9.27M training samples and 459K validation samples.
+   * **After filtering**: 4.62M training samples and 2.46K validation samples, after removing very short texts to minimize ambiguity.
+ * **Labeling Method**: Derived from forum categories, tags, and keyword metadata rather than LLM-generated annotations.
+
+ ## Training Procedure
+
+ * **Optimizer**: Adam.
+ * **Learning Rate**: 2e-5.
+ * **Schedule**: 10% warmup ratio over 2 training epochs.
+ * **Implementation**: A binary classification head fine-tuned on top of the ModernBERT-base encoder.
+
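The schedule above (peak 2e-5 with a 10% warmup ratio) implies the following step-wise learning rate. Note the linear decay after warmup is an assumption on our part (it is the common default for this kind of fine-tuning), not something the card states:

```python
PEAK_LR = 2e-5        # learning rate from the card
WARMUP_RATIO = 0.1    # 10% warmup ratio from the card

def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at a given optimizer step: linear warmup to the
    peak over the first 10% of steps, then (assumed) linear decay."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * (total_steps - step) / (total_steps - warmup_steps)
```

For example, with 1,000 total steps the rate climbs to 2e-5 by step 100 and is back down to 1e-5 at step 550.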
+ ## Evaluation Results
+
+ The model was evaluated on a validation set of 2,460 samples derived from web articles, achieving the following metrics:
+
+ | Metric | Score |
+ | :--- | :--- |
+ | **Accuracy** | 97.3% |
+ | **Precision** | 92.7% |
+ | **Recall** | 90.2% |
+ | **F1 Score** | 91.4% |
+
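These figures are mutually consistent: with 2,460 validation samples and the full-precision values from the previous revision of this card (precision 0.92727…, recall 0.90151…, accuracy 0.97276…), the only integer confusion matrix that reproduces them is TP=357, FP=28, FN=39, TN=2036. This is a back-calculation on our part, not counts published in the card, but it is easy to verify:

```python
# Confusion matrix back-calculated from the reported metrics
# (an inference, not figures published in the card).
TP, FP, FN, TN = 357, 28, 39, 2036

total = TP + FP + FN + TN            # 2460 validation samples
accuracy = (TP + TN) / total
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)

print(f"acc={accuracy:.4f} p={precision:.4f} r={recall:.4f} f1={f1:.4f}")
# prints acc=0.9728 p=0.9273 r=0.9015 f1=0.9142
```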
+ ## Limitations & Risks
+
+ * **Context Sensitivity**: The training data was filtered to exclude very short texts to avoid context ambiguity, so predictions on very short inputs may be less reliable.
+ * **Temporal Bias**: The model identifies cybersecurity content based on trends observed in web data up to late 2024; emerging threats post-2024 may not be represented.
+ * **Dual-Use Concerns**: The classifier is designed to identify offensive security technical content, which carries an inherent risk of misuse if applied outside of defensive or educational research.
+
+ ---
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{suryanto2026redsage,
+   title={RedSage: A Cybersecurity Generalist {LLM}},
+   author={Naufal Suryanto and Muzammal Naseer and Pengfei Li and Syed Talal Wasim and Jinhui Yi and Juergen Gall and Paolo Ceravolo and Ernesto Damiani},
+   booktitle={The Fourteenth International Conference on Learning Representations},
+   year={2026},
+   url={https://openreview.net/forum?id=W4FAenIrQ2}
+ }
+ ```