	---
	datasets:
	- priamai/AnnoCTR
	base_model:
	- urchade/gliner_small-v1
	tags:
	- Security
	- NER
	- CTI
	language:
	- en
	---
	# AITSecNER - Entity Recognition for Cybersecurity

	This repository shows how to use AITSecNER, a model hosted on Hugging Face and built on the GLiNER library, to extract cybersecurity-related entities from text.

	## Installation

	Install GLiNER via pip:

	```bash
	pip install gliner
	```

	## Usage

	### Import and Load Model

	Load the pretrained AITSecNER model directly from Hugging Face:

	```python
	from gliner import GLiNER

	model = GLiNER.from_pretrained("selfconstruct3d/AITSecNER", load_tokenizer=True)
	```

	### Predict Entities

	Define the input text and entity labels you wish to extract:

	```python
	# Example input text
	text = """
	Upon opening Emotet maldocs, victims are greeted with fake Microsoft 365 prompt that states
	“THIS DOCUMENT IS PROTECTED,” and instructs victims on how to enable macros.
	"""

	# Entity labels
	labels = [
	'CLICommand/CodeSnippet', 'CON', 'DATE', 'GROUP', 'LOC',
	'MALWARE', 'ORG', 'SECTOR', 'TACTIC', 'TECHNIQUE', 'TOOL'
	]

	# Predict entities
	entities = model.predict_entities(text, labels, threshold=0.5)

	# Display results
	for entity in entities:
	    print(f"{entity['text']} => {entity['label']}")
	```

	### Sample Output

	```text
	Emotet => MALWARE
	Microsoft => ORG
	```
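
The list returned by `predict_entities` can be post-processed with plain Python. The sketch below assumes each entity is a dict with `text`, `label`, and `score` keys (plus `start` and `end` character offsets), which is how GLiNER normally reports results. The helper name `group_by_label` and the sample scores are illustrative only; they are not part of GLiNER and are not real model predictions.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.0):
    """Group entity dicts by label, keeping the best score per unique text."""
    grouped = defaultdict(dict)
    for ent in entities:
        score = ent.get("score", 1.0)
        if score < min_score:
            continue  # drop low-confidence predictions
        best = grouped[ent["label"]].get(ent["text"], float("-inf"))
        grouped[ent["label"]][ent["text"]] = max(best, score)
    return {label: dict(texts) for label, texts in grouped.items()}


# Hand-written stand-in for model output: offsets and scores are
# placeholders chosen for illustration, not actual AITSecNER predictions.
sample_entities = [
    {"start": 14, "end": 20, "text": "Emotet", "label": "MALWARE", "score": 0.90},
    {"start": 60, "end": 69, "text": "Microsoft", "label": "ORG", "score": 0.70},
    {"start": 120, "end": 126, "text": "Emotet", "label": "MALWARE", "score": 0.80},
]

summary = group_by_label(sample_entities, min_score=0.75)
print(summary)
```

With a loaded model, the same helper could be called as `group_by_label(entities, min_score=0.75)` on the `entities` list from the prediction step. Filtering after prediction makes it possible to try several cut-offs without re-running the model.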

	## Model Details

	AITSecNER was fine-tuned from the [urchade/gliner_small](https://huggingface.co/urchade/gliner_small) model on the [priamai/AnnoCTR dataset](https://huggingface.co/datasets/priamai/AnnoCTR). For details about the dataset, see the paper ["AnnoCTR: A Dataset for Detecting and Linking Entities, Tactics, and Techniques in Cyber Threat Reports"](https://arxiv.org/abs/2305.10472).

	GLiNER is described in detail in the paper ["GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer"](https://arxiv.org/abs/2311.08526).

	## About

	AITSecNER uses GLiNER to extract cybersecurity-specific entities from free text, which makes it suited to tasks such as:

	- Cyber threat intelligence analysis
	- Incident response documentation
	- Automated cybersecurity reporting
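
Threat reports and incident write-ups are often much longer than the short passage in the usage example, and transformer-based models such as GLiNER generally accept only a bounded input length. One possible approach, sketched below, is to split a report into word-limited chunks, run prediction on each chunk, and shift the character offsets back to document coordinates. The helpers `chunk_text` and `predict_long_text`, the 200-word default, and the `fake_predict` stand-in are all assumptions made for illustration; none of them is part of GLiNER.

```python
import re


def chunk_text(text, max_words=200):
    """Yield (char_offset, chunk) pairs of at most max_words words each."""
    words = list(re.finditer(r"\S+", text))
    for i in range(0, len(words), max_words):
        window = words[i:i + max_words]
        start, end = window[0].start(), window[-1].end()
        yield start, text[start:end]


def predict_long_text(predict_fn, text, labels, max_words=200, **kwargs):
    """Run predict_fn per chunk and map spans back to document offsets."""
    results = []
    for offset, chunk in chunk_text(text, max_words):
        for ent in predict_fn(chunk, labels, **kwargs):
            shifted = dict(ent)
            shifted["start"] = ent["start"] + offset
            shifted["end"] = ent["end"] + offset
            results.append(shifted)
    return results


def fake_predict(chunk, labels, threshold=0.5):
    """Stand-in for model.predict_entities: tags each literal 'Emotet'."""
    return [
        {"start": m.start(), "end": m.end(), "text": m.group(),
         "label": "MALWARE", "score": 1.0}
        for m in re.finditer("Emotet", chunk)
    ]


report = "Emotet spreads via maldocs. " * 3
found = predict_long_text(fake_predict, report, ["MALWARE"], max_words=4)
for ent in found:
    print(ent["start"], ent["end"], ent["text"], ent["label"])
```

With a loaded model, `fake_predict` would be replaced by `model.predict_entities`, for example `predict_long_text(model.predict_entities, report, labels, threshold=0.5)`. Fixed-size chunks are the simplest choice, but an entity that straddles a chunk boundary can be missed; splitting on sentence or paragraph boundaries, or overlapping the chunks, would reduce that risk.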



	## Licence

	This model is licensed for non-commercial use only (CC BY-NC 4.0).
	For commercial inquiries, please contact dzenan.hamzic@ait.ac.at.