| license: unknown | |
| # Overview | |
| <!-- This model is obtained by finetuning Pre-Trained RoBERTa on dataset containing several sets of malicious prompts. | |
| Using this model, we can classify malicious prompts that can lead towards creation of phishing websites and phishing emails. | |
| This model is obtained by finetuning a Pre-Trained RoBERTa using a dataset encompassing multiple sets of malicious prompts, as detailed in the corresponding arXiv paper. | |
| Using this model, we can classify malicious prompts that can lead towards creation of phishing websites and phishing emails. --> | |
| Our model, "Is it Phish?", is designed to identify malicious prompts that could be used to generate phishing websites and emails with popular commercial LLMs such as ChatGPT, Bard, and Claude. | |
| It was obtained by fine-tuning a pre-trained RoBERTa model on a dataset comprising multiple sets of malicious prompts. | |
| Try out "Is it Phish?" using the Inference API. The model assigns "Label 1" to prompts it identifies as phishing attempts and "Label 0" to prompts it considers safe and non-malicious. | |
| ## Dataset Details | |
| The training dataset was built from malicious prompts generated by GPT-4. | |
| Due to ethical concerns, our dataset is currently available only upon request. | |
| ## Training Details | |
| The model was loaded with `RobertaForSequenceClassification.from_pretrained` and then fine-tuned. | |
| Both the model weights and the tokenizer come from RoBERTa-base. | |
| We trained for 10 epochs with a learning rate of 2e-5, using the AdamW optimizer. | |
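The setup described above can be sketched roughly as follows. This is a minimal illustration and not the authors' training script: the `FineTuneConfig` class, the `build_model_and_optimizer` and `train` helpers, and the plain training loop are all hypothetical names and choices introduced here. Only the base model, the epoch count, the learning rate, and the optimizer come from the model card; `num_labels=2` is inferred from the two labels the card describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    # Values stated in the model card.
    base_model: str = "roberta-base"
    epochs: int = 10
    learning_rate: float = 2e-5
    # Inferred from the two output labels (0 = safe, 1 = phishing).
    num_labels: int = 2


def build_model_and_optimizer(cfg: FineTuneConfig):
    """Load RoBERTa-base with a classification head and create an AdamW optimizer."""
    # Imported lazily so the config can be inspected without torch installed.
    import torch
    from transformers import AutoTokenizer, RobertaForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(cfg.base_model)
    model = RobertaForSequenceClassification.from_pretrained(
        cfg.base_model, num_labels=cfg.num_labels
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=cfg.learning_rate)
    return tokenizer, model, optimizer


def train(model, optimizer, batches, epochs):
    """Plain training loop (an assumption; the card does not say how training was driven).

    `batches` must be re-iterable (a list or DataLoader) and yield dicts with
    `input_ids`, `attention_mask`, and `labels` tensors.
    """
    model.train()
    for _ in range(epochs):
        for batch in batches:
            optimizer.zero_grad()
            loss = model(**batch).loss  # the model computes the loss when labels are passed
            loss.backward()
            optimizer.step()
```

Batch size, sequence length, and any learning-rate schedule are not given in the card, so they are left out of the sketch.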
| ## Inference | |
| There are multiple ways to use this model. The simplest is the `text-classification` pipeline: | |
| ```python | |
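| from transformers import pipeline | |
| # top_k=None returns the scores for every label, not only the top one | |
| classifier = pipeline(task="text-classification", model="phishbot/Isitphish", top_k=None) | |
| prompt = ["Your Sample Sentence or Prompt...."] | |
| model_outputs = classifier(prompt) | |
| # One entry per input prompt; print the label scores for the first prompt | |
| print(model_outputs[0]) | |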
| ``` | |
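With `top_k=None`, the pipeline returns, for each prompt, a list of `{"label", "score"}` dictionaries. The sketch below shows one way to turn that list into a verdict. The `interpret` helper and the 0.5 threshold are hypothetical, and the label string `LABEL_1` is an assumption: it is the Transformers default when a model config defines no custom label names, so check the actual output before relying on it.

```python
# Assumed label name for the phishing class (the model card only says "Label 1").
PHISHING_LABEL = "LABEL_1"


def interpret(scores, threshold=0.5):
    """Map one prompt's list of {label, score} dicts to (verdict, phishing_score)."""
    by_label = {item["label"]: item["score"] for item in scores}
    phishing_score = by_label.get(PHISHING_LABEL, 0.0)
    verdict = "phishing" if phishing_score >= threshold else "safe"
    return verdict, phishing_score


# Hand-written example in the pipeline's output shape, not real model output.
example = [
    {"label": "LABEL_1", "score": 0.9},
    {"label": "LABEL_0", "score": 0.1},
]
print(interpret(example))
```

In practice, `interpret(model_outputs[0])` would be applied to each entry returned by the classifier. Raising the threshold trades missed phishing prompts for fewer false alarms.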
| ### Results | |
| The model achieved an accuracy of 96% and an F1-score of 0.96 on test sets drawn from different distributions. | |