---
language: en
library_name: pytorch
license: mit
pipeline_tag: text-classification
tags:
- pytorch
- multitask
- ai-detection
---

# SuaveAI Detection Multitask Model V1

This repository contains a custom PyTorch multitask model checkpoint and auxiliary files.

The notebook used to train this model is available on Kaggle: https://www.kaggle.com/code/julienserbanescu/suaveai

## Files

- `multitask_model.pth`: model checkpoint weights
- `label_encoder.pkl`: label encoder used to map predictions to labels
- `tok.txt`: tokenizer/vocabulary artifact used during preprocessing

## Important

This is a **custom PyTorch checkpoint**, not a native Transformers `AutoModel` package.
This repo includes Hugging Face custom-code files so it can be loaded from the Hub with
`trust_remote_code=True`.

## Load from Hugging Face Hub

```python
import torch
from transformers import AutoModel, AutoTokenizer

repo_id = "DaJulster/SuaveAI-Dectection-Multitask-Model-V1"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
model.eval()

text = "This is a sample input"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

binary_logits = outputs.logits_binary
multiclass_logits = outputs.logits_multiclass
```

Binary prediction (human vs. AI) uses `logits_binary`, and AI-model classification uses `logits_multiclass`.
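
The raw logits can be decoded into labels along these lines. This is a sketch, not part of the checkpoint: it assumes the binary head emits two logits and that `class_names` stands in for the classes stored in `label_encoder.pkl` (e.g. a scikit-learn `LabelEncoder`'s `classes_`); dummy logits are used in place of real model outputs.

```python
import torch
import torch.nn.functional as F

def decode_outputs(binary_logits, multiclass_logits, class_names):
    """Turn raw logits into human-readable predictions (illustrative)."""
    # Binary head: argmax over the two logits picks human (0) vs. AI (1).
    is_ai = binary_logits.argmax(dim=-1)
    ai_prob = F.softmax(binary_logits, dim=-1)[:, 1]

    # Multiclass head: softmax over candidate AI models, then map to names.
    probs = F.softmax(multiclass_logits, dim=-1)
    idx = probs.argmax(dim=-1)
    labels = [class_names[i] for i in idx.tolist()]
    return is_ai.tolist(), ai_prob.tolist(), labels

# Dummy logits for a batch of one, with three hypothetical class names.
bin_logits = torch.tensor([[0.2, 1.5]])
mc_logits = torch.tensor([[0.1, 2.0, -0.3]])
flags, probs, names = decode_outputs(bin_logits, mc_logits, ["gpt", "llama", "claude"])
```

With real outputs, pass `outputs.logits_binary`, `outputs.logits_multiclass`, and `label_encoder.classes_` instead of the dummies.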

## Quick start

```python
import torch
import pickle

# 1) Recreate your model class exactly as in training
# from model_def import MultiTaskModel
# model = MultiTaskModel(...)

model = ...  # instantiate your model architecture
state = torch.load("multitask_model.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()

with open("label_encoder.pkl", "rb") as f:
    label_encoder = pickle.load(f)

with open("tok.txt", "r", encoding="utf-8") as f:
    tokenizer_artifact = f.read()

# Run your preprocessing + inference pipeline here
```
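
The checkpoint can only be restored into a class whose layers match the training notebook exactly. As a rough illustration of the shape such a class takes, a two-head multitask model might look like the sketch below; the encoder type, layer sizes, and vocabulary here are placeholders, not the actual architecture.

```python
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    """Illustrative two-head architecture; the real class from the
    training notebook may differ in encoder, sizes, and head shapes."""

    def __init__(self, vocab_size=30522, hidden=128, num_ai_models=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.binary_head = nn.Linear(hidden, 2)                   # human vs. AI
        self.multiclass_head = nn.Linear(hidden, num_ai_models)   # which AI model

    def forward(self, input_ids):
        x = self.embed(input_ids)
        _, h = self.encoder(x)   # final hidden state: (num_layers, batch, hidden)
        pooled = h[-1]
        return self.binary_head(pooled), self.multiclass_head(pooled)

model = MultiTaskModel()
logits_binary, logits_multiclass = model(torch.randint(0, 100, (1, 16)))
```

If the real class matches the saved weights, `model.load_state_dict(state)` will succeed; a mismatch raises an error listing the missing or unexpected keys.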

## Intended use

- Multitask AI detection inference in your custom pipeline.

## Limitations

- Requires a matching model definition and preprocessing pipeline.
- Not plug-and-play with `transformers.AutoModel.from_pretrained` unless `trust_remote_code=True` is passed.