| | --- |
| | base_model: |
| | - allenai/longformer-base-4096 |
| | datasets: |
| | - wangkevin02/LMSYS-USP |
| | language: |
| | - en |
| | license: mit |
| | metrics: |
| | - accuracy |
| | pipeline_tag: text-classification |
| | library_name: transformers |
| | --- |
| | |
| | # AI Detect Model |
| |
|
| | ## Model Description |
| |
|
| | > **GitHub repository** for exploring the source code and additional resources: https://github.com/wangkevin02/USP |
| |
|
| | The **AI Detect Model** is a binary classification model designed to determine whether a given text is AI-generated (label=1) or written by a human (label=0). This model plays a crucial role in providing AI detection rewards, helping to prevent reward hacking during Reinforcement Learning with Cycle Consistency (RLCC). For more details, please refer to [our paper](https://arxiv.org/pdf/2502.18968). |
| |
|
| | This model is built upon the [Longformer](https://huggingface.co/allenai/longformer-base-4096) architecture and trained using our proprietary [LMSYS-USP](https://huggingface.co/datasets/wangkevin02/LMSYS-USP) dataset. Specifically, in a dialogue context, texts generated by the assistant are labeled as AI-generated (label=1), while user-generated texts are assigned the opposite label (label=0). |
| |
|
| | > *Note*: Our model is subject to the following constraints: |
| | > |
| | > 1. **Maximum Context Length**: Supports up to **4,096 tokens**. Exceeding this may degrade performance; keep inputs within this limit for best results. |
| | > 2. **Language Limitation**: Optimized for English. Non-English performance may vary due to limited training data. |
| |
|
| |
|
| |
|
| | ## Quick Start |
| |
|
| | You can utilize our AI detection model as demonstrated below: |
| |
|
| | ```python |
| | from transformers import LongformerTokenizer, LongformerForSequenceClassification |
| | import torch |
| | import torch.nn.functional as F |
| | |
| | class AIDetector: |
| | def __init__(self, model_name="allenai/longformer-base-4096", max_length=4096): |
| | """ |
| | Initialize the AIDetector with a pretrained Longformer model and tokenizer. |
| | |
| | Args: |
| | model_name (str): The name or path of the pretrained Longformer model. |
| | max_length (int): The maximum sequence length for tokenization. |
| | """ |
| | self.tokenizer = LongformerTokenizer.from_pretrained(model_name) |
| | self.model = LongformerForSequenceClassification.from_pretrained(model_name) |
| | self.model.eval() |
| | self.max_length = max_length |
| | self.tokenizer.padding_side = "right" |
| | |
| | @torch.no_grad() |
| | def get_probability(self, texts): |
| | inputs = self.tokenizer(texts, padding=True, truncation=True, max_length=self.max_length, return_tensors='pt') |
| | outputs = self.model(**inputs) |
| | probabilities = F.softmax(outputs.logits, dim=1) |
| | return probabilities |
| | |
| | # Example usage |
| | if __name__ == "__main__": |
| | classifier = AIDetector(model_name="/path/to/ai_detector") |
| | target_text = [ |
| | "I am thinking about going away for vacation", |
| | "How can I help you today?" |
| | ] |
| | result = classifier.get_probability(target_text) |
| | print(result) |
| | # >>> Expected Output: |
| | # >>> tensor([[0.9954, 0.0046], |
| | # >>> [0.0265, 0.9735]]) |
| | ``` |
| |
|
| |
|
| |
|
| | ## Citation |
| |
|
| | If you find this model useful, please cite: |
| |
|
| | ```plaintext |
| | @misc{wang2025knowbettermodelinghumanlike, |
| | title={Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles}, |
| | author={Kuang Wang and Xianfei Li and Shenghao Yang and Li Zhou and Feng Jiang and Haizhou Li}, |
| | year={2025}, |
| | eprint={2502.18968}, |
| | archivePrefix={arXiv}, |
| | primaryClass={cs.CL}, |
| | url={https://arxiv.org/abs/2502.18968}, |
| | } |
| | ``` |