wangkevin02
/

AI_Detect_Model

Text Classification

Model card Files Files and versions

AI_Detect_Model / README.md

wangkevin02's picture

Update model card for AI Detect Model (#1)

dc66ed3 verified 12 months ago

|

history blame contribute delete

3.55 kB

	---
	base_model:
	- allenai/longformer-base-4096
	datasets:
	- wangkevin02/LMSYS-USP
	language:
	- en
	license: mit
	metrics:
	- accuracy
	pipeline_tag: text-classification
	library_name: transformers
	---

	# AI Detect Model

	## Model Description

	> GitHub repository for exploring the source code and additional resources: https://github.com/wangkevin02/USP

	The AI Detect Model is a binary classification model designed to determine whether a given text is AI-generated (label=1) or written by a human (label=0). This model plays a crucial role in providing AI detection rewards, helping to prevent reward hacking during Reinforcement Learning with Cycle Consistency (RLCC). For more details, please refer to [our paper](https://arxiv.org/pdf/2502.18968).

	This model is built upon the [Longformer](https://huggingface.co/allenai/longformer-base-4096) architecture and trained using our proprietary [LMSYS-USP](https://huggingface.co/datasets/wangkevin02/LMSYS-USP) dataset. Specifically, in a dialogue context, texts generated by the assistant are labeled as AI-generated (label=1), while user-generated texts are assigned the opposite label (label=0).

	> Note: Our model is subject to the following constraints:
	>
	> 1. Maximum Context Length: Supports up to 4,096 tokens. Exceeding this may degrade performance; keep inputs within this limit for best results.
	> 2. Language Limitation: Optimized for English. Non-English performance may vary due to limited training data.



	## Quick Start

	You can utilize our AI detection model as demonstrated below:

	```python
	from transformers import LongformerTokenizer, LongformerForSequenceClassification
	import torch
	import torch.nn.functional as F

	class AIDetector:
	def __init__(self, model_name="allenai/longformer-base-4096", max_length=4096):
	"""
	Initialize the AIDetector with a pretrained Longformer model and tokenizer.

	Args:
	model_name (str): The name or path of the pretrained Longformer model.
	max_length (int): The maximum sequence length for tokenization.
	"""
	self.tokenizer = LongformerTokenizer.from_pretrained(model_name)
	self.model = LongformerForSequenceClassification.from_pretrained(model_name)
	self.model.eval()
	self.max_length = max_length
	self.tokenizer.padding_side = "right"

	@torch.no_grad()
	def get_probability(self, texts):
	inputs = self.tokenizer(texts, padding=True, truncation=True, max_length=self.max_length, return_tensors='pt')
	outputs = self.model(**inputs)
	probabilities = F.softmax(outputs.logits, dim=1)
	return probabilities

	# Example usage
	if __name__ == "__main__":
	classifier = AIDetector(model_name="/path/to/ai_detector")
	target_text = [
	"I am thinking about going away for vacation",
	"How can I help you today?"
	]
	result = classifier.get_probability(target_text)
	print(result)
	# >>> Expected Output:
	# >>> tensor([[0.9954, 0.0046],
	# >>> [0.0265, 0.9735]])
	```



	## Citation

	If you find this model useful, please cite:

	```plaintext
	@misc{wang2025knowbettermodelinghumanlike,
	title={Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles},
	author={Kuang Wang and Xianfei Li and Shenghao Yang and Li Zhou and Feng Jiang and Haizhou Li},
	year={2025},
	eprint={2502.18968},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2502.18968},
	}
	```