Libraries

How to use nvidia/Privasis-Cleaner-0.6B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="nvidia/Privasis-Cleaner-0.6B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("nvidia/Privasis-Cleaner-0.6B")
model = AutoModelForCausalLM.from_pretrained("nvidia/Privasis-Cleaner-0.6B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use nvidia/Privasis-Cleaner-0.6B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "nvidia/Privasis-Cleaner-0.6B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nvidia/Privasis-Cleaner-0.6B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/nvidia/Privasis-Cleaner-0.6B

SGLang

How to use nvidia/Privasis-Cleaner-0.6B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "nvidia/Privasis-Cleaner-0.6B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nvidia/Privasis-Cleaner-0.6B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "nvidia/Privasis-Cleaner-0.6B" \
        --host 0.0.0.0 \
        --port 30000

Privasis-Cleaner-0.6B Overview

Description:

Privasis-Cleaner-0.6B is a lightweight text-sanitization model designed to remove or abstract sensitive information from text according to a user-provided sanitization instruction. Given raw text and an instruction specifying which categories of information to sanitize (e.g., names, dates, locations, identifiers), the model outputs a cleaned and compliant version of the text. The model is built on Qwen3 0.6B Instruct and fine-tuned on about 37K (36,723) instruction–input–output triplets.

This model is for research and non-commercial use only.

License/Terms of Use:

NVIDIA License (Non-Commercial)

Deployment Geography:

Global

Use Case:

Intended for data engineers, ML practitioners, and organizations that handle sensitive text. Typical uses include automatic redaction of PII/PHI, preprocessing for privacy-preserving research, content sanitization, and compliance pipelines (GDPR, HIPAA, etc.).

Release Date:

GitHub: June 29th
Hugging Face: June 29th

Reference(s):

Privasis: Synthesizing the Largest “Public” Private Dataset from Scratch

Model Architecture:

Architecture Type: Decoder-only Transformer with attention mechanisms, built on the Qwen3 0.6B model
Number of model parameters: 0.6B

The model was trained with supervised fine-tuning (SFT) on a Qwen3 0.6B base and is optimized for text sanitization according to a user-specified instruction.

Input:

Input Type(s): Text
Input Format(s): String
Input Parameters: 1D Sequence
Other Properties Related to Input: Text input, up to 262,144 tokens (including restrictions).

Output:

Output Type(s): Text
Output Format: String
Output Parameters: 1D
Other Properties Related to Output: None Applicable

Software Integration:

Runtime Engine(s): Privasis-Cleaner-0.6B
Supported Hardware Microarchitecture Compatibility: NVIDIA H100-80GB, NVIDIA A100
[Preferred/Supported] Operating System(s): Linux

The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.
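
As one example of a unit-level check on use-case-specific data, the sketch below flags known sensitive strings that survive sanitization. It is a minimal illustration only: the function name `find_leaks` is hypothetical, the `sanitized` string is a hand-written stand-in rather than real model output, and substring matching will miss paraphrased or partially abstracted leaks.

```python
from typing import Iterable, List


def find_leaks(sanitized: str, sensitive_terms: Iterable[str]) -> List[str]:
    """Return the sensitive terms that still appear (case-insensitively) in the output."""
    lowered = sanitized.lower()
    return [term for term in sensitive_terms if term.lower() in lowered]


# Hand-written stand-in for a model output (assumption, not a real generation).
sanitized = "On [DATE], the patient visited the clinic in Boston for a follow-up."
leaks = find_leaks(sanitized, ["Jane Doe", "March 3, 2021", "Boston"])
print(leaks)  # ['Boston']
```

A check like this would typically run over a held-out set of texts with known sensitive spans, alongside manual review.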

How to Use:

Privasis-Cleaner takes a sanitization instruction (which categories of information to remove or abstract) together with the raw text, and returns the sanitized text. The model is prompted with a single user turn in the following format (matching the Privasis benchmark code):

**Sanitization Instruction:**
{instruction}
Do not output any explanation or other comment than the sanitized text.

**Text to sanitize:**
{text}

**Sanitized Text:**
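
The template above can be wrapped in a small helper so that every call builds the prompt the same way. This is a minimal sketch; the names `build_prompt` and `FIXED_SUFFIX` are assumptions made for illustration and are not taken from the Privasis benchmark code.

```python
FIXED_SUFFIX = "Do not output any explanation or other comment than the sanitized text."


def build_prompt(instruction: str, text: str) -> str:
    """Fill the single-turn sanitization template with an instruction and raw text."""
    return (
        f"**Sanitization Instruction:**\n{instruction}\n"
        f"{FIXED_SUFFIX}\n\n"
        f"**Text to sanitize:**\n{text}\n\n"
        "**Sanitized Text:**"
    )


prompt = build_prompt("Remove all person names.", "Jane Doe called the office.")
print(prompt)
```

The returned string is meant to be sent as the content of a single user message.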

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Privasis-Cleaner-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

instruction = "Remove all person names, exact dates, and exact locations."
text = "On March 3, 2021, Jane Doe visited the clinic in Boston for a follow-up."

prompt = (
    f"**Sanitization Instruction:**\n{instruction}\n"
    "Do not output any explanation or other comment than the sanitized text.\n\n"
    f"**Text to sanitize:**\n{text}\n\n"
    "**Sanitized Text:**"
)

inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    enable_thinking=False,  # emit the sanitized text directly
    return_dict=True,  # return input_ids and attention_mask together
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=4096, do_sample=False)
# Decode only the newly generated tokens, not the prompt
prompt_len = inputs["input_ids"].shape[-1]
response = tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)

# The model may echo the "Sanitized Text:" header — strip it if present
if "Sanitized Text:" in response:
    response = response.split("Sanitized Text:")[-1]
print(response.strip())

vLLM (OpenAI-compatible server)

Serve the model:

vllm serve nvidia/Privasis-Cleaner-0.6B --port 8000

Then query it:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

instruction = "Remove all person names, exact dates, and exact locations."
text = "On March 3, 2021, Jane Doe visited the clinic in Boston for a follow-up."

prompt = (
    f"**Sanitization Instruction:**\n{instruction}\n"
    "Do not output any explanation or other comment than the sanitized text.\n\n"
    f"**Text to sanitize:**\n{text}\n\n"
    "**Sanitized Text:**"
)

resp = client.chat.completions.create(
    model="nvidia/Privasis-Cleaner-0.6B",
    messages=[{"role": "user", "content": prompt}],
    temperature=0.0,
    max_tokens=4096,
)
print(resp.choices[0].message.content.strip())
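
The Transformers example strips an echoed "Sanitized Text:" header, while the server example prints the response as-is. A small post-processing helper can apply the same cleanup to server responses. This is a sketch under assumptions: the name `strip_echoed_header` is hypothetical, and removing a leading `**` assumes the sanitized text itself does not legitimately start with those characters.

```python
def strip_echoed_header(response: str, header: str = "Sanitized Text:") -> str:
    """Return the text after the last echoed header, without leftover bold markers."""
    if header in response:
        response = response.split(header)[-1]
    # The prompt writes the header as **Sanitized Text:**, so a closing "**" may remain.
    return response.strip().removeprefix("**").strip()  # str.removeprefix needs Python 3.9+


echoed = "**Sanitized Text:**\nOn [DATE], [NAME] visited the clinic."
print(strip_echoed_header(echoed))
```

It could be applied to `resp.choices[0].message.content` before the result is stored or printed.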

Check out the Privasis benchmark for evaluation.

Model Version(s):

Privasis-Cleaner-0.6B

The Privasis-Cleaner-0.6B model can be integrated into an AI system via API calls, accepting natural-language instructions and raw text as input, and returning sanitized text as output, suitable for data pipelines requiring automated text sanitization.
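
A pipeline integration might look like the sketch below, which applies one sanitization instruction to a field of every record in a stream. All names here (`sanitize_records`, `fake_sanitize`, the `text` field) are assumptions for illustration; in practice the `sanitize` callable would wrap an API call to a served model, and the stand-in is used only so the sketch runs without a server.

```python
from typing import Callable, Dict, Iterable, Iterator


def sanitize_records(
    records: Iterable[Dict[str, str]],
    instruction: str,
    sanitize: Callable[[str, str], str],
    field: str = "text",
) -> Iterator[Dict[str, str]]:
    """Yield a copy of each record with `field` replaced by sanitized text."""
    for record in records:
        yield {**record, field: sanitize(instruction, record[field])}


def fake_sanitize(instruction: str, text: str) -> str:
    """Stand-in for a real API call; always returns a fixed placeholder."""
    return "[REDACTED]"


rows = [{"id": "1", "text": "Jane Doe visited Boston."}]
cleaned = list(sanitize_records(rows, "Remove all person names.", fake_sanitize))
print(cleaned)
```

Passing the model call in as a function keeps the pipeline step independent of the serving backend and leaves the input records unmodified.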

Training, Testing, and Evaluation Datasets:

Training Dataset:

Link: Not Specified

Data Modality: Text

Audio Training Data Size (If Applicable): Not Applicable

Image Training Data Size (If Applicable): Not Applicable

Text Training Data Size (If Applicable): Less than a Billion Tokens

Video Training Data Size (If Applicable): Not Applicable

Non-Audio, Image, Text Training Data Size (If Applicable): Not Applicable

Data Collection Method by dataset: Synthetic

Labeling Method by dataset: Synthetic

Properties (Quantity, Dataset Descriptions, Sensor(s)): 36,723 text-based triplets (text, sanitization instruction, sanitized text); Non-sensitive public and internally generated synthetic text; No personal data, copyright-protected, or IoT/synthetic data mentioned; Linguistic characteristics not specified; No specific sensor type mentioned

Dataset License(s): Governing term is CC-BY-NC, but each subset follows the generator models' original license.

Testing Dataset:

Link: Not Specified

Data Collection Method by dataset: Synthetic

Labeling Method by dataset: Synthetic

Properties (Quantity, Dataset Descriptions, Sensor(s)): 3,041 text-based triplets (text, sanitization instruction, sanitized text); Non-sensitive public and internally generated synthetic text; No personal data, copyright-protected, or IoT/synthetic data mentioned; Linguistic characteristics not specified; No specific sensor type mentioned

Dataset License(s): Governing term is CC-BY-NC, but each subset follows the generator models' original license.

Inference:

Acceleration Engine: vLLM

Test Hardware: GPU (NVIDIA H100)

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards.

Please report model quality, risk, security vulnerabilities, or concerns at https://qwen3.ai/support/report.

Generated by NVIDIA Model Card Generator Toolkit.