# OpenAI Privacy Filter MLX 8-bit
This repository contains an 8-bit OpenMed MLX artifact for openai/privacy-filter, packaged for local PII detection on Apple Silicon with OpenMed.
OpenAI Privacy Filter is a bidirectional token-classification model for detecting personally identifiable information in text. This OpenMed MLX build keeps the original BIOES token-label head, uses the o200k_base tokenizer assets, and runs with OpenMed's Python and Swift MLX runtimes.
After the model is downloaded once, inference runs locally. No document text is sent to a server.
## Model Details
- Source checkpoint: `openai/privacy-filter`
- OpenMed MLX family: `openai-privacy-filter`
- Task: token classification for privacy span detection
- Weight format: `weights.safetensors`
- Quantization: 8-bit affine quantization, group size 64
- Runtime: OpenMed + MLX on Apple Silicon
- Tokenizer: `o200k_base/` tiktoken-style BPE
- Labels: `account_number`, `private_address`, `private_date`, `private_email`, `private_person`, `private_phone`, `private_url`, `secret`
This artifact uses expert-aware MLX quantization: embeddings, attention projections, MoE gates, sparse-MoE expert tensors, and the token-classification head are all stored in 8-bit packed form. The resulting weights.safetensors file is about 1.39 GiB, compared with about 2.61 GiB for the BF16 OpenMed MLX artifact.
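To make the "8-bit affine, group size 64" description concrete, here is a toy round-trip sketch with made-up values. It illustrates the idea (each group of 64 weights shares a scale and offset, and each weight is stored as an 8-bit code), not MLX's actual packed layout or kernels:

```python
import numpy as np

GROUP = 64  # group size used by this artifact

def dequantize(q, scales, mins, group=GROUP):
    """Recover floats from 8-bit codes: w ~= q * scale + min, per group of 64."""
    rows, cols = q.shape
    qg = q.reshape(rows, cols // group, group).astype(np.float32)
    return (qg * scales[..., None] + mins[..., None]).reshape(rows, cols)

# Quantize a toy weight row the same way, then round-trip it.
rng = np.random.default_rng(0)
w = rng.standard_normal((1, 2 * GROUP)).astype(np.float32)
g = w.reshape(1, 2, GROUP)
mins = g.min(axis=-1)
scales = (g.max(axis=-1) - mins) / 255.0  # 256 levels for 8 bits
q = np.round((g - mins[..., None]) / scales[..., None]).clip(0, 255)
w_hat = dequantize(q.reshape(1, 2 * GROUP), scales, mins)
print(float(np.abs(w - w_hat).max()))  # reconstruction error bounded by scale / 2
```

Smaller groups track local weight ranges more tightly at the cost of more per-group metadata; group size 64 is a common middle ground.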
## Quick Start: Python
```bash
pip install -U openmed "openmed[mlx]"
```
```python
from huggingface_hub import snapshot_download
from openmed.mlx.inference import create_mlx_pipeline

# Download the model once; subsequent runs use the local cache.
model_path = snapshot_download("OpenMed/privacy-filter-mlx-8bit")
pipe = create_mlx_pipeline(model_path)

text = "My name is Alice Smith and my email is alice.smith@example.com."
entities = pipe(text)
for entity in entities:
    print(entity)
```
Example output:

```python
{
    "entity_group": "private_person",
    "word": "Alice Smith",
    "start": 11,
    "end": 22,
    "score": 0.9999,
}
{
    "entity_group": "private_email",
    "word": "alice.smith@example.com",
    "start": 39,
    "end": 62,
    "score": 0.9998,
}
```
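The `start`/`end` character offsets make redaction straightforward. A minimal sketch, with the `entities` list hard-coded to match the example output above rather than produced by the pipeline:

```python
def redact(text, entities, mask="[{label}]"):
    """Replace detected spans with a label placeholder.

    Spans are applied right to left so earlier offsets stay valid
    as the string length changes.
    """
    out = text
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        out = out[: ent["start"]] + mask.format(label=ent["entity_group"]) + out[ent["end"] :]
    return out

text = "My name is Alice Smith and my email is alice.smith@example.com."
entities = [
    {"entity_group": "private_person", "start": 11, "end": 22},
    {"entity_group": "private_email", "start": 39, "end": 62},
]
print(redact(text, entities))
# My name is [private_person] and my email is [private_email].
```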
## Quick Start: Swift and Apple Apps
Add OpenMedKit to your Xcode project:

- Open Xcode and choose File > Add Package Dependencies.
- Paste `https://github.com/maziyarpanahi/openmed`.
- Select the `OpenMedKit` package product.
- Download and cache the MLX model once, then run inference locally.
```swift
import OpenMedKit

// Download the model once; subsequent runs use the local cache.
let modelURL = try await OpenMedModelStore.downloadMLXModel(
    repoID: "OpenMed/privacy-filter-mlx-8bit"
)
let openmed = try OpenMed(backend: .mlx(modelDirectoryURL: modelURL))

let entities = try openmed.extractPII(
    "My name is Alice Smith and my email is alice.smith@example.com."
)
for entity in entities {
    print(entity.text, entity.label, entity.score)
}
```
For iOS, run on Apple Silicon hardware; the iOS Simulator is not a recommended target for MLX inference.
## Validation
The 8-bit artifact was validated against the unquantized OpenMed MLX artifact with fixed text samples. BF16 and Q8 returned identical grouped spans for person, date, phone, email, address, and account-number examples.
OpenMed also includes unit tests for:
- q8 artifact loading
- quantization metadata decoding
- expert tensor packing and `.scales` coverage
- finite logits from the q8 runtime
- bf16/q8 shape and argmax-label coherence
- BIOES/Viterbi span decoding
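For reference, BIOES decoding groups per-token tags into labeled spans. An illustrative decoder (not OpenMed's implementation, which also applies Viterbi transition constraints before grouping):

```python
def decode_bioes(tags):
    """Group BIOES tags into (label, start_token, end_token) spans.

    B = begin, I = inside, E = end, S = single token, O = outside.
    Malformed sequences (e.g. an E without a matching B) are dropped.
    """
    spans = []
    start = label = None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = label = None
            continue
        prefix, _, lab = tag.partition("-")
        if prefix == "S":
            spans.append((lab, i, i))
            start = label = None
        elif prefix == "B":
            start, label = i, lab
        elif prefix == "E" and label == lab and start is not None:
            spans.append((lab, start, i))
            start = label = None
        # "I" with a matching open span simply continues it
    return spans

tags = ["O", "B-private_person", "E-private_person", "O", "S-private_email"]
print(decode_bioes(tags))
# [('private_person', 1, 2), ('private_email', 4, 4)]
```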
## Intended Use
Use this model for local privacy filtering, PII detection, redaction workflows, and evaluation on Apple devices. For high-risk domains such as healthcare, legal, finance, education, and government, evaluate against your own data and policy requirements before production use.
## Credits

- Base checkpoint: `openai/privacy-filter`
- MLX conversion and runtime support: OpenMed
- OpenMed website: https://openmed.life