Michael Anthony PRO
MikeDoes
AI & ML interests
Privacy, Large Language Models, Explainable AI
Recent Activity
posted an update about 9 hours ago
Are you sure the open-source model you just downloaded is safe?
A recent paper on "Privacy Backdoors" describes a new vulnerability in which pre-trained models can be poisoned before they are fine-tuned. This is a serious challenge for everyone building on open-source AI.
Instead of just pointing out problems, we believe in finding better solutions. To study this threat, the researchers needed realistic data that could simulate a high-stakes privacy attack, and we're proud that our Ai4Privacy dataset was used to provide that benchmark. The paper reports that on our dataset, privacy leakage from a non-poisoned model was almost zero; after the backdoor attack, the reported figure jumped to 87%.
Built from synthetic identities, the dataset gave the researchers a realistic benchmark and helped them demonstrate how a poisoned model can dramatically amplify privacy leakage.
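For readers who want a concrete sense of what "privacy leakage" means here, below is a minimal, hypothetical sketch of a loss-based membership-inference check, one of the simplest leakage proxies. The checkpoint names, records, and threshold are placeholders we made up for illustration; the paper's actual poisoning attack is far more sophisticated than this.

```python
# Minimal sketch (not the paper's method) of a loss-based membership-inference
# check. Lower loss on a record suggests the model saw it during fine-tuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def per_example_loss(model, tokenizer, text):
    """Average token-level cross-entropy the model assigns to `text`."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()

def leakage_score(model_name, members, non_members, threshold=2.0):
    """Fraction of member records scored below the (assumed) loss threshold,
    minus the false-positive rate on non-members -- a crude leakage proxy."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).eval()
    hits = sum(per_example_loss(model, tok, t) < threshold for t in members)
    fps = sum(per_example_loss(model, tok, t) < threshold for t in non_members)
    return hits / len(members) - fps / len(non_members)

# Hypothetical usage: compare a clean vs. a backdoored pre-trained checkpoint
# after both are fine-tuned on the same PII-bearing records.
# print(leakage_score("org/clean-ft", members, non_members))
# print(leakage_score("org/poisoned-ft", members, non_members))
```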
This is why we champion open source: it enables the community to identify these issues and develop better, safer solutions together.
Kudos to the authors Yuxin Wen, Leo Marchyok, Sanghyun Hong, Jonas Geiping, Tom Goldstein, and Nicholas Carlini from the University of Maryland and Google DeepMind.
🔗 Read the research to understand this new challenge: https://arxiv.org/pdf/2404.01231
🚀 Stay updated on the latest in privacy-preserving AI—follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/
reacted to their post with 👀 about 20 hours ago
A single lock on a door isn't enough. Real security is about layers.
The same is true for AI privacy. A new paper, "Whispered Tuning", offers a fantastic layered solution that aims to fortify LLMs against privacy infringements.
We're proud that the first, essential layer, a high-precision PII redaction model, was built on the foundation of the Ai4Privacy/pii-65k dataset.
Our dataset provided the training material for their initial anonymization step, which in turn enabled further layers such as differential-privacy fine-tuning and output filtering. This is a win-win: our data helps create a solid base, and researchers build powerful, multi-stage privacy architectures on top of it.
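To make the layering concrete, here is a minimal sketch of the idea (not the paper's implementation): redact PII before training and filter model outputs afterward, with differential-privacy fine-tuning as the middle layer. The regex patterns and example strings are placeholders; a production first layer would be a learned redaction model trained on a PII dataset such as ours.

```python
# Minimal sketch of a layered privacy pipeline: redact -> (DP fine-tune) -> filter.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Layer 1: replace detected PII spans with typed placeholders."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

def filter_output(generated: str) -> str:
    """Layer 3: last line of defense -- scrub anything PII-like the model emits."""
    return redact(generated)

# Layer 2 (not shown): fine-tune on the redacted corpus with differential privacy,
# e.g. DP-SGD via a library such as Opacus, so individual records stay protected.

training_corpus = [redact(doc) for doc in ["Contact Jane at jane@example.com"]]
print(training_corpus)                                    # ['Contact Jane at [EMAIL]']
print(filter_output("Call +1 (555) 123-4567 tomorrow"))   # 'Call [PHONE] tomorrow'
```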
Together, we're making AI safer.
🔗 Read the full paper to see how a strong foundation enables a complete privacy solution: https://www.scirp.org/journal/paperinformation?paperid=130659
🚀 Stay updated on the latest in privacy-preserving AI—follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/
reacted to their post with 🧠 about 20 hours ago