The creator of an open-source AI model can steal your private fine-tuning data

What happened

Researchers found that the original creators of open-source AI models can steal the private data used to fine-tune them. This means companies using these models for specific tasks might have their proprietary information extracted by the model's original developer.

Why it matters

Companies have assumed that fine-tuning an open-source AI model with their own data keeps that data private. This paper shows that the original model creator can easily extract that proprietary information, even with limited access to the fine-tuned model. This means the privacy guarantees for using open-source AI models are weaker than many believed.

The signal

Watch for open-source AI model licenses to start including explicit clauses about data privacy or for new security standards to emerge for fine-tuning practices.