The world is being quietly rearranged by people who write very long documents.


The title they went with Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen! Noisy translates that to

The creator of an open-source AI model can steal your private fine-tuning data


Researchers found that the original creators of open-source AI models can steal the private data used to fine-tune them. This means companies using these models for specific tasks might have their proprietary information extracted by the model's original developer.
Companies have assumed that fine-tuning an open-source AI model with their own data keeps that data private. This paper shows that the original model creator can easily extract that proprietary information, even with limited access to the fine-tuned model. This means the privacy guarantees for using open-source AI models are weaker than many believed.
Watch for open-source AI model licenses to start including explicit clauses about data privacy or for new security standards to emerge for fine-tuning practices.

If you insist
Read the original →