EDPB Adopted Opinion on Personal Data in Developing and Deploying AI Models
25 December 2024
Last week, the European Data Protection Board (“EDPB”) adopted an opinion on the use of personal data for the development and deployment of AI models.
The opinion, which was requested by the Irish Data Protection Authority, looks at three issues related to privacy and AI:
- When and how AI models can be considered anonymous;
- Whether and how legitimate interest can be used as a legal basis for developing or using AI models; and
- What happens if an AI model is developed using personal data that was processed unlawfully.
The following is a short summary of the key points made by the EDPB on each of these issues:
Anonymity
The EDPB opinion provides guidelines on when AI models trained on personal data will be considered anonymous – a matter which should be assessed, based on specific criteria, on a case-by-case basis.
First, the EDPB states that models designed to provide personal data (as output) regarding individuals whose personal data was used to train the model (as input) will not be considered anonymous. Examples include generative models fine-tuned on an individual’s voice recordings to mimic their voice, or any model designed to reply with personal data from its training set when prompted for information about a specific person.
However, even if that is not the case, the AI model is not necessarily rendered anonymous, since personal data may still remain “absorbed” in the parameters of the model and may be obtained from it. The EDPB considers that, for an AI model to be considered anonymous, both (i) the likelihood of direct or indirect extraction of personal data regarding individuals whose personal data was used to train the model and (ii) the likelihood of obtaining, intentionally or not, such personal data from queries, by using “means reasonably likely to be used” by the controller or a third party, should be insignificant.
Relying on Legitimate Interest
The EDPB offers guidance for when “legitimate interest” can be used as a legal basis for the development of AI models. To rely on legitimate interest, the AI model developer must conduct a three-step assessment (as further elaborated in a more general context in the EDPB’s guidelines on this legal basis) and establish that the following cumulative conditions are met:
- The pursuit of a legitimate interest by the controller or by a third party – the EDPB recognizes several examples that may constitute a legitimate interest in the context of AI models, such as developing the service of a conversational agent to assist users; developing an AI system to detect fraudulent content or behaviour; and improving threat detection in an information system.
- The processing is necessary to pursue the legitimate interest (the necessity test) – the developer should consider elements such as the amount of personal data processed and whether it is proportionate to pursue the legitimate interest at stake, as well as whether the same purpose can be achieved through an AI model that does not entail the processing of personal data or through less intrusive means.
- The legitimate interest is not overridden by the interests or fundamental rights and freedoms of the data subjects (the balancing test) – this step consists of identifying and describing the different opposing rights and interests at stake – the interests, fundamental rights, and freedoms of the data subjects on one side, and on the other side, the interests of the controller or a third party. Developers should also consider the impact of the processing on the data subjects, which may be influenced by the nature of the data processed by the models, the context of the processing, the further consequences that the processing may have, and the likelihood of these further consequences materializing. Finally, the data subjects’ reasonable expectations should also be considered, which may arise from various factors, including whether or not the personal data was publicly available, the relationship between the data subject and the controller, the source of the data and method of collection, information provided to the data subject, and more.
Mitigating Measures
When the data subjects’ interests, rights, and freedoms seem to override the legitimate interest being pursued by the controller or a third party, the developer may consider introducing mitigating measures to limit the impact of the processing on these data subjects. The opinion provides a non-exhaustive list of examples for such mitigating measures, including:
- Technical measures such as pseudonymization, data masking, or substituting synthetic (fake) personal data in the training set.
- Measures that facilitate extended options to exercise individuals’ rights, such as allowing data subjects to remove their data from the training data set before processing starts, an unconditional opt-out option, and an extended right to erasure.
- Transparency measures such as releasing public and easily accessible communications that go beyond the information required in a privacy policy and using alternative forms of informing data subjects.
- Measures relating to web scraping such as avoiding collecting data that poses risks to certain individuals or groups, refraining from scraping sensitive or intrusive data categories, respecting exclusion mechanisms like robots.txt or ai.txt, limiting collection scope, and providing opt-out options for data subjects before collection begins.
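To illustrate the exclusion mechanisms mentioned above, a scraper can consult a site’s robots.txt before collecting any pages. The sketch below uses Python’s standard `urllib.robotparser`; the bot name, paths, and robots.txt content are hypothetical, and a real crawler would also need to handle ai.txt and other opt-out signals, which this library does not cover.

```python
from urllib import robotparser

# Hypothetical robots.txt in which the site owner excludes an AI crawler
# from a directory of personal profiles (names and paths are illustrative).
robots_txt = """
User-agent: ExampleAIBot
Disallow: /profiles/

User-agent: *
Allow: /
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(robots_txt)

# The crawler should skip URLs the site owner has excluded for it...
print(rp.can_fetch("ExampleAIBot", "https://example.com/profiles/jane"))  # False
# ...but may fetch pages that are not disallowed for that user agent.
print(rp.can_fetch("ExampleAIBot", "https://example.com/blog/post"))      # True
```

Checking exclusion rules before collection, rather than filtering data afterwards, aligns with the EDPB’s emphasis on limiting the scope of collection at the source.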
The Effect of Unlawful Processing in the Development of an AI Model
The EDPB states that unless the AI model has been duly anonymized (in accordance with the guidelines mentioned above), each controller, whether the developer of the AI model or another controller deploying an AI model after its development, may be responsible for unlawful processing of personal data in the development of the model.
The liability of the deployer (who was not responsible for the processing in the development phase) will depend, to a large extent, on whether or not they have conducted an appropriate assessment to ascertain that the AI model was not developed by unlawfully processing personal data.
Please do not hesitate to reach out to us with any further questions on this opinion and its practical implications for your development or deployment of AI models.