A growing number of companies are using AI chatbots to provide customers with quick and uncomplicated support for product questions. This raises the question of how such a system can be operated in compliance with data protection law and what requirements arise from the AI Regulation (AI Act). A typical application scenario is an AI-enhanced FAQ chatbot obtained from a specialised software provider to answer frequently asked questions on a website automatically and, with the help of AI, in an increasingly targeted manner.
I. Data protection regulations
1. Are “FAQ bots” processing personal data at all?
The data processed in the course of using the AI chatbot will generally be IP addresses, session cookies and any information entered into the chat. Although personal data is, where possible, not collected from users in the first place (keyword: “imposed data”) or is deleted again as quickly as possible, processing within the meaning of Art. 4 No. 2 GDPR must be assumed as soon as such data is nevertheless entered: the link to an identifiable person is not merely theoretical, but follows from the fact that the data subject types their own data into the chat mask. A request not to disclose any personal information considerably reduces the probability that personal data will be collected at all. Complete avoidance, however, is not guaranteed, which is why the existence of personal data must always be assumed in the event of actual input.
Even the technical collection, short-term storage or other use of such input for generating chat responses constitutes a processing operation that is relevant under data protection law. Whether a separate legal basis is required for this is assessed differently in the literature under the heading of “imposed data”. It is plausible either to deny processing within the meaning of Art. 2 para. 1 in conjunction with Art. 4 No. 2 GDPR by way of teleological reduction or – if one negates a voluntary element in the concept of processing – to deny a “decision” on the purposes and means of processing that would establish controllership within the meaning of Art. 4 No. 7 GDPR. According to the prevailing view, no separate legal basis is necessary in this respect.
However, the processing of IP addresses and session cookies must always be justified under data protection law.
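Technically, this justification requirement can be flanked by pseudonymising IP addresses and session identifiers before anything is logged or passed to the AI backend, as the balancing of interests below assumes. The following is a minimal Python sketch of one conceivable approach – a keyed hash with a rotatable secret; the function names and the choice of HMAC-SHA256 are illustrative assumptions, not a statement about any particular chatbot product:

```python
import hmac
import hashlib
import secrets

# Hypothetical per-deployment secret; rotating it regularly breaks
# long-term linkability of the pseudonyms.
PSEUDONYM_KEY = secrets.token_bytes(32)

def pseudonymise(value: str) -> str:
    """Replace an identifier (e.g. IP address, session cookie) with a
    keyed hash so that logs no longer contain the raw value."""
    digest = hmac.new(PSEUDONYM_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

# Only the pseudonyms are stored; the raw IP address is discarded.
log_entry = {
    "client": pseudonymise("203.0.113.42"),
    "session": pseudonymise("session-cookie-value"),
    "event": "chat_started",
}
```

Note that pseudonymised data remains personal data under the GDPR; such a step reduces the intensity of the processing but does not remove the need for a legal basis.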
2. Art. 22 GDPR: The ECJ, AI and the GDPR
With regard to Art. 22 GDPR, the question arises as to whether such AI chatbots make a “decision based solely on automated processing, including profiling” within the meaning of Art. 22 para. 1 GDPR which produces legal effects concerning the user or similarly significantly affects the user. The ECJ stated in Case C-634/21 (judgment of 7 December 2023, “SCHUFA Holding (Scoring)”) that such an automated decision does not exist simply because a probability value based on personal data is calculated purely by machine. The protective effect of Art. 22 para. 1 GDPR applies only if this automated determination produces a legal effect vis-à-vis the data subject or significantly affects them in a similar way.
However, if the AI chatbot is only used to answer general questions or to provide assistance in the context of support requests, although automated processing is technically carried out for this purpose, neither the purpose of the system nor its functionality provides for the chatbot to decide on significant legal consequences for users. A legal effect or a significant impairment of the users within the meaning of Art. 22 GDPR therefore does not occur.
3. Legitimisation according to GDPR
a) Necessity for contract fulfilment only for customers or in the case of clear initiation
Art. 6 para. 1 sentence 1 lit. b GDPR applies (only) if the processing is necessary for the performance of a contract or for the fulfilment of pre-contractual measures taken at the request of the data subject. The prerequisite is therefore that a specific contractual relationship exists between the parties or is to be initiated directly and that the respective personal data is closely related to the intended provision of services.
In the context of AI chatbots for answering questions about products and/or services, a contractual relationship can be assumed if the chatbot is used as a necessary part of an existing or planned contractual relationship, for example where customers require support services and this support is provided via the chatbot. If, on the other hand, only a general information service is provided without any contractual relationship between the user and the respective company, the necessity required by Art. 6 para. 1 sentence 1 lit. b GDPR is lacking. The provision can therefore only apply if the chatbot service directly serves to provide support within existing contractual relationships or to prepare the conclusion of a contract, and the contractual service would not be possible without the inclusion of personal data.
b) Legitimate interests of the website operator
As a rule, reliance on legitimate business interests under Art. 6 para. 1 sentence 1 lit. f GDPR can be considered as a legal basis.
According to Art. 5 para. 1 lit. b GDPR, the purpose must be defined and made clearly recognisable to the data subjects. In the context of the AI chatbot, it must be sufficiently clear to users that their information is being processed for the operation of the chat service. “Blanket formulas” are not sufficient; what is required is brief, clearly understandable information stating that the chat interaction – and, where applicable, the further system logic (e.g. training in anonymised form) – will be used. At the same time, the role of any anonymisation procedures used to deal with imposed data must be made transparent in order to clarify the scope of the data processing for the data subjects.
aa) Improved service is a legitimate goal!
The legitimate interest within the meaning of Art. 6 para. 1 sentence 1 lit. f GDPR in the case of the “FAQ AI chatbots” considered here is to improve product communication and to offer users convenient, automated assistance. This additional channel also serves to increase efficiency, as simple and frequently asked questions no longer have to be answered exclusively by human support staff. The aim of improving the accuracy of the answers and thus the overall service is also a legitimate interest within the meaning of the regulation.
bb) Necessity
The criterion of necessity requires that the data processing is suitable for achieving the legitimate purpose and is limited to what is necessary; there must be no milder means that would be equally effective and less intrusive. In the case of the AI chatbot, data processing is aimed at providing users with automated service information and – for technical reasons – collecting a limited amount of personal data in the process.
Without this basic information, the chatbot could neither assign the respective user enquiry nor display a relevant response in the same session. The use of a session ID is therefore necessary in order to be able to continuously respond to user input. This also represents a milder means compared to extensive user registration or permanent account creation, as no further-reaching profiling is carried out, but only a temporary assignment (session) is made.
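What such a purely session-scoped assignment might look like can be sketched briefly. The following Python fragment is an illustrative assumption – the names, structure and 30-minute lifetime are invented for this example, not taken from any specific product:

```python
import time
import uuid

SESSION_TTL_SECONDS = 30 * 60  # assumed lifetime; expired context is dropped

_sessions: dict[str, dict] = {}

def start_session() -> str:
    """Issue a temporary session ID; no registration or account is needed."""
    session_id = str(uuid.uuid4())
    _sessions[session_id] = {"created": time.time(), "history": []}
    return session_id

def get_history(session_id: str) -> list[str]:
    """Return the chat history for this session, discarding stale sessions."""
    session = _sessions.get(session_id)
    if session is None or time.time() - session["created"] > SESSION_TTL_SECONDS:
        _sessions.pop(session_id, None)  # delete expired context entirely
        return []
    return session["history"]
```

The design carries exactly the “milder means” argument: continuity within one conversation is possible, while nothing survives the session that could be used for profiling.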
With regard to any “imposed data” entered unintentionally or contrary to the instructions (name, date of birth, etc.), immediate anonymisation is a measure that is particularly protective of the data subjects’ interests. It means that personal data is processed for chat management only briefly and only to the extent necessary to provide the chat function. An alternative procedure that rules out any possibility of taking notice of such data is practically impossible to realise, as a chat – by definition – processes user input.
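How immediate anonymisation of imposed data could work in practice can be illustrated with a deliberately simple redaction step applied before a message is stored or reused. The patterns below are rough illustrations; a real deployment would need a considerably more robust recogniser, and which categories count as imposed data depends on the individual case:

```python
import re

# Illustrative patterns only; real PII detection is harder than this.
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[email removed]"),
    (re.compile(r"\b\d{1,2}[./]\d{1,2}[./]\d{2,4}\b"), "[date removed]"),
    (re.compile(r"\+?\d[\d /-]{7,}\d"), "[phone removed]"),
]

def redact(message: str) -> str:
    """Strip obvious personal details from a chat message before it is
    stored or used for anything beyond answering the current request."""
    for pattern, placeholder in PII_PATTERNS:
        message = pattern.sub(placeholder, message)
    return message

print(redact("My email is jane.doe@example.com, born 01.02.1990"))
# -> My email is [email removed], born [date removed]
```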
cc) Consideration: It depends on the specific implementation
The actual lawfulness of processing in accordance with Art. 6 para. 1 sentence 1 lit. f GDPR depends on a careful balancing of interests: The legitimate interest of the controller must be weighed against the need for protection of the data subject. In particular, this balancing of interests is based on whether the AI chatbot disproportionately impairs the privacy of users or whether the service interest outweighs this.
In light of recital 47 GDPR, it must be taken into account whether users could “reasonably” expect this type of data processing. Anyone calling up a chatbot typically expects that technical identification features (such as session data or the IP address) will be used briefly to display an individualised response. Unlike tracking across multiple platforms or comprehensive user profiling, this does not create a scenario in which users become “transparent customers”.
If the company using the chatbot also fulfils its obligations under Art. 13 and 14 GDPR by providing data protection information and a notice before the chat starts – explaining which data is generally collected and that personal information should be avoided wherever possible – this shapes the reasonable expectations of users and makes it more likely that the processing is perceived as adequate support rather than as unauthorised data gathering.
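Such a notice can also be enforced technically rather than merely displayed, for example by keeping the input field disabled until the user has acknowledged it. The following sketch is a hypothetical flow with invented wording; the actual text of the notice would have to be drafted against Art. 13 and 14 GDPR for the concrete deployment:

```python
# Invented notice text; the real wording must match the deployment.
PRE_CHAT_NOTICE = (
    "This chat is answered automatically by an AI system. Technical data "
    "(IP address, session ID) is processed briefly to display responses. "
    "Please do not enter personal information such as names, addresses or "
    "dates of birth. Details: see our privacy policy."
)

def open_chat(user_acknowledged: bool) -> str:
    """Gate the chat behind the notice: no input before acknowledgement."""
    if not user_acknowledged:
        return PRE_CHAT_NOTICE  # input field remains disabled
    return "How can I help you with our products today?"
```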
In addition, this form of interaction not only leads to improved communication with customers and interested parties, but also contributes to a reduction in the conventional support effort and is therefore an expression of the entrepreneurial freedom protected by Art. 16 CFR. The freedom to choose an occupation (Art. 15 para. 1 CFR) also guarantees that companies may choose business and technological strategies in order to offer their services efficiently.
This is offset by users’ fundamental rights to the protection of their personal data under Art. 8 CFR and to the confidentiality of their communications. However, if the IP address and session data are fed into the AI system only in pseudonymised form and any imposed data is anonymised immediately, the intensity of the intrusion remains comparatively low. Limiting data collection and applying comprehensive protection mechanisms reduce the risk of a deeper intrusion into privacy. If users are also clearly advised at the start of the chat not to enter any personal information, this preserves their self-determination and further reduces the severity of the intrusion.
Other factors must also be taken into account, such as the nature of the data subjects, who are typically voluntary website visitors, the sphere affected by the interference, possible obligations of the controller to provide the service and the comprehensive data security measures. Freedom of expression under Art. 11 para. 1 CFR also plays a role insofar as users are free to decide which questions they ask; there is naturally no obligation to use a chatbot.
If this is guaranteed, the responsible company’s interest in automated and effective customer service outweighs the comparatively minor adverse effects for users associated with the very limited data collection. Compliance with the principles of data protection law – data minimisation, privacy by design and the provision of mandatory information in accordance with Art. 13 and 14 GDPR – ensures that the right to the protection of personal data under Art. 8 CFR is impaired as little as possible. The processing is therefore justified under Art. 6 para. 1 sentence 1 lit. f GDPR because the legitimate interest of the controller outweighs the user’s interest in a comprehensive exclusion of data processing in this context.
4. Requirements of the AI Regulation
a) Categorisation of the AI chatbot as an AI system
The first prerequisite for the application of the AI Regulation to the chatbots in question here is that they constitute an AI system within the meaning of the AI Regulation. According to the legal definition in Art. 3 No. 1 of the AI Regulation, an AI system is a
“machine-based system that is designed to operate with varying levels of autonomy and that may exhibit adaptiveness after deployment, and that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments”.
AI chatbots are unquestionably machine-based in this sense. The AI used is also not a system that merely follows patterns or rules defined by natural persons for the automatic execution of operations (see recital 12 of the AI Regulation); rather, it acts independently to a certain extent in its communication with users on the basis of the stored databases and is able to work without human intervention (see also section 5.13 of ISO/IEC 22989:2022 for the corresponding definition and the possible scope of autonomy). When the aforementioned AI models are used, the applications based on them regularly have a sufficient degree of autonomy within the meaning of Art. 3 No. 1 of the AI Regulation.
The system used will also regularly exhibit the adaptiveness required for the applicability of the AI Regulation, which goes beyond simple data processing and enables its own learning, reasoning and modelling processes on the basis of appropriate training. The use of the AI chatbot is aimed precisely at these capabilities: unlike a mere navigation chat on a website, it is meant to improve the accuracy of its answers as the number of queries increases, thereby raising user satisfaction.
b) Risk classification according to the AI Regulation
The systems on which such chatbots are based are generally not systems that pose a threat to fundamental rights, meaning that Chapter 2 of the AI Regulation does not apply. Unlike AI systems in the medical sector that create diagnoses or make direct therapeutic decisions, “AI FAQ chatbots” do not produce automated decisions with serious consequences for users. A scenario relevant for categorisation as high-risk AI therefore does not exist, as merely answering general questions about products or support requests does not pose significant risks to life, limb, fundamental rights or social participation. Consequently, the chatbot falls neither within the catalogue of prohibited AI practices (Art. 5 of the AI Regulation) nor within the areas listed in Art. 6 et seq. of the AI Regulation. Rather, it is a moderate form of application (compared to the relevant high-risk categories) in which at most a limited risk within the meaning of the AI Regulation can arise.
The chatbot instead interacts directly with end users in order to answer their questions automatically. Under Art. 50 para. 1 of the AI Regulation, it therefore falls within those AI systems that present potential risks to the rights and freedoms of data subjects (in particular with regard to transparency) but have neither a far-reaching decision-making function nor a highly intrusive effect. Its character as a chatbot for customer communication therefore leads to its categorisation as an “AI system with limited risk”.
c) Assistive function for standard editing?
Depending on the design, the exception in Art. 50 para. 2 sentence 3 of the AI Regulation may apply. Under this provision, providers of AI systems with limited risk are exempt from the standardised transparency obligations to the extent that the AI system performs merely an assistive function for standard editing or does not substantially alter the input data provided by the deployer or its semantics.
Recital 133 sentence 7 of the AI Regulation refers to such exceptions; according to it, they are to be provided for in order to maintain proportionality. The points of reference here are sentences 1 to 3 of recital 133 of the AI Regulation:
“A variety of AI systems can generate large quantities of synthetic content that becomes increasingly hard for humans to distinguish from human-generated and authentic content. The wide availability and increasing capabilities of those systems have a significant impact on the integrity and trust in the information ecosystem, raising new risks of misinformation and manipulation at scale, fraud, impersonation and consumer deception. In light of those impacts, the fast technological pace and the need for new methods and techniques to trace origin of information, it is appropriate to require providers of those systems to embed technical solutions that enable marking in a machine readable format and detection that the output has been generated or manipulated by an AI system and not a human.”
On this basis, the transparency obligations of Art. 50 of the AI Regulation are primarily intended to apply to synthetically generated audio, image, video or text content that cannot be distinguished from human-generated content and can therefore potentially lead to misinformation and manipulation at scale, fraud, impersonation and consumer deception. Examples include deepfakes, i.e. realistic-looking media content (photo, audio, video) that has been modified, generated or falsified using artificial intelligence techniques.
At the other end of the conceivable risk scale, Art. 50 para. 2 sentence 3 of the AI Regulation covers, in order to comply with the principle of proportionality, content that has merely been generated by an assistive function for standard editing or that has not been substantially altered compared to the input data. Examples include text additions or simple image editing functions. From a proportionality perspective, the exception for the standard editing of content is intended to reflect the lower need for protection in cases where the AI system only marginally modifies the authentic content.
If these considerations are applied to AI chatbots with a service and FAQ function that access specific, previously approved sources of information, the output is essentially limited to retrieving and summarising existing content and presenting it to users in natural language. The semantics of the original information are then not substantially altered: the chatbot essentially reproduces what is already available in public documents or released articles, albeit in a slightly “neutralised” or summarised linguistic form.
In this case, the chatbot predominantly performs an assistive function for standard editing within the meaning of Art. 50 para. 2 sentence 3 of the AI Regulation, namely answering common questions about the respective products or services without independently generating new, synthetic content that could lead to relevant deception or falsification. Measured against recital 133 of the AI Regulation, under which the labelling obligation relates specifically to the risk of manipulation or misinformation at scale, the typical deepfake potential is also lacking. The chatbot does not create realistic sound recordings, videos or image content, but provides text-based answer suggestions that are closely tied to the FAQ information provided.
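The architectural feature carrying this assessment is that generation is tethered to released sources. A minimal Python sketch of the idea follows; the keyword-overlap retrieval is a toy stand-in (real systems would typically use embedding-based retrieval), and the names and FAQ entries are invented:

```python
# Invented, pre-approved FAQ content; the bot may only draw on this.
APPROVED_FAQ = {
    "How do I reset my password?": "Open Settings > Account > Reset password.",
    "What is the warranty period?": "All devices come with a 24-month warranty.",
}

def answer(question: str) -> str:
    """Pick the best-matching approved entry; never generate free content."""
    q_words = set(question.lower().split())
    best, overlap = None, 0
    for known_q, known_a in APPROVED_FAQ.items():
        score = len(q_words & set(known_q.lower().split()))
        if score > overlap:
            best, overlap = known_a, score
    if best is None:
        return "I can only answer questions covered by our FAQ."
    # An LLM could rephrase `best` here, constrained to its content.
    return best

print(answer("How long is the warranty period?"))
# -> All devices come with a 24-month warranty.
```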
d) Be careful when implementing an AI system with a general purpose
However, transparency obligations may arise if the AI chatbot is to be regarded as an AI system with a general purpose. According to Art. 3 No. 66 of the AI Regulation, an AI system with a general purpose is
“an AI system which is based on a general-purpose AI model and which has the capability to serve a variety of purposes, both for direct use as well as for integration in other AI systems”.
With regard to the integration of general-purpose systems, recital 100 of the AI Regulation explains further:
“Where a general-purpose AI model is integrated into or forms part of an AI system, that system should be considered to be a general-purpose AI system if that integration enables that system to serve a variety of purposes”.
If the system used in practice is based on an AI model that offers a variety of functions (such as ChatGPT, which has been trained on large amounts of language and knowledge data) and this model is integrated into the chatbot unchanged or only slightly adapted, the question arises as to whether this gives the system the capability to serve a variety of purposes within the meaning of Art. 3 No. 66 of the AI Regulation. Under the AI Regulation, an AI system is only considered general-purpose if the resulting system does not merely fulfil a specific, narrowly defined function but can also be used for numerous other applications or objectives.
If the chatbot used only serves to answer FAQs and remains limited to this narrow purpose through appropriate configuration, the better arguments speak in favour of there being no comprehensive multi-purpose capability. The underlying model may be generally usable, but the specific chatbot application is limited to a single domain. Therefore, the system itself cannot be categorised as a general AI system, but merely as a specific AI system that uses a general model in the background.
Although the underlying AI models (e.g. LLMs) could themselves offer a wide range of possible applications, the actual system design limits their use to customer service and answering product questions. A general multi-purpose capability is not evident if the system functions exclusively via the respective configuration environment and “cannot be used without integration into the system environment”. Even if the LLM technology itself were versatile, it is not provided in a way that would allow users or other third parties to use the model freely for any purpose. Thus, although the underlying AI models may be general-purpose AI models within the meaning of Art. 3 No. 63 of the AI Regulation, the AI system built on them is, in the specific form at issue and given the specific integration implemented, not capable of serving a variety of purposes, in line with the assessment in recital 100 of the AI Regulation.
However, as soon as an integration of the underlying AI model is carried out in which the universal capabilities are retained and the chatbot can in fact be used or misused for a variety of other applications, Art. 3 No. 66 of the AI Regulation speaks in favour of qualification as an AI system with a general purpose. The decisive factor is whether there is broad multi-functionality in the specific deployment or whether the system is specifically limited to the support function as a chatbot.
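Whether the universal capabilities are “retained” in the deployed system is, in the end, a question of configuration. The following sketch shows, under stated assumptions (the system prompt, keyword pre-filter and message format are invented for illustration and do not quote any vendor’s API), how an integration might confine a general-purpose model to the FAQ role:

```python
# Invented configuration confining a general-purpose model to FAQ use.
SYSTEM_PROMPT = (
    "You are a product FAQ assistant. Answer only questions about our "
    "products and services, using only the supplied FAQ context. "
    "Politely refuse everything else."
)

ON_TOPIC_KEYWORDS = {"product", "order", "warranty", "delivery", "account"}

def is_in_scope(question: str) -> bool:
    """Cheap pre-filter: refuse off-topic requests before calling the model."""
    return any(word in question.lower() for word in ON_TOPIC_KEYWORDS)

def build_request(question: str, faq_context: str) -> list[dict]:
    """Messages in a generic chat-completion shape (illustrative only)."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user",
         "content": f"FAQ context:\n{faq_context}\n\nQuestion: {question}"},
    ]
```

It is this kind of documented confinement that keeps the system a specific, single-purpose application despite the general-purpose model behind it.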
5. Conclusion
Companies that use AI-based chatbots instead of traditional FAQs to answer product or service enquiries can benefit significantly from more efficient and user-friendly customer communication. However, the integration of such chatbots raises data protection issues, particularly with regard to the collection and processing of IP addresses, session cookies and potential “imposed data”. According to the case law to date and the requirements of the GDPR, the legitimate interest in improved, automated customer service is a suitable legal basis in many cases, provided that no extensive user profiles are created and all data is collected only to the extent necessary. The rapid anonymisation or deletion of personal data entered in the chat reduces the risk of unauthorised access. At the same time, companies must take seriously the obligation to provide transparent information to end users and should make it clear that no decisions that are legally unfavourable to the user are made automatically.
With regard to the future AI Regulation, chatbots for answering standard questions are generally categorised as “AI systems with limited risk”. Such systems must nevertheless comply with the transparency requirements. Particularly extensive labelling in accordance with Art. 50 of the AI Regulation is, however, often not required if the system does not generate new content but merely summarises existing information. Whether the system qualifies as a general-purpose AI system must still be examined where a very universally usable model operates behind it. In most scenarios, however, the system is specialised for a specific service role, which considerably simplifies compliance with the relevant requirements.
6. Recommendations for action
- When planning, define data protection roles at an early stage and ensure that no “imposed data” remains permanently in the system.
- Clearly state the legal basis for data processing in the documentation and data protection information, usually Art. 6 para. 1 lit. f GDPR for general use and Art. 6 para. 1 lit. b GDPR for customer support.
- Provide an appropriate information concept: before the start of the chat, point out that data should be entered sparingly and provide a comprehensible privacy policy.
- Ensure that session data is processed as briefly as possible by deleting or anonymising it quickly.
- Realistically check the categorisation of the system in accordance with the AI Regulation: If it is only an AI system with limited risk, only transparency and information obligations usually need to be observed.
- During implementation, ensure that the AI models underlying the system, whose capabilities go far beyond answering FAQs, are confined to the chat use case by configuration, and that this confinement is documented and monitored, so that the requirements for general-purpose AI systems are not triggered inadvertently.