Can AI Bots Read Your Encrypted Messages? The Encryption and Privacy Dilemma
AI chatbots are being rapidly integrated into all kinds of digital applications. From AI-powered note-takers in online conferencing platforms (some of which offer end-to-end encryption, or E2EE), to the announced Apple Intelligence plans promising AI features across a range of operating systems, to Meta AI’s integration into WhatsApp, these changes raise critical questions about privacy and security.

Whenever new features are added to an E2EE messaging app, privacy and security concerns follow. What concerns does adding AI bots raise? How can we evaluate them? Is it possible to resolve the tension between the privacy users expect from E2EE and the data access needed for AI functionality?
Researchers from Cornell and NYU addressed these questions in a paper that offers practical recommendations for E2EE messaging platforms and regulators. It is equally important to offer practical solutions and recommendations for the public, which this article does below.
Background
Online messaging systems such as iMessage, WhatsApp, and Signal act as intermediaries for all user communications. E2EE is a secure communication design that ensures only the sender and the intended recipients can read their communications. Encrypted messages are “ciphertexts,” while unencrypted content (whether text, images, or videos) is “plaintext.” A core requirement of messaging privacy is that service providers and third parties cannot read these communications.
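To make the ciphertext/plaintext distinction concrete, here is a minimal sketch using the PyNaCl library’s public-key `Box`. It is a toy illustration, not the protocol any real messenger uses; the point is only that the relaying server handles ciphertext it cannot decrypt, while the recipient’s device (the “end”) holds the key that recovers the plaintext.

```python
# Toy illustration of the E2EE idea with PyNaCl (pip install pynacl).
# Real messengers use far richer protocols (e.g., Signal's double ratchet);
# this only shows that the server relays ciphertext it cannot read.
from nacl.public import PrivateKey, Box

alice_key = PrivateKey.generate()   # Alice's device keypair
bob_key = PrivateKey.generate()     # Bob's device keypair

# Alice encrypts for Bob using her private key and Bob's public key.
sender_box = Box(alice_key, bob_key.public_key)
ciphertext = sender_box.encrypt(b"Meet at 7?")

# The messaging server only ever sees this ciphertext. The "end" that can
# decrypt is Bob's device, which holds the matching private key.
receiver_box = Box(bob_key, alice_key.public_key)
plaintext = receiver_box.decrypt(ciphertext)
print(plaintext.decode())  # "Meet at 7?"
```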
AI assistants interpret everyday language and perform computational tasks. Current AI assistants can handle a wide range of tasks, including text analysis, content creation, code generation, and language translation. They are built on models trained on data to identify patterns, most notably large language models (LLMs) such as OpenAI’s ChatGPT, Google’s Gemini, and Meta’s Llama. LLMs process complex inputs and provide contextually relevant responses.
An application that has no access to message content cannot, by itself, initiate AI processing on that content. Introducing AI into E2EE systems therefore means widening the definition of the “end” and redefining where the “end” in end-to-end actually is.
Copying and pasting messages into a chatbot would not strictly violate the E2EE design (although it might violate confidentiality and privacy norms). To make this convenient for users, the application would need to perform the processing itself while still preserving E2EE. This might mean processing on the device rather than on the application servers. Apple Intelligence and Meta AI propose “trusted execution environments” (TEEs) to protect the privacy of AI training and processing data.
However, TEEs are insufficient to meet the strict confidentiality and privacy guarantees of E2EE.
How AI Interacts with Your Encrypted Messages
AI features, such as message summarization, smart replies, and chatbots, are being seamlessly integrated into a wide range of applications. However, AI models require access to vast amounts of plaintext user data to power these tools. AI features interact with application data in two main ways. First, models receive user data as part of queries during regular feature usage; for example, a message summarization tool receives a list of messages and outputs a summary. Second, user data is generally used to train and refine AI features; for example, data could be used to fine-tune models to improve their general performance or to personalize them to an individual user’s usage patterns.
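A minimal sketch of the first mode (feature queries) follows; every name in it is hypothetical, and `summarize_on_server` merely stands in for a provider-hosted model. The point is that a non-local summarizer can only work on plaintext, so the client must decrypt the conversation before handing it over, and whoever operates the model sees the result.

```python
# Hypothetical sketch of a server-side summarization query (not any real API).
# The Fernet key stands in for the symmetric session key an E2EE chat would share.
from cryptography.fernet import Fernet

def summarize_on_server(messages: list[str]) -> str:
    """Stand-in for a provider-hosted model: note that it receives plaintext."""
    return f"Summary of {len(messages)} messages, {sum(len(m) for m in messages)} characters"

session_key = Fernet.generate_key()
session = Fernet(session_key)
ciphertexts = [session.encrypt(m.encode()) for m in ["Dinner at 7?", "Yes - bring the spare keys."]]

# To use the non-local feature, the client must first decrypt the thread.
# From this point on, the model operator can see the plaintext, which is
# exactly the access E2EE is designed to withhold from the service provider.
plaintexts = [session.decrypt(c).decode() for c in ciphertexts]
print(summarize_on_server(plaintexts))
```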
These considerations raise significant concerns when integrating AI features in E2EE applications. Processing user content could expose sensitive user data to the parties who own the AI models. While some lightweight features can be implemented with smaller models that live on end-user devices, other AI features require offloading user data to more powerful models on the application servers, which directly conflicts with the security promises of E2EE applications.
While some privacy-enhancing tools address these issues, not all offer privacy guarantees as robust as E2EE’s. Any solution must be compatible with E2EE and ensure that model processing doesn’t undermine users’ privacy expectations. Moreover, none of the existing technologies compatible with E2EE security, such as fully homomorphic encryption (FHE), are yet practical, because they cannot efficiently evaluate large AI models. Hardware-based solutions, such as running models inside TEEs, don’t meet E2EE’s strong confidentiality guarantees and raise additional security concerns.
Furthermore, even if an AI feature could somehow process user content in an encrypted manner, training AI models on E2EE data raises another security concern. AI models often “memorize” training data, which can lead to that data being reproduced in model responses or deliberately extracted through adversarial attacks by anyone able to query the model. Thus, even if training itself were private, E2EE data could still be exposed to other users who query the model.
“Whose Bot Is It?” The Tension Between AI, E2EE, and Ownership
Users often treat AI assistants as personal, but these bots belong to the corporations that control data access. In the context of E2EE messaging, users risk treating encryption as just another feature that happens to conflict with AI use, but it is more than that: it is a guarantee that only the conversation’s participants can read its contents.
Users expect their data to remain private when communicating with these bots, especially since platforms advertise E2EE features. Yet, driven by business strategy, there is a trend away from privacy and towards using user data to train AI models, sometimes without explicit user consent. This practice undermines the privacy E2EE provides and puts at risk the massive privacy gains made over the last decade.
At the same time, AI assistants aren’t perfect, and many users don’t want them. Pushing them anyway, much like dark patterns, adds to a pattern of corporate tech failures that dull users’ intuition about real measures of performance when applications work for companies, not users.
Practical Solutions and Recommendations
The researchers’ technical and legal analysis is designed to inform how AI is designed and implemented in E2EE platforms so that user privacy and expectations of confidentiality are protected. The following recommendations are based on their findings.
Training
Using end-to-end encrypted content to train shared AI models is incompatible with E2EE.
Processing
Processing E2EE content for AI features may be compatible with end-to-end encryption only if the following recommendations are upheld:
- Prioritize endpoint-local processing whenever possible (a minimal sketch of such routing follows this list).
- If processing E2EE content with non-endpoint-local models:
  - No third party can see or use any E2EE content without breaking encryption, and
  - A user’s E2EE content is used exclusively to fulfill that user’s requests.
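The sketch below illustrates one way a client could operationalize these two recommendations. Everything in it (the function names, the `allow_remote` flag, the size heuristic) is hypothetical and not drawn from the paper or any real messenger; it is only meant to show what “endpoint-local first, off-device only with consent and only for this request” could look like in code.

```python
# Hypothetical routing logic for an AI feature in an E2EE client. All names are illustrative.
from dataclasses import dataclass

def fits_on_device(text: str) -> bool:
    # Stand-in heuristic: pretend only short inputs fit the on-device model.
    return len(text) < 2000

def local_model(text: str) -> str:
    return f"[on-device] processed {len(text)} characters"

def remote_model(text: str, retain_for_training: bool) -> str:
    assert retain_for_training is False, "E2EE content must not be retained for training"
    return f"[remote, ephemeral] processed {len(text)} characters"

@dataclass
class AIRequest:
    plaintext: str              # already decrypted on the user's device
    allow_remote: bool = False  # explicit, per-request consent for off-device processing

def run_feature(request: AIRequest) -> str:
    # 1. Prefer the endpoint-local model: content never leaves the device.
    if fits_on_device(request.plaintext):
        return local_model(request.plaintext)
    # 2. Off-device processing only with explicit consent, used solely for this request.
    if request.allow_remote:
        return remote_model(request.plaintext, retain_for_training=False)
    raise PermissionError("Feature unavailable without sending content off-device")

print(run_feature(AIRequest("Summarize this short thread, please.")))
```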
Disclosure
- Messaging providers should not make unqualified claims that they provide E2EE if E2EE content is used by any third party for features like AI inference or training.
Opt-in Consent
- If offered in E2EE systems, AI assistant features should generally be off by default and only activated via opt-in consent.
- Obtaining meaningful consent is complex and requires careful consideration, including but not limited to the scope and granularity of opt-in/out, ease and clarity of opt-in/out, group consent, and management of consent over time.
Regulators and platforms can make design decisions to mitigate the privacy challenges posed by AI features to E2EE applications, such as:
- Opt-in Features: AI features should be off by default and only activated via explicit opt-in mechanisms. Each individual feature should have a separate opt-in mechanism, with unambiguous disclosure notices of what each feature entails. Users should also be able to turn AI features off again afterward.
- Privacy Settings and Granularity: Messaging services should provide granular privacy settings that let users control what specific data is used for AI features and how much is stored or processed (a minimal sketch of such settings follows this list).
- Ease of Setting Adjustment: AI-related settings (such as turning off AI features or adjusting data usage policies) should be easy to find and navigate. There must be a low barrier to toggling off.
- Clear Disclosure: Messaging services should be transparent about when and how AI interacts with encrypted messages, specifying the security level their systems offer, including whether AI features process user data under weaker privacy protections than E2EE.
- Data Ownership and Access: Services should clarify who owns and controls the data AI uses, ensuring that users understand that they are interacting with corporate models, not private, personal assistants.
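As an illustration of what per-feature, default-off, granular settings could look like in practice, here is a hypothetical sketch. It is not any platform’s actual settings model; the feature names and fields are invented for the example.

```python
# Hypothetical per-feature AI privacy settings for an E2EE messenger.
# Every feature is off by default, and each data use must be enabled separately.
from dataclasses import dataclass, field

@dataclass
class AIFeatureSetting:
    enabled: bool = False             # opt-in: off until the user explicitly turns it on
    allow_off_device: bool = False    # may content leave the endpoint for this feature?
    allow_training_use: bool = False  # may content be used to improve shared models?

@dataclass
class PrivacySettings:
    summarization: AIFeatureSetting = field(default_factory=AIFeatureSetting)
    smart_replies: AIFeatureSetting = field(default_factory=AIFeatureSetting)
    assistant_chat: AIFeatureSetting = field(default_factory=AIFeatureSetting)

settings = PrivacySettings()
# Enabling one feature does not silently enable another or permit training use.
settings.smart_replies.enabled = True
print(settings.summarization.enabled, settings.smart_replies.allow_training_use)  # False False
```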
Practical Solutions and Recommendations for the Public
Here are steps anyone can take to protect their privacy in the era of AI and encryption.
- Choose OS-level app permissions carefully: Be aware of which applications interact with AI features.
- Review App Settings: Regularly check the privacy settings on your applications. If you’re concerned about privacy, turn off AI-based features like message summarization or smart replies.
- Be Aware of What You’re Sharing: Be mindful of the data you share with AI services, especially personal or sensitive information that could be used for training or other purposes. When applications tell you they might use your data for training AI, believe them.
- Beware of Opt-in Conditions: Understand the limitations if you choose to invoke AI features in a private or confidential setting.
- Talk to Your Contacts: Have a conversation with relevant contacts, and make sure they aren’t inviting bots to the conversation.
Conclusion
Rapidly developed AI features raise significant security risks for users of E2EE applications. It’s crucial that AI innovation doesn’t come at the expense of user privacy and that the protections expected from E2EE applications are maintained. In the absence of perfect technical solutions, service providers should inform and empower users through transparent disclosures of how data is processed, user-friendly consent mechanisms, and granular controls over data use.