Data privacy in the age of AI: is there an “undo” button?
Today, January 28, is Data Privacy Day. Led by the National Cybersecurity Alliance, its goal is to help people and organizations understand, value, and take control of their data. As we enter the AI era, however, that goal may be harder to achieve than ever.
For the past decade, conversations about data privacy have revolved around visibility and breaches. Who can see my data? Who has access to it? Can it be stolen and what would that mean? Can I delete it if I change my mind? Does it live on the (dark) web forever?
That way of thinking made sense when we could roughly locate where our data lived: databases, spreadsheets, cloud storage, marketing tools. Breaches could be identified, hackers blamed, and platforms held responsible… at least in theory.
In the age of large language models and large-scale web scraping, that mental model is starting to break down.
Vast amounts of publicly available information have already been collected, copied, and reused at scale. In many cases, entire business models - often supported by legal frameworks or government contracts - are built on extracting value from data that individuals once shared freely, without fully understanding how it might be reused later, or without their consent at all.
As Forbes cybersecurity contributor Tony Bradley put it in a recent article:
“By the time anyone realizes what that data can be used for, the infrastructure is already in place and the consent conversation is long over.”
The myth of deleting your digital footprint.
Deleting your digital footprint sounds reassuring. It suggests a clean reset and the ability to remove an inconvenient post, close an account, submit a request, and see the problem go away.
In reality, Large Language Models (LLMs) have already ingested huge portions of the public web: blog posts, forums, comments, documentation, social profiles, and more. Once information is public, it can be copied, cached, mirrored, archived, scraped, and reused - sometimes legally, often questionably, and almost always opaquely.
In many cases, you cannot even verify where your data ended up, let alone whether it was included in a specific training run. Deleting something today does not mean it never existed, and it certainly does not mean it was never learned from. That can sound bleak, but it doesn’t mean privacy is dead.
It means the goals have changed, and our understanding of privacy must change with them.

Privacy now means limiting reuse.
In the AI era, privacy is less about perfect deletion and more about risk reduction. The practical objective shifts from “erase everything everywhere” to:
- Limit further exposure
- Prevent easy re-collection
- Reduce downstream reuse
- Stop future training where you can
This approach may be less comforting, but it is far more realistic.
Work backwards from the source.
The most effective privacy actions still start at the origin of the data. If you have content online that no longer reflects what you want to be associated with (old posts, bios, comments, projects), removing or editing the original source still matters. Even if copies exist, the primary reference point disappears, making future collection less likely.
You can also make content harder to rediscover:
- Remove it from search engine indexes where possible
- Request cache removals
- De-index pages you control (see the sketch below)
None of this is perfect, but it meaningfully reduces amplification.
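If some of that content lives on a site you control, you can also ask AI crawlers not to collect it in the first place, typically by disallowing their user agents in robots.txt. Below is a minimal sketch, assuming only the Python standard library, that checks whether your robots.txt currently blocks a few publicly documented AI crawlers; the site URL is a placeholder and the exact user-agent list is worth verifying against each vendor’s documentation.

```python
# Minimal sketch: check whether a site's robots.txt blocks some well-known
# AI crawlers. Uses only the Python standard library. The site URL is a
# placeholder; the user-agent names are examples of publicly documented
# crawlers and may change over time.
from urllib import robotparser

SITE = "https://example.com"  # replace with a site you control
AI_CRAWLERS = [
    "GPTBot",           # OpenAI
    "CCBot",            # Common Crawl
    "Google-Extended",  # Google's AI training control token
    "ClaudeBot",        # Anthropic
    "PerplexityBot",    # Perplexity
]

parser = robotparser.RobotFileParser()
parser.set_url(f"{SITE}/robots.txt")
parser.read()  # fetches and parses the robots.txt file

for agent in AI_CRAWLERS:
    status = "allowed" if parser.can_fetch(agent, SITE + "/") else "blocked"
    print(f"{agent}: {status}")
```

Keep in mind that robots.txt only restrains crawlers that choose to respect it, so treat this as one layer of friction rather than a guarantee.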

So, what can you actually do?
There may be no universal “undo” button, but there are still concrete steps you can take to reduce exposure and limit future reuse.
1. Opt out of AI training where possible.
You might not know that most platforms opt you in to AI training by default. Many companies have updated their policies recently, and LLMs are hungry for data, so it’s worth checking. Opting out typically affects only future use, but it’s still worthwhile.
Here’s how to do it:
- Adobe: Go to your privacy page, scroll to Content analysis for product improvement, and toggle off. Business or school accounts are automatically opted out.
- AWS: Follow the instructions on Amazon’s support page to opt out at the organization level.
- Figma: Starter and Professional accounts are opted in by default. At the team level, go to Settings → AI tab → Content training and switch it off. Enterprise accounts are automatically opted out.
- Google Gemini: Open Activity in your browser and select Turn Off under AI data usage.
- Grammarly: Go to Account → Settings → Product Improvement and Training and toggle off. Enterprise/education accounts are automatically opted out.
- Grok (X): Navigate to Settings → Privacy and Safety → Grok and deselect data sharing.
- LinkedIn: In Settings → Data Privacy, uncheck Use my data for training content creation AI models.
- Meta: Delete personal information from chats with Meta AI across apps.
- OpenAI (ChatGPT): In Settings → Data Controls, uncheck Improve the model for everyone. For DALL-E 3 images, use the image removal form to prevent future training.
- Perplexity: Go to your account, scroll to AI Data Retention, and turn it off.
- Slack: Administrators must email feedback@slack.com with “Slack Global model opt-out request” and include the organization URL; Slack confirms completion via email.
2. Reduce what’s easy to scrape.
Public data is still the easiest for models and third parties to collect.
- Delete or edit old posts, bios, and profiles you no longer stand behind
- Remove yourself from data broker sites using services like DeleteMe, and check whether your email address appears in known breaches with Have I Been Pwned (a quick check is sketched below)
- Request search and cache removals using Google’s ‘Remove outdated content’ tool
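As a quick way to see whether an address of yours is already circulating in breach data, here is a small sketch against the Have I Been Pwned v3 API. It assumes you have an HIBP API key and the requests library installed; the endpoint and header names follow HIBP’s published documentation, but double-check them before relying on the script.

```python
# Minimal sketch: query Have I Been Pwned (v3 API) for breaches involving an
# email address. Requires an API key and the `requests` package.
import requests

API_KEY = "your-hibp-api-key"  # placeholder
EMAIL = "you@example.com"      # placeholder

resp = requests.get(
    f"https://haveibeenpwned.com/api/v3/breachedaccount/{EMAIL}",
    headers={"hibp-api-key": API_KEY, "user-agent": "personal-privacy-check"},
    params={"truncateResponse": "true"},
    timeout=10,
)

if resp.status_code == 404:
    print("No known breaches for this address.")
elif resp.ok:
    for breach in resp.json():
        print("Found in breach:", breach["Name"])
else:
    resp.raise_for_status()
```

A hit doesn’t mean your data is in a training set, but it does tell you which accounts deserve new passwords and closer monitoring.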
3. Be careful what you share with AI tools.
Treat prompts as a form of publication unless the tool explicitly states otherwise.
- Avoid sharing personal data, sensitive documents, or internal company information
- Prefer tools with clear retention policies and opt-out options
- Regularly review privacy settings in the AI tools you already use
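If you do need to paste text into an AI tool, a lightweight habit is to strip obvious identifiers first. The sketch below is a rough illustration using simple regular expressions; it is not a real PII detector (names, addresses, and context-dependent details will slip through), but it shows the idea.

```python
import re

# Very rough patterns for a few obvious identifiers. Real PII detection needs
# far more than regexes; treat this as a reminder to redact, not a guarantee.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "iban":  re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}

def redact(text: str) -> str:
    """Replace matched identifiers with placeholder tags before sharing."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()} REDACTED]", text)
    return text

print(redact("Reach me at jane.doe@example.com or +44 20 7946 0958."))
# -> Reach me at [EMAIL REDACTED] or [PHONE REDACTED].
```

Redaction is a habit, not a safeguard: when in doubt, leave the detail out entirely.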
4. Use your legal deletion rights.
Where companies actively store your data, GDPR and similar regulations still apply.
- If you’re based in the EU/UK, you can submit data access and deletion requests to apps, SaaS platforms, or services you no longer use.
- Focus on platforms and services that actively store and use your data, as deletion requests won’t reach what’s already public online.
5. Prepare for impersonation, not just data breaches.
AI-generated voice and video make impersonations easier than ever, using publicly available data to create realistic deepfakes that can manipulate, scam, or mislead.
Anecdote: In February 2024, a finance worker at a multinational firm was tricked into paying $25 million to fraudsters who used deepfake technology to impersonate the company’s CFO during a video call. The worker thought he was speaking with colleagues, but all participants on the call were deepfake recreations.
Practical steps to protect yourself and your contacts:
- Agree on a shared safeword or phrase with close friends, family, or colleagues to verify urgent or unusual requests.
- Use a second communication channel to confirm sensitive instructions (e.g., a quick phone call or text to verify a financial request).
- Be cautious of unsolicited video or voice requests that pressure you to act immediately - especially when money or sensitive information is involved.
This approach might sound a bit paranoid, but it’s about acknowledging a new risk that stems directly from how your data can be reused nowadays.

The bottom line.
Data Privacy Day is often framed around control and consent. Of course, those ideas matter, but privacy today isn’t just about who can see your data. It’s about who can copy it, reuse it, remix it, and infer new things from it… at scale.
That means the realistic goal is no longer erasure. It’s making your data harder to collect again, harder to repurpose, and less useful over time. It’s not as easy as an “accept or reject” button, but it’s far closer to how the digital world actually works now.





