privacy · February 20, 2026 · 10 min read

How to Run a Private AI Workspace Without Sending Your Data to OpenAI

Most people don't realize ChatGPT's "Improve the model for everyone" setting is on by default. Here's how to build a private, local-first AI workspace using BYOK, encrypted key storage, and direct API calls — no middleman.

TL;DR

  • ChatGPT, Claude.ai, and Gemini all retain your conversations on their servers by default. Some train on them.
  • You can get the same models with minimal data exposure by calling the API directly — the API is governed by stricter retention policies than the consumer chat apps.
  • A local-first BYOK client (like NovaKit) stores your messages in your own browser, keeps your keys encrypted at rest, and makes direct calls to the provider API. Nothing in the middle.
  • The whole setup takes 10 minutes and costs less than a ChatGPT Plus subscription.

Why the consumer chat apps aren't "private"

There's a persistent myth that enterprise plans fix the privacy problem. They help — but the default consumer experience of ChatGPT, Claude.ai, and Gemini was not built with privacy as the priority. Here's the actual picture.

ChatGPT (free + Plus)

  • Conversations are stored on OpenAI's servers, and deleted conversations can remain there for up to 30 days after you delete them.
  • The "Improve the model for everyone" setting is ON by default on new accounts.
  • OpenAI can review conversations for policy violations and to improve the product.
  • Memory feature writes additional data about you into OpenAI's systems.

Claude.ai (free + Pro)

  • Conversations stored; Anthropic retains per their retention policy (30 days default, longer for safety-flagged content).
  • Anthropic does not train on consumer chat conversations by default (this is a real, documented policy advantage).
  • Still: the data exists on their servers and is subject to subpoena, breaches, and employee access.

Gemini (free + Advanced)

  • Conversations stored by default for up to 18 months.
  • Google may use conversations "to improve products" unless you explicitly turn off activity tracking.
  • Human reviewers may read samples of conversations.

None of this is hidden. It's all in the privacy policies. But most people never read them, and even people who do still don't realize how different "the API" is from "the app" in terms of data handling.

The API has fundamentally different rules

Here's the key distinction most people miss: when you call an AI provider's API directly, you're under their enterprise/developer retention policies, not their consumer chat policies.

As of February 2026:

  • OpenAI API: Zero Data Retention (ZDR) is available for eligible customers. Default retention for abuse monitoring is 30 days. API data is not used for training by default.
  • Anthropic API: 30-day default retention for abuse monitoring. Not used for training. Enterprise can request zero retention.
  • Google Gemini API (via Vertex/AI Studio): Paid API tier does not use your data for training. Free tier does.
  • Mistral, Groq, Together: Similar — API usage is generally not used for training, explicit enterprise controls available.

The model is the same. The privacy posture is different.

The architecture of a truly private AI workspace

Here's what "local-first BYOK" actually means in concrete terms.

The data flow (what's different)

ChatGPT consumer app flow:

You → ChatGPT client → OpenAI servers (stored, possibly reviewed, possibly used to train) → model → back

Local-first BYOK flow:

You → Your browser (keys encrypted here) → AI provider API (no training, short retention) → model → back
Conversation history stays in your browser.

The middleman is gone. Your conversation is not sitting on a third party's "chat history" database. The only service that ever sees your prompt is the AI provider itself — and they're under the more restrictive API policy.
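In code, that direct flow is just one request from your browser to the provider. Here is a minimal sketch against OpenAI's Chat Completions endpoint; the API key and model name are placeholders, and the function names are illustrative, not any particular client's internals:

```typescript
// Minimal sketch: the browser talks straight to the provider.
// The key is a placeholder; a real client decrypts it from
// local storage just before the call.

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the exact request a local-first client would send to
// OpenAI's Chat Completions endpoint -- no proxy in between.
function buildDirectRequest(apiKey: string, messages: ChatMessage[]) {
  return {
    url: "https://api.openai.com/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // The key goes only to the provider, nowhere else.
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model: "gpt-4o-mini", messages }),
    },
  };
}

// Sending it is a single fetch; this is the request you would
// see in the Network tab, addressed to api.openai.com directly.
async function send(apiKey: string, messages: ChatMessage[]) {
  const { url, init } = buildDirectRequest(apiKey, messages);
  const res = await fetch(url, init);
  return res.json();
}
```

The same shape works for any provider — only the URL, header name, and body format change.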

What "local-first" means precisely

A local-first BYOK client should:

  1. Store conversation history in the browser (IndexedDB) or on disk — not on a cloud database by default.
  2. Store API keys encrypted at rest — ideally with AES-256-GCM, with a key derived from a user passphrase via PBKDF2 (OWASP currently recommends 600,000+ iterations with SHA-256) or Argon2.
  3. Call AI provider APIs directly from your browser — no proxy, no logging, no inspection by the client app's servers.
  4. Provide zero telemetry on your usage — no "which model did you use", "how many messages", "what topics" sent to the app maker.
  5. Let you export everything — your conversations, your prompt library, your settings — in a portable format you can delete or archive.

NovaKit meets all five. Most competing tools meet 1-2 of them. Double-check before trusting any tool that calls itself "privacy-first."

The step-by-step setup (10 minutes)

Step 1: Get API keys from the providers you want

You only need one to start. Most people end up with two or three.

  • OpenAI: platform.openai.com → API keys → Create new secret key. Fund with $10 of credit.
  • Anthropic: console.anthropic.com → Settings → API keys. Fund with $10.
  • Google AI Studio: aistudio.google.com → Get API key. Free tier available (but note: the free tier does train on your data — use the paid tier if privacy matters).

Step 2: Verify your API keys are "not for training"

Check each provider's data usage dashboard:

  • OpenAI: platform.openai.com → Settings → Data Controls → confirm "Improve the model for everyone" is OFF (it's OFF by default for API).
  • Anthropic: Console → Settings → confirm API data is not used for training (this is the default).
  • Google: Vertex console → check "Data governance" settings.

Step 3: Pick a local-first BYOK client

Your options in 2026:

  • NovaKit — Browser-based, local-first by default, 13+ providers, encrypted key storage.
  • Various open-source clients — Mostly self-hosted, require technical setup. Good if you want full control.
  • Do NOT use: Any client that stores your conversations on their servers "for convenience" without a local-only mode. Many VC-funded "AI chat apps" do this.

Step 4: Enter your keys and verify encryption

In NovaKit: Settings → API Keys → paste your key. The key is encrypted with AES-256-GCM before writing to IndexedDB. You can set a passphrase for additional security (derives a key via PBKDF2 that gates the IndexedDB encryption key).

Step 5: Use it

Start a conversation. Watch the Network tab in your browser's DevTools if you want to verify: you'll see requests going directly to api.openai.com or api.anthropic.com — not to NovaKit's servers. Your conversation data never touches our infrastructure.

What you give up

Being honest about the trade-offs:

  • No cross-device sync by default. Your conversations live on the device you created them on. (You can export and re-import manually, or opt into encrypted cloud sync where available.)
  • You manage your own API keys. Rotating, revoking, and budgeting are on you. (Modern tools make this easier than it sounds.)
  • You pay per token. This is almost always cheaper than a subscription, but it's variable — a very heavy month can cost more.
  • No ChatGPT memory or custom GPTs. You build your own prompt library and context instead.
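The per-token math is easy to sanity-check yourself. The prices below are illustrative placeholders, not any provider's actual rates — check the current pricing page before budgeting:

```typescript
// Back-of-envelope cost check. Prices are hypothetical
// mid-tier rates in USD per million tokens.
const PRICE_PER_MILLION = { input: 3.0, output: 15.0 };

function monthlyCostUSD(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * PRICE_PER_MILLION.input +
    (outputTokens / 1_000_000) * PRICE_PER_MILLION.output
  );
}

// A fairly heavy month: ~1M tokens in, ~300k tokens out.
const estimate = monthlyCostUSD(1_000_000, 300_000);
// 3.0 + 4.5 = 7.5 USD -- well under a $20/month subscription
```

At those assumed rates, even heavy usage lands under a flat subscription; light usage costs a fraction of it.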

For people who value privacy and cost control, this is a clear win. For people who want a one-click consumer experience and don't mind the data trade-off, ChatGPT Plus remains a perfectly reasonable product.

How to evaluate any "privacy-first" AI claim

Five diagnostic questions you can ask of any tool that claims to be private:

  1. Where are my conversations stored? If the answer is "on our servers by default," it's not local-first.
  2. Where are my API keys stored? If they're in the product's database instead of your browser/device, that's a meaningful exposure.
  3. What goes to the provider vs. what goes to you? If the tool is "between" you and OpenAI — proxying, logging, or inspecting — that's extra exposure.
  4. Can I export and delete everything? If not, you don't actually own your data.
  5. What telemetry do you collect? If they know your prompt count, models used, and topics, that's still data about you.

Ask these in the support chat or in their docs before trusting a new tool. The answers are usually revealing.

The broader point

We've normalized the idea that using AI means sending our data to three or four huge companies whose business model is, at least partly, to learn from us. That was the deal in 2022 when there were no alternatives. It is not the deal in 2026.

The API layer was always where the stricter privacy policies lived. BYOK clients let you use the API layer with a UI that's actually pleasant. The privacy win and the cost win come together — and they cost you about 10 minutes of setup.

The summary

  • Consumer AI chat apps store and sometimes train on your conversations by default.
  • API usage is governed by stricter, no-training policies.
  • Local-first BYOK clients put you on the API layer with a chat-style UI.
  • Setup: 10 minutes. Monthly cost: usually less than one latte.

Read more about NovaKit's specific privacy architecture at /privacy-policy or try it at /chat.


Private AI doesn't have to be complicated. Start NovaKit free → BYOK AI workspace with local-first storage, AES-256-GCM encryption, and zero telemetry. No account required to try.

