Open Source Tool

xokito

Privacy engine for AI workflows. Pre-LLM deterministic obfuscation. Three privacy levels: local, obfuscated, VPC.

MIT License Python

The Privacy Paradox

AI coding assistants see everything in your codebase: API keys, customer PII, financial data. OpenAI and Anthropic promise not to train on your data, but can you afford to rely on that promise?

// Without xokito: PII exposed
const user = { 
  email: "john.doe@acme.com",
  ssn: "123-45-6789",
  credit_card: "4532-1234-5678-9010"
};

// With xokito Level 2: Obfuscated
const user = {
  email: "TOK_EMAIL_a8f3",
  ssn: "TOK_SSN_9d2e",
  credit_card: "TOK_CC_4b7a"
};
// AI sees structure, not values. Reversible on your side.

Three Privacy Levels Explained

Choose your privacy level based on compliance requirements and infrastructure constraints

xokito Privacy Levels Comparison

🔒 Level 1: Local-First

Use case: Regulated industries (finance, healthcare, defense).

How it works: AI runs 100% locally (Ollama, LM Studio, llama.cpp). Zero external API calls. All processing happens on your hardware.
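One way to picture the "zero external API calls" rule is a guard that only ever builds requests for a loopback address. The sketch below is illustrative, not xokito's code: it assumes an Ollama server on its default port (11434) and its `/api/generate` endpoint, and the helper names `is_loopback` and `build_local_request` and the model name `llama3` are hypothetical.

```python
import ipaddress
import json
import urllib.request
from urllib.parse import urlparse


def is_loopback(url: str) -> bool:
    """Return True only if the URL points at this machine."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other DNS name is treated as external.
        return False


def build_local_request(
    prompt: str,
    model: str = "llama3",  # assumed model name, for illustration only
    base_url: str = "http://localhost:11434",  # Ollama's default address
) -> urllib.request.Request:
    """Build (but do not send) a generation request for a local model."""
    if not is_loopback(base_url):
        raise ValueError(f"refusing non-local endpoint: {base_url}")
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send it to the local server. Pointing `base_url` at a cloud endpoint raises `ValueError` before anything leaves the machine.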

Trade-off: Smaller models (7B-70B parameters). Less capable than GPT-4, but no data ever leaves your hardware.

Privacy: 100% | Cost: Hardware only (~$2000 GPU)

🎭 Level 2: Obfuscation (Recommended)

Use case: Most teams (balance privacy + power).

How it works: xokito intercepts code before it reaches the LLM. PII is replaced with deterministic tokens: "john.doe@acme.com" → "TOK_EMAIL_a8f3". The mapping is stored locally. The LLM sees structure, not values.

Why deterministic? The same PII always gets the same token, so the AI can track entities across files: "TOK_EMAIL_a8f3" in config.js is the same entity as "TOK_EMAIL_a8f3" in auth.ts.
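A minimal sketch of how a deterministic token could be derived, assuming a keyed hash (HMAC-SHA256) truncated to four hex characters to match the "TOK_EMAIL_a8f3" shape. This is an illustration under stated assumptions, not xokito's actual algorithm; `SECRET_KEY` and `make_token` are hypothetical names.

```python
import hashlib
import hmac

# Hypothetical secret that stays on your machine. Without it, tokens
# cannot be recomputed from guessed values.
SECRET_KEY = b"replace-with-a-locally-stored-secret"


def make_token(kind: str, value: str, key: bytes = SECRET_KEY) -> str:
    """Derive a token that is identical every time for the same input."""
    message = f"{kind}:{value}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).hexdigest()
    # Four hex characters mirror the examples on this page. A real
    # deployment would likely keep more to make collisions less likely.
    return f"TOK_{kind}_{digest[:4]}"
```

Calling `make_token("EMAIL", "john.doe@acme.com")` in config.js and again in auth.ts yields the same string, which is what lets the model treat both occurrences as one entity. A four-character suffix allows only 65,536 distinct tokens per type, so this sketch would need a longer suffix or a collision check on large codebases.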

Privacy: 95% | Cost: Normal API rates | Reversible: Yes

🔐 Level 3: VPC Tunnel

Use case: Enterprise needing cloud LLMs with SLA.

How it works: Encrypted tunnel (TLS 1.3) to provider's VPC. Data in transit protected end-to-end. Provider contractually bound (no training on your data).
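On the client side, the TLS 1.3 requirement can be enforced with Python's standard `ssl` module. This is a hedged sketch of the client half only; the helper name `tls13_context` is hypothetical, and the VPC endpoint itself depends on your provider.

```python
import ssl


def tls13_context() -> ssl.SSLContext:
    """Client context that refuses any protocol older than TLS 1.3."""
    # create_default_context() enables certificate and hostname checks.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx
```

The context can be passed to `urllib.request.urlopen(url, context=tls13_context())` or to `http.client.HTTPSConnection`. A server that only offers TLS 1.2 or older fails the handshake instead of silently downgrading.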

Trade-off: Trust provider's SLA. Network latency +50ms. VPC setup cost.

Privacy: 90% | Cost: +30% API premium + VPC fees

Features

🔄 Deterministic Tokens

Same PII → Same token. "john.doe@acme.com" always becomes "TOK_EMAIL_a8f3". AI can track entities across files.

🧠 Zero Information Loss

AI sees structure and patterns. Each token keeps its data type visible ("TOK_EMAIL_xxxx" marks an email). Dates, IDs, and code patterns stay intact.

🔁 Reversible

You hold the key. Decode responses back to real values. The AI never sees the original data.
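The round trip can be sketched as a small local vault that records every token it hands out and substitutes the real values back into the model's response. Everything here is an assumption for illustration: the `TokenVault` class, the email-only regex, and the unkeyed SHA-256 suffix (used for brevity; a keyed hash would be the safer choice) are not xokito's actual implementation.

```python
import hashlib
import re

# Simplified email pattern, good enough for a demonstration.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class TokenVault:
    """Keeps the token-to-value mapping on the local machine."""

    def __init__(self):
        self._token_to_value = {}

    def _token_for(self, kind, value):
        digest = hashlib.sha256(f"{kind}:{value}".encode("utf-8")).hexdigest()
        # Lengthen the suffix on a collision so the mapping stays one-to-one.
        for length in range(4, len(digest) + 1):
            token = f"TOK_{kind}_{digest[:length]}"
            if self._token_to_value.setdefault(token, value) == value:
                return token
        raise RuntimeError("could not derive a unique token")

    def obfuscate(self, text):
        """Replace every email in the outgoing text with its token."""
        return EMAIL_RE.sub(lambda m: self._token_for("EMAIL", m.group()), text)

    def restore(self, text):
        """Put the real values back into text returned by the LLM."""
        # Longest tokens first, so a lengthened token is never half-replaced.
        for token in sorted(self._token_to_value, key=len, reverse=True):
            text = text.replace(token, self._token_to_value[token])
        return text
```

A typical flow would be `masked = vault.obfuscate(prompt)`, send `masked` to the LLM, then `vault.restore(response)` on the way back. Only the masked text crosses the network; the mapping never leaves the vault.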

⚡ Fast

Rust-based tokenizer. 0-5ms overhead per request. You won't notice the latency.

📜 GDPR/HIPAA Ready

Built for compliance. At Level 2, PII is never transmitted in plaintext. Audit logs included.

🔌 IDE Agnostic

Works with Cursor, Windsurf, Claude Code, or the CLI. Runs as a drop-in API proxy.

Installation

# Install via pip
pip install xokito

# Configure privacy level
xokito init --level 2  # Obfuscation mode

# Use as API proxy
export OPENAI_API_BASE=http://localhost:8765
xokito start

# Your IDE now routes through xokito

Need Enterprise Features?

xokito is 100% free and open source (MIT). For enterprise teams needing centralized policy management, VPC deployments, and SLA support, check out our enterprise privacy platform.

Stop Sending PII to the Cloud

Install xokito in 5 minutes. Start protecting customer data today.
