# naxxen Documentation (Full)

> naxxen is a platform for AI agent and LLM tools. The first product is Compress, a drop-in LLM proxy that transparently compresses prompts by 30–60%. Zero code changes: just swap the base URL.

---

# What is naxxen?

naxxen intercepts LLM API requests, compresses compressible text (system prompts, chat history, tool descriptions), forwards the compressed request to the real provider (OpenAI, Anthropic, Google), and returns the response unchanged.

Benefits:

- Token cost savings: 30–60% fewer input tokens billed
- Context window expansion: fit more into the same window
- Faster responses: fewer tokens in means lower time-to-first-token

What naxxen does NOT do:

- Never stores or logs provider API keys
- Never modifies the LLM's response
- Never touches code blocks, JSON, images, or the last user message
- If there is nothing to compress, the request passes through untouched with zero overhead

Supported providers: OpenAI, Anthropic, Google.

---

# Quickstart

1. Sign up at https://app.naxxen.ai
2. Create an API key in Dashboard → API Keys (starts with `nxn-sk-`)
3. Replace your provider's base URL with `api.naxxen.ai`
4.
   Pass your naxxen key via the `X-Naxxen-Key` header or a URL path prefix.

OpenAI example:

```bash
curl -X POST https://api.naxxen.ai/v1/chat/completions \
  -H "X-Naxxen-Key: nxn-sk-YOUR_KEY" \
  -H "Authorization: Bearer sk-YOUR_OPENAI_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hi"}]}'
```

Anthropic example:

```bash
curl -X POST https://api.naxxen.ai/v1/messages \
  -H "X-Naxxen-Key: nxn-sk-YOUR_KEY" \
  -H "x-api-key: sk-ant-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":100,"messages":[{"role":"user","content":"Hi"}]}'
```

Google example:

```bash
curl -X POST "https://api.naxxen.ai/v1/models/gemini-2.5-flash:generateContent" \
  -H "X-Naxxen-Key: nxn-sk-YOUR_KEY" \
  -H "x-goog-api-key: YOUR_GOOGLE_KEY" \
  -H "Content-Type: application/json" \
  -d '{"contents":[{"role":"user","parts":[{"text":"Hi"}]}]}'
```

---

# Authentication

Every request needs two keys: (1) your naxxen API key and (2) your provider API key.

There are four ways to pass the naxxen key, checked in this order:

**Method 1: header**

```
X-Naxxen-Key: nxn-sk-YOUR_KEY
```

**Method 2: URL path prefix (recommended for coding agents)**

```
https://api.naxxen.ai/nxn-sk-YOUR_KEY/v1/chat/completions
```

The key is stripped from the path before forwarding.

**Method 3: composite Bearer token**

```
Authorization: Bearer nxn-sk-YOUR_KEY:sk-YOUR_PROVIDER_KEY
```

The token is split on the colon: the left part is the naxxen key; the right part is forwarded to the provider.

**Method 4: auth passthrough**

After the naxxen key is extracted, the remaining auth headers are forwarded unchanged. Works with OpenAI Bearer tokens, Anthropic `x-api-key`, Anthropic OAuth, and Google `x-goog-api-key`.

Key format: prefix `nxn-sk-`, 20–128 characters total.
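The four authentication methods can be sketched in Python. The helper names below are illustrative only, not part of any naxxen SDK; they simply mirror the header and URL formats documented above.

```python
# Sketch of the four ways to attach a naxxen key to a request.
# Header names and URL shapes come from the docs; the function
# names and placeholder keys are illustrative assumptions.

NAXXEN_KEY = "nxn-sk-YOUR_KEY"
PROVIDER_KEY = "sk-YOUR_OPENAI_KEY"

def method1_headers():
    """Method 1: dedicated header plus the normal provider auth."""
    return {
        "X-Naxxen-Key": NAXXEN_KEY,
        "Authorization": f"Bearer {PROVIDER_KEY}",
    }

def method2_url(path="/v1/chat/completions"):
    """Method 2: key embedded as a URL path prefix."""
    return f"https://api.naxxen.ai/{NAXXEN_KEY}{path}"

def method3_headers():
    """Method 3: composite Bearer token, split on the colon server-side."""
    return {"Authorization": f"Bearer {NAXXEN_KEY}:{PROVIDER_KEY}"}

def split_composite(bearer_token):
    """The proxy's view of Method 3: separate the two halves of the token."""
    naxxen_key, _, provider_key = bearer_token.partition(":")
    return naxxen_key, provider_key
```

Method 4 (auth passthrough) needs no client-side helper at all: you send your provider's usual auth headers, and only the naxxen key (from method 1 or 2) is consumed by the proxy.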
Rate limits:

- Free tier: 60 requests/min, 1,000 requests/day
- Pro tier: 600 requests/min, unlimited daily

---

# Endpoints

OpenAI:

- `POST /v1/chat/completions`

Anthropic:

- `POST /v1/messages`

Google:

- `POST /v1/models/{model}:generateContent`
- `POST /v1/models/{model}:streamGenerateContent`
- `POST /v1/models/{model}:countTokens`
- `POST /v1beta/models/{model}:generateContent`
- `POST /v1beta/models/{model}:streamGenerateContent`
- `POST /v1beta/models/{model}:countTokens`

The provider is auto-detected from the path, headers, and body. Streaming is fully supported for all providers.

---

# Models

Tested models:

- OpenAI: gpt-4o-mini, gpt-4o, gpt-5.4-mini, gpt-5.4
- Anthropic: claude-haiku-4-5, claude-sonnet-4-6, claude-opus-4-6
- Google: gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.5-pro

Any model from these providers that uses the same endpoints should work.

---

# Compression

Per-key settings (configured in the dashboard):

- Compression toggle: on/off
- Rate: light (~30%), medium (~50%, default), aggressive (~65%)
- Minimum token threshold: default 200
- Skip code blocks: default on

What gets compressed: system prompts, chat history, tool descriptions.

What does NOT get compressed: code blocks, JSON/XML, images, audio, PDFs, the last user message, short text, thinking blocks.

---

# Integration: OpenClaw

Add providers to `~/.openclaw/openclaw.json` with `baseUrl` set to:

```
https://api.naxxen.ai/nxn-sk-YOUR_KEY
```

Example for OpenAI:

```json
{
  "name": "naxxen-openai",
  "baseUrl": "https://api.naxxen.ai/nxn-sk-YOUR_KEY",
  "apiKey": "sk-proj-...",
  "api": "openai-completions"
}
```

Use the same pattern for Anthropic (`"api": "anthropic-messages"`) and Google (`"api": "google-generative-ai"`).

---

# Integration: Claude Code

Set the environment variable:

```bash
export ANTHROPIC_BASE_URL="https://api.naxxen.ai/nxn-sk-YOUR_KEY"
```

Add it to `~/.bashrc` or `~/.zshrc`. Claude Code's OAuth authentication is forwarded unchanged.
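Looking back at the Compression section, the skip rules can be illustrated with a small Python sketch. This is not naxxen's actual implementation: the 4-characters-per-token estimate and every name here are assumptions made purely for illustration of the documented behavior (code blocks, JSON/XML, short text, and the last user message are never compressed).

```python
# Illustrative sketch of the documented skip rules. The token
# heuristic and function names are assumptions, not naxxen internals.

MIN_TOKENS = 200  # default per-key minimum token threshold from the docs

def estimate_tokens(text):
    """Very rough token estimate (~4 characters per token)."""
    return len(text) // 4

def is_compressible(text, is_last_user_message=False):
    """Apply the documented skip rules to a single text block."""
    if is_last_user_message:
        return False                      # last user message is never touched
    stripped = text.lstrip()
    if stripped.startswith("```"):
        return False                      # fenced code blocks are skipped
    if stripped.startswith(("{", "[", "<")):
        return False                      # JSON / XML payloads are skipped
    return estimate_tokens(text) >= MIN_TOKENS  # short text is skipped
```

Text that passes every check, such as a long system prompt, is a candidate for compression; everything else passes through verbatim.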
---

# Integration: Cursor

In Cursor Settings → Models:

- Base URL: `https://api.naxxen.ai/nxn-sk-YOUR_KEY`
- API Key: your provider API key

Or use the composite form:

- API Key: `nxn-sk-YOUR_KEY:sk-YOUR_PROVIDER_KEY`
- Base URL: `https://api.naxxen.ai`

---

# Integration: Opencode

Set provider base URLs in the config:

```json
{
  "providers": {
    "openai": {
      "apiKey": "...",
      "baseUrl": "https://api.naxxen.ai/nxn-sk-YOUR_KEY"
    }
  }
}
```

Or via environment variables:

```bash
export OPENAI_BASE_URL="https://api.naxxen.ai/nxn-sk-YOUR_KEY"
export ANTHROPIC_BASE_URL="https://api.naxxen.ai/nxn-sk-YOUR_KEY"
```

---

# Integration: BYO Endpoint (SDKs)

Python OpenAI SDK:

```python
from openai import OpenAI

client = OpenAI(api_key="sk-...", base_url="https://api.naxxen.ai/nxn-sk-YOUR_KEY/v1")
```

Python Anthropic SDK:

```python
import anthropic

client = anthropic.Anthropic(api_key="sk-ant-...", base_url="https://api.naxxen.ai/nxn-sk-YOUR_KEY")
```

Node.js OpenAI SDK:

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-...",
  baseURL: "https://api.naxxen.ai/nxn-sk-YOUR_KEY/v1",
});
```

Node.js Anthropic SDK:

```javascript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "sk-ant-...",
  baseURL: "https://api.naxxen.ai/nxn-sk-YOUR_KEY",
});
```

Alternatively, pass the naxxen key via the `X-Naxxen-Key` header using `default_headers` (Python) or `defaultHeaders` (Node.js).

---

# API Reference

Base URL: `https://api.naxxen.ai`

- `POST /*`: main proxy (provider auto-detected)
- `GET /health`: health check
- `POST /v1/compress`: standalone compression (text in, compressed text out)

`/v1/compress` request:

```json
{"text": "...", "rate": 0.5}
```

`/v1/compress` response:

```json
{"compressed_text": "...", "original_tokens": 500, "compressed_tokens": 250, "ratio": 0.5}
```

Error codes (naxxen-specific):

| Status | Code | Meaning |
|--------|------|---------|
| 400 | `unknown_provider` | Could not detect provider |
| 401 | `missing_key` | No naxxen key found |
| 401 | `invalid_key` | Key invalid or revoked |
| 429 | `rate_limited` | Rate limit exceeded |
| 500 | `compression_error` | Compression failed |
| 502 | `provider_unreachable` | Upstream provider down |

Provider errors (4xx/5xx from OpenAI/Anthropic/Google) are passed through unchanged.
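As a worked example of the standalone endpoint, the request for `POST /v1/compress` can be built with the Python standard library. The payload and response fields come from the reference above; the function names and the placeholder key are illustrative assumptions.

```python
# Build (but do not send) a request for the standalone /v1/compress
# endpoint using only the standard library. Function names here are
# illustrative, not part of any naxxen SDK.

import json
import urllib.request

def build_compress_request(text, rate=0.5, naxxen_key="nxn-sk-YOUR_KEY"):
    """Construct an urllib Request for POST /v1/compress."""
    body = json.dumps({"text": text, "rate": rate}).encode()
    return urllib.request.Request(
        "https://api.naxxen.ai/v1/compress",
        data=body,
        headers={
            "X-Naxxen-Key": naxxen_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )

def savings(response_body):
    """Tokens saved, given a parsed /v1/compress response dict."""
    return response_body["original_tokens"] - response_body["compressed_tokens"]
```

Send the request with `urllib.request.urlopen(build_compress_request("..."))` and parse the body with `json.load`; the `ratio` field should match `compressed_tokens / original_tokens`.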