# naxxen Documentation (Full)

> naxxen is a platform for AI agent and LLM tools. The first product is Compress, a drop-in LLM proxy that transparently compresses prompts by 30–60%. Zero code changes: just swap the base URL.

---

# What is naxxen?

naxxen intercepts LLM API requests, compresses compressible text (system prompts, chat history, tool descriptions), forwards the compressed request to the real provider (OpenAI, Anthropic, Google), and returns the response unchanged.

Benefits:

- Token cost savings: 30–60% fewer input tokens billed
- Context window expansion: fit more into the same window
- Faster responses: fewer tokens in means lower time-to-first-token

What naxxen does NOT do:

- Never stores or logs provider API keys
- Never modifies the LLM's response
- Never touches code blocks, JSON, images, or the last user message
- If there is nothing to compress, the request passes through untouched with zero overhead

Supported providers: OpenAI, Anthropic, Google.

---

# Quickstart

1. Sign up at https://app.naxxen.ai
2. Create an API key in Dashboard → API Keys (starts with `nxn-sk-`)
3. Replace your provider's base URL with `api.naxxen.ai`
4.
   Pass your naxxen key via the `X-Naxxen-Key` header or a URL path prefix.

OpenAI example:

```bash
curl -X POST https://api.naxxen.ai/v1/chat/completions \
  -H "X-Naxxen-Key: nxn-sk-YOUR_KEY" \
  -H "Authorization: Bearer sk-YOUR_OPENAI_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hi"}]}'
```

Anthropic example:

```bash
curl -X POST https://api.naxxen.ai/v1/messages \
  -H "X-Naxxen-Key: nxn-sk-YOUR_KEY" \
  -H "x-api-key: sk-ant-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":100,"messages":[{"role":"user","content":"Hi"}]}'
```

Google example:

```bash
curl -X POST "https://api.naxxen.ai/v1/models/gemini-2.5-flash:generateContent" \
  -H "X-Naxxen-Key: nxn-sk-YOUR_KEY" \
  -H "x-goog-api-key: YOUR_GOOGLE_KEY" \
  -H "Content-Type: application/json" \
  -d '{"contents":[{"role":"user","parts":[{"text":"Hi"}]}]}'
```

---

# Authentication

Every request needs two keys: (1) your naxxen API key and (2) your provider API key.

There are four ways to pass the naxxen key, checked in this order:

**Method 1: header**

```
X-Naxxen-Key: nxn-sk-YOUR_KEY
```

**Method 2: URL path prefix (recommended for coding agents)**

```
https://api.naxxen.ai/nxn-sk-YOUR_KEY/v1/chat/completions
```

The key is stripped from the path before forwarding.

**Method 3: composite Bearer token**

```
Authorization: Bearer nxn-sk-YOUR_KEY:sk-YOUR_PROVIDER_KEY
```

The token is split on the colon: the left part is the naxxen key; the right part is forwarded to the provider.

**Method 4: auth passthrough**

After the naxxen key is extracted, the remaining auth headers are forwarded unchanged. Works with OpenAI Bearer tokens, Anthropic `x-api-key`, Anthropic OAuth, and Google `x-goog-api-key`.

Key format: prefix `nxn-sk-`, 20–128 characters total.
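The four authentication methods can be sketched in Python. The helper names below are illustrative only, not part of any naxxen SDK; they simply mirror the header and URL formats documented above.

```python
# Sketch of the four ways to attach a naxxen key to a request.
# Header names and URL shapes come from the docs; the function
# names and placeholder keys are illustrative assumptions.

NAXXEN_KEY = "nxn-sk-YOUR_KEY"
PROVIDER_KEY = "sk-YOUR_OPENAI_KEY"

def method1_headers():
    """Method 1: dedicated header plus the normal provider auth."""
    return {
        "X-Naxxen-Key": NAXXEN_KEY,
        "Authorization": f"Bearer {PROVIDER_KEY}",
    }

def method2_url(path="/v1/chat/completions"):
    """Method 2: key embedded as a URL path prefix."""
    return f"https://api.naxxen.ai/{NAXXEN_KEY}{path}"

def method3_headers():
    """Method 3: composite Bearer token, split on the colon server-side."""
    return {"Authorization": f"Bearer {NAXXEN_KEY}:{PROVIDER_KEY}"}

def split_composite(bearer_token):
    """The proxy's view of Method 3: separate the two halves of the token."""
    naxxen_key, _, provider_key = bearer_token.partition(":")
    return naxxen_key, provider_key
```

Method 4 (auth passthrough) needs no client-side helper at all: you send your provider's usual auth headers, and only the naxxen key (from method 1 or 2) is consumed by the proxy.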
Rate limits:

- Free tier: 60 requests/min, 1,000 requests/day
- Pro tier: 600 requests/min, unlimited daily

---

# Endpoints

OpenAI:

- `POST /v1/chat/completions`

Anthropic:

- `POST /v1/messages`

Google:

- `POST /v1/models/{model}:generateContent`
- `POST /v1/models/{model}:streamGenerateContent`
- `POST /v1/models/{model}:countTokens`
- `POST /v1beta/models/{model}:generateContent`
- `POST /v1beta/models/{model}:streamGenerateContent`
- `POST /v1beta/models/{model}:countTokens`

The provider is auto-detected from the path, headers, and body. Streaming is fully supported for all providers.

---

# Models

Tested models:

- OpenAI: gpt-4o-mini, gpt-4o, gpt-5.4-mini, gpt-5.4
- Anthropic: claude-haiku-4-5, claude-sonnet-4-6, claude-opus-4-6
- Google: gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.5-pro

Any model from these providers that uses the same endpoints should work.

---

# Compression

Per-key settings (configured in the dashboard):

- Compression toggle: on/off
- Rate: light (~30%), medium (~50%, default), aggressive (~65%)
- Minimum token threshold: default 200
- Skip code blocks: default on

What gets compressed: system prompts, chat history, tool descriptions.

What does NOT get compressed: code blocks, JSON/XML, images, audio, PDFs, the last user message, short text, thinking blocks.

---

# Integration: OpenClaw

Add providers to `~/.openclaw/openclaw.json` with `baseUrl` set to:

```
https://api.naxxen.ai/nxn-sk-YOUR_KEY
```

Example for OpenAI:

```json
{
  "name": "naxxen-openai",
  "baseUrl": "https://api.naxxen.ai/nxn-sk-YOUR_KEY",
  "apiKey": "sk-proj-...",
  "api": "openai-completions"
}
```

Use the same pattern for Anthropic (`"api": "anthropic-messages"`) and Google (`"api": "google-generative-ai"`).

---

# Integration: Claude Code

Set the environment variable:

```bash
export ANTHROPIC_BASE_URL="https://api.naxxen.ai/nxn-sk-YOUR_KEY"
```

Add it to `~/.bashrc` or `~/.zshrc`. Claude Code's OAuth authentication is forwarded unchanged.
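Looking back at the Compression section, the skip rules can be illustrated with a small Python sketch. This is not naxxen's actual implementation: the 4-characters-per-token estimate and every name here are assumptions made purely for illustration of the documented behavior (code blocks, JSON/XML, short text, and the last user message are never compressed).

```python
# Illustrative sketch of the documented skip rules. The token
# heuristic and function names are assumptions, not naxxen internals.

MIN_TOKENS = 200  # default per-key minimum token threshold from the docs

def estimate_tokens(text):
    """Very rough token estimate (~4 characters per token)."""
    return len(text) // 4

def is_compressible(text, is_last_user_message=False):
    """Apply the documented skip rules to a single text block."""
    if is_last_user_message:
        return False                      # last user message is never touched
    stripped = text.lstrip()
    if stripped.startswith("```"):
        return False                      # fenced code blocks are skipped
    if stripped.startswith(("{", "[", "<")):
        return False                      # JSON / XML payloads are skipped
    return estimate_tokens(text) >= MIN_TOKENS  # short text is skipped
```

Text that passes every check, such as a long system prompt, is a candidate for compression; everything else passes through verbatim.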
---

# Integration: Cursor

In Cursor Settings → Models:

- Base URL: `https://api.naxxen.ai/nxn-sk-YOUR_KEY`
- API Key: your provider API key

Or use the composite form:

- API Key: `nxn-sk-YOUR_KEY:sk-YOUR_PROVIDER_KEY`
- Base URL: `https://api.naxxen.ai`

---

# Integration: Opencode

Set provider base URLs in the config:

```json
{
  "providers": {
    "openai": {
      "apiKey": "...",
      "baseUrl": "https://api.naxxen.ai/nxn-sk-YOUR_KEY"
    }
  }
}
```

Or via environment variables:

```bash
export OPENAI_BASE_URL="https://api.naxxen.ai/nxn-sk-YOUR_KEY"
export ANTHROPIC_BASE_URL="https://api.naxxen.ai/nxn-sk-YOUR_KEY"
```

---

# Integration: BYO Endpoint (SDKs)

Python OpenAI SDK:

```python
from openai import OpenAI

client = OpenAI(api_key="sk-...", base_url="https://api.naxxen.ai/nxn-sk-YOUR_KEY/v1")
```

Python Anthropic SDK:

```python
import anthropic

client = anthropic.Anthropic(api_key="sk-ant-...", base_url="https://api.naxxen.ai/nxn-sk-YOUR_KEY")
```

Node.js OpenAI SDK:

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-...",
  baseURL: "https://api.naxxen.ai/nxn-sk-YOUR_KEY/v1",
});
```

Node.js Anthropic SDK:

```javascript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "sk-ant-...",
  baseURL: "https://api.naxxen.ai/nxn-sk-YOUR_KEY",
});
```

Alternatively, pass the naxxen key via the `X-Naxxen-Key` header using `default_headers` (Python) or `defaultHeaders` (Node.js).

---

# API Reference

Base URL: `https://api.naxxen.ai`

- `POST /*`: main proxy (provider auto-detected)
- `GET /health`: health check
- `POST /v1/compress`: standalone compression (text in, compressed text out)

`/v1/compress` request:

```json
{"text": "...", "rate": 0.5}
```

`/v1/compress` response:

```json
{"compressed_text": "...", "original_tokens": 500, "compressed_tokens": 250, "ratio": 0.5}
```

Error codes (naxxen-specific):

| Status | Code | Meaning |
|--------|------|---------|
| 400 | `unknown_provider` | Could not detect provider |
| 401 | `missing_key` | No naxxen key found |
| 401 | `invalid_key` | Key invalid or revoked |
| 429 | `rate_limited` | Rate limit exceeded |
| 500 | `compression_error` | Compression failed |
| 502 | `provider_unreachable` | Upstream provider down |

Provider errors (4xx/5xx from OpenAI/Anthropic/Google) are passed through unchanged.
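As a worked example of the standalone endpoint, the request for `POST /v1/compress` can be built with the Python standard library. The payload and response fields come from the reference above; the function names and the placeholder key are illustrative assumptions.

```python
# Build (but do not send) a request for the standalone /v1/compress
# endpoint using only the standard library. Function names here are
# illustrative, not part of any naxxen SDK.

import json
import urllib.request

def build_compress_request(text, rate=0.5, naxxen_key="nxn-sk-YOUR_KEY"):
    """Construct an urllib Request for POST /v1/compress."""
    body = json.dumps({"text": text, "rate": rate}).encode()
    return urllib.request.Request(
        "https://api.naxxen.ai/v1/compress",
        data=body,
        headers={
            "X-Naxxen-Key": naxxen_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )

def savings(response_body):
    """Tokens saved, given a parsed /v1/compress response dict."""
    return response_body["original_tokens"] - response_body["compressed_tokens"]
```

Send the request with `urllib.request.urlopen(build_compress_request("..."))` and parse the body with `json.load`; the `ratio` field should match `compressed_tokens / original_tokens`.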