What is naxxen?
naxxen is a platform for AI Agent & LLM tools: compress prompts, save tokens, expand context windows.
The first product is Compress — a drop-in LLM proxy that compresses your prompts by 30–60%, transparently.
More cognitive orchestration features are coming.
How it works
- You swap your LLM provider's base URL for `api.naxxen.ai`
- naxxen intercepts the request and compresses compressible text (system prompts, chat history, tool descriptions)
- The compressed request is forwarded to the real provider (OpenAI, Anthropic, Google)
- The response comes back to you unchanged
Zero code changes. Your API keys, your models, your parameters — all unchanged. You just pay for fewer tokens.
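The steps above can be sketched as a plain HTTP request. This is a minimal illustration, assuming the standard OpenAI chat-completions request shape; the exact path under `api.naxxen.ai` and the payload contents are illustrative, not taken from naxxen's documentation:

```python
import json
import urllib.request

# The only change versus calling the provider directly is the base URL.
BASE_URL = "https://api.naxxen.ai/v1"  # instead of https://api.openai.com/v1

payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize our refund policy."},
    ],
}

req = urllib.request.Request(
    BASE_URL + "/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer sk-...",  # your own provider API key, unchanged
        "Content-Type": "application/json",
    },
)

# urllib.request.urlopen(req) would send it; naxxen compresses the
# compressible parts, forwards the request to the real provider, and
# returns the provider's response unchanged.
print(req.full_url)
```

Everything except the hostname is exactly what you would send to the provider directly, which is why no other code changes are needed.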
Why compress?
| Benefit | How |
|---|---|
| Save on token costs | 30–60% fewer input tokens billed by the provider |
| Expand context windows | Fit more conversation history and instructions into the same window |
| Faster responses | Fewer tokens in = lower time-to-first-token |
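To make the first row concrete, here is a back-of-the-envelope savings calculation. The price and monthly volume are invented for illustration, not provider or naxxen quotes, and the 45% ratio is just the midpoint of the 30–60% range:

```python
# Hypothetical numbers for illustration only.
price_per_million = 2.50            # $ per 1M input tokens (made up)
monthly_input_tokens = 500_000_000  # 500M input tokens per month (made up)

baseline = monthly_input_tokens / 1_000_000 * price_per_million
compressed = baseline * (1 - 0.45)  # midpoint of the 30-60% range

print(baseline, compressed)  # 1250.0 687.5
```

At these assumed numbers, input-token spend drops from $1,250 to about $687 per month.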
What naxxen does NOT do
- naxxen never stores or logs your provider API keys
- naxxen never modifies the LLM's response
- naxxen never touches code blocks, JSON, images, or your last message
- If there's nothing to compress, the request passes through with zero overhead
Supported providers
- OpenAI — GPT-4o, GPT-5.4, o3, o4, and all `/v1/chat/completions` models
- Anthropic — Claude Opus, Sonnet, Haiku via `/v1/messages`
- Google — Gemini 2.5 Pro, Flash, Flash Lite via `generateContent`
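For example, an Anthropic request goes through unchanged apart from the host. This is a sketch assuming naxxen mirrors Anthropic's `/v1/messages` path; the model ID and headers follow Anthropic's API, and your Anthropic key is passed through:

```shell
curl https://api.naxxen.ai/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```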
Next steps
- Quickstart — get running in 3 minutes
- Authentication — 4 ways to pass your naxxen key
- Integrations — set up your coding agent