CORE FEATURE

TOON Compression

Reduces the number of tokens in your requests by compressing JSON payloads. Fewer tokens means lower cost.

How It Works

TOON stands for Token Optimization & Output Normalization. The gateway rewrites verbose JSON into a compact text notation before the request reaches the LLM.

01

Your JSON Comes In

Your app sends a request with a normal JSON payload. Nothing changes on your side.

02

TOON Compresses It

The gateway converts your JSON into a compact notation, stripping out redundant structure such as repeated keys, braces, and indentation.

03

Fewer Tokens, Lower Cost

The compressed payload uses fewer tokens when sent to the LLM, so you pay less for the same work.

Before vs After

Same data, fewer tokens. Here's what a typical JSON payload looks like before and after TOON compression.

Before: Raw JSON

Verbose, lots of structural tokens

{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Summarize this document..."
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1024
}
~450 tokens

After: TOON Compressed

Compact notation, same meaning

<TOON:compressed>
  sys:"You are a helpful assistant."
  usr:"Summarize this document..."
  t:0.7 max:1024
</TOON>
~320 tokens (29% reduction)
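
As a rough illustration of the transformation above, the following Python sketch rewrites the "Before" request into the "After" notation. The actual TOON encoder is not specified on this page, so treat this as a guess at the shape of the idea: the function name `to_compact` is hypothetical, and only the `sys`, `usr`, `t`, and `max` abbreviations come from the example (the `assistant` mapping is an assumption).

```python
import json

# Abbreviations taken from the example above; "assistant" -> "ast" is an assumption.
ROLE_TAGS = {"system": "sys", "user": "usr", "assistant": "ast"}
PARAM_TAGS = {"temperature": "t", "max_tokens": "max"}


def to_compact(request: dict) -> str:
    """Render a chat request in the compact notation shown above (illustrative only)."""
    lines = ["<TOON:compressed>"]
    for message in request.get("messages", []):
        tag = ROLE_TAGS.get(message["role"], message["role"])
        # json.dumps quotes and escapes the content string.
        lines.append(f"  {tag}:{json.dumps(message['content'])}")
    params = [
        f"{tag}:{request[key]}" for key, tag in PARAM_TAGS.items() if key in request
    ]
    if params:
        lines.append("  " + " ".join(params))
    lines.append("</TOON>")
    return "\n".join(lines)


request = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize this document..."},
    ],
    "temperature": 0.7,
    "max_tokens": 1024,
}
print(to_compact(request))
```

The savings in this sketch come from dropping repeated keys (`"role"`, `"content"`), braces, and indentation while keeping the message text itself untouched.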

Why It Matters

Token Costs Add Up

At scale, even small reductions save real money. A 20-30% cut across millions of requests adds up fast.
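
To make the arithmetic concrete, here is a small sketch that projects savings from a given reduction. The 450 and 320 token counts are the ones from the Before vs After example; the request volume and the price per million tokens are placeholder values, not real pricing.

```python
def reduction(tokens_before: int, tokens_after: int) -> float:
    """Fraction of tokens removed by compression."""
    return (tokens_before - tokens_after) / tokens_before


def projected_savings(tokens_before, tokens_after, requests, price_per_million):
    """Money saved over `requests` calls at a flat input-token price."""
    saved_tokens = (tokens_before - tokens_after) * requests
    return saved_tokens / 1_000_000 * price_per_million


print(f"{reduction(450, 320):.0%}")  # 29%
# Placeholder volume and price, chosen only to show the calculation.
print(projected_savings(450, 320, requests=5_000_000, price_per_million=2.00))  # 1300.0
```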

Per-Project Toggle

Turn it on for projects where it helps. Leave it off where you need raw JSON. You decide, per project.
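
One way to picture a per-project toggle is a settings lookup that the gateway consults before forwarding a request. This is a sketch, not the gateway's actual configuration format: the project IDs, the `toon_enabled` key, and `prepare_payload` are all hypothetical, and whitespace-free JSON stands in for the real encoder.

```python
import json

# Hypothetical per-project settings, for illustration only.
PROJECT_SETTINGS = {
    "support-bot": {"toon_enabled": True},
    "audit-export": {"toon_enabled": False},  # this project needs raw JSON
}


def compress(payload: dict) -> str:
    """Stand-in for the real encoder: drops whitespace between JSON tokens."""
    return json.dumps(payload, separators=(",", ":"))


def prepare_payload(project_id: str, payload: dict) -> str:
    """Return the text forwarded to the LLM for this project."""
    settings = PROJECT_SETTINGS.get(project_id, {})
    if settings.get("toon_enabled", False):
        return compress(payload)
    return json.dumps(payload, indent=2)  # passed through as ordinary JSON


payload = {"temperature": 0.7, "max_tokens": 1024}
print(prepare_payload("support-bot", payload))
print(prepare_payload("audit-export", payload))
```

Defaulting unknown projects to "off" in this sketch means compression only applies where someone has explicitly enabled it.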

Transparent

Your app doesn't change. Compression happens at the gateway level. Your code sends normal JSON, same as always.

Metrics Included

See compression ratio and token savings per project in your dashboard. No guessing, just real numbers.
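
Both metrics can be derived from per-request token counts. The sketch below assumes that compression ratio means tokens after divided by tokens before, which this page does not define; the record layout, project names, and sample values are invented for illustration.

```python
from collections import defaultdict

# (project, tokens_before, tokens_after) per request; sample values are invented.
records = [
    ("support-bot", 450, 320),
    ("support-bot", 600, 450),
    ("search", 200, 160),
]


def summarize(rows):
    """Aggregate token savings and compression ratio per project."""
    totals = defaultdict(lambda: [0, 0])
    for project, before, after in rows:
        totals[project][0] += before
        totals[project][1] += after
    return {
        project: {
            "tokens_saved": before - after,
            "compression_ratio": after / before,  # assumed definition
        }
        for project, (before, after) in totals.items()
    }


for project, stats in summarize(records).items():
    print(project, stats["tokens_saved"], f"{stats['compression_ratio']:.2f}")
```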

~20-30% Token Reduction

On typical JSON payloads

Per-Project Toggle

Enable where it helps

Zero Code Changes

Works at the gateway level

Built-In Dashboard Metrics

Compression ratio & savings

Try TOON Compression

Toggle it on for any project. Your app sends the same requests, and the gateway handles the rest.

Frequently Asked Questions