If you’ve worked with APIs, microservices, or AI systems in the last decade, JSON needs no introduction.
It’s everywhere:
- REST APIs
- Configuration files
- Tool schemas
- Prompt payloads
- Model responses
JSON is human-readable, flexible, and familiar.
But in an AI-first world, JSON has quietly become a cost problem.
As LLM usage grows and token-based pricing becomes the norm, the format you use to represent data suddenly matters a lot more than it used to.
This article explains:
- Why JSON is inefficient for AI
- Why context is the real cost driver
- How TOON (Token-Optimized Object Notation) works
- How switching to TOON can significantly reduce token usage and AI costs
The Hidden Cost of JSON in AI Systems
Modern AI systems don’t just process logic — they process context.
Every LLM request carries:
- Instructions
- Memory
- Tool schemas
- Structured data
- Conversation history
All of this is billed in tokens.
Now look at a typical JSON payload:
{
  "context": {
    "task": "Our favorite hikes together",
    "location": "Boulder",
    "season": "spring_2025"
  },
  "friends": ["ana", "luis", "sam"],
  "hikes": [
    {
      "id": 1,
      "name": "Blue Lake Trail",
      "distanceKm": 7.5,
      "elevationGain": 320,
      "companion": "ana",
      "wasSunny": true
    },
    {
      "id": 2,
      "name": "Ridge Overlook",
      "distanceKm": 9.2,
      "elevationGain": 540,
      "companion": "luis",
      "wasSunny": false
    },
    {
      "id": 3,
      "name": "Wildflower Loop",
      "distanceKm": 5.1,
      "elevationGain": 180,
      "companion": "sam",
      "wasSunny": true
    }
  ]
}
For humans, this is clean and readable.
For an LLM, this is expensive noise.
Quotes, braces, commas, repeated keys — none of these add semantic value to the model. They only add tokens, and tokens cost money.
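You can measure this overhead directly. The stdlib-only sketch below serializes a trimmed version of the example payload and counts how many characters exist purely to express JSON structure rather than data:

```python
import json

# Trimmed version of the example payload above.
payload = {
    "friends": ["ana", "luis", "sam"],
    "hikes": [
        {"id": 1, "name": "Blue Lake Trail", "distanceKm": 7.5,
         "elevationGain": 320, "companion": "ana", "wasSunny": True},
        {"id": 2, "name": "Ridge Overlook", "distanceKm": 9.2,
         "elevationGain": 540, "companion": "luis", "wasSunny": False},
    ],
}

text = json.dumps(payload)
# Count the characters that exist only to express JSON structure.
structural = sum(text.count(c) for c in '{}[]",:')
print(f"{len(text)} chars total, {structural} structural "
      f"({structural / len(text):.0%})")
```

On this payload, roughly a quarter to a third of the characters are pure syntax. For exact token counts rather than character counts, run the same string through your provider's tokenizer (for example, tiktoken for OpenAI models).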
AI Is Getting Smarter — and More Expensive
Token pricing may look small per request, but at scale it compounds fast:
- Multi-turn conversations
- Long system prompts
- Tool-calling schemas
- Agent memory
- Retrieval-augmented generation (RAG)
Multiply that by:
- Thousands of users
- Multiple background agents
- Persistent memory injection
Your biggest AI cost is not inference — it’s context.
And JSON is one of the least efficient ways to represent context.
Why Context Efficiency Matters More Than Model Size
LLMs do not “understand” JSON structurally like humans do. They tokenize it.
That means:
- {, }, ", and : all become tokens
- Repeated keys are paid for repeatedly
- Structure consumes context window space
As prompts grow:
- Important information gets truncated
- Costs rise
- Model performance degrades
So the real optimization problem isn’t just which model you use — it’s how efficiently you speak to it.
Enter TOON: A Format Designed for AI, Not Humans
TOON (Token-Optimized Object Notation) flips the design philosophy.
Instead of:
Human-first, machine-second
It is:
AI-first, human-readable enough
The same data in TOON looks like this:
context:
  task: Our favorite hikes together
  location: Boulder
  season: spring_2025
friends[3]: ana,luis,sam
hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:
  1,Blue Lake Trail,7.5,320,ana,true
  2,Ridge Overlook,9.2,540,luis,false
  3,Wildflower Loop,5.1,180,sam,true
What changed:
- No quotes
- No braces
- Commas only between row values
- Minimal punctuation
- High semantic density
Same meaning. Far fewer tokens.
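The tabular form is mechanical enough to generate in a few lines. Here is a minimal encoder sketch (a hypothetical helper, not the official TOON library; it assumes every row has the same keys and no value contains a comma or newline):

```python
def to_toon_table(name, rows):
    """Encode a uniform list of dicts as a TOON-style tabular block.

    Sketch only: assumes identical keys per row and values free of
    commas/newlines (real encoders handle quoting and escaping).
    """
    fields = list(rows[0])
    header = f"{name}[{len(rows)}]{{{','.join(fields)}}}:"

    def fmt(value):
        # Booleans become lowercase literals, everything else uses str().
        return str(value).lower() if isinstance(value, bool) else str(value)

    lines = ["  " + ",".join(fmt(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + lines)

hikes = [
    {"id": 1, "name": "Blue Lake Trail", "distanceKm": 7.5,
     "elevationGain": 320, "companion": "ana", "wasSunny": True},
    {"id": 2, "name": "Ridge Overlook", "distanceKm": 9.2,
     "elevationGain": 540, "companion": "luis", "wasSunny": False},
]
print(to_toon_table("hikes", hikes))
```

The schema is emitted once in the header, and each subsequent line is a bare row of values.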
Why This TOON Style Is So Powerful
1. Schema Is Defined Once
hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}
- No repeated keys
- No ambiguity
- No wasted tokens
2. Rows Are Pure Meaning
1,Blue Lake Trail,7.5,320,ana,true
Every token contributes information.
3. Metadata, Schema, and Data Are Separated
- context → meaning
- {} → schema
- rows → facts
LLMs love this separation.
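The separation is also easy to exploit in ordinary code: the header line alone carries the full schema. A small parsing sketch, assuming the tabular header shape used above (real parsers would also handle nesting, quoting, and alternative delimiters):

```python
import re

def parse_toon_header(line):
    """Split a TOON tabular header like 'hikes[3]{id,name}:' into
    (name, row_count, field_names). Illustrative sketch only."""
    m = re.match(r"(\w+)\[(\d+)\]\{([^}]*)\}:\s*$", line.strip())
    if not m:
        raise ValueError(f"not a TOON tabular header: {line!r}")
    return m.group(1), int(m.group(2)), m.group(3).split(",")

name, count, fields = parse_toon_header(
    "hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:")
print(name, count, fields)
```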
Token Count Comparison (Approximate)
A useful rule of thumb:
Tokens ≈ characters / 4
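That rule of thumb is trivial to encode. The comparison below applies it to a single JSON object versus the equivalent TOON row (actual counts vary by tokenizer; for exact numbers, use your model's tokenizer, such as tiktoken):

```python
def approx_tokens(text: str) -> int:
    # Heuristic from the rule of thumb above: roughly 4 chars per token.
    return max(1, len(text) // 4)

json_row = ('{"id": 1, "name": "Blue Lake Trail", "distanceKm": 7.5, '
            '"elevationGain": 320, "companion": "ana", "wasSunny": true}')
toon_row = "1,Blue Lake Trail,7.5,320,ana,true"

print(approx_tokens(json_row), approx_tokens(toon_row))
```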
The Example Payload Above
- JSON: ~240–260 tokens
- TOON: ~90–100 tokens
≈ 60–65% reduction
Now imagine:
- 100 chat messages
- 50 tool results
- 200 memory entries
This is where TOON starts saving real money.
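A back-of-envelope calculation shows the scale effect. Every number below is a hypothetical assumption (your request volume, payload size, and per-token price will differ):

```python
# All figures are hypothetical, for illustration only.
REQUESTS_PER_DAY = 10_000
PRICE_PER_MTOK = 3.00                 # USD per million input tokens (example rate)
JSON_TOKENS, TOON_TOKENS = 250, 95    # structured-data tokens per request

def monthly_cost(tokens_per_request: int) -> float:
    """30-day input-token cost for the structured part of each request."""
    return tokens_per_request * REQUESTS_PER_DAY * 30 * PRICE_PER_MTOK / 1_000_000

print(f"JSON: ${monthly_cost(JSON_TOKENS):.2f}/month")
print(f"TOON: ${monthly_cost(TOON_TOKENS):.2f}/month")
```

Under these assumptions the structured-data portion alone drops from $225 to $85.50 per month, and the gap grows linearly with traffic.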
Why LLMs Handle TOON Better
LLMs excel at:
- Pattern recognition
- Positional inference
- Implicit structure
They don’t need:
- Re-reading the same keys
- Spending context on punctuation
- Losing important info due to truncation
…repeated hundreds of times.
Less noise = more signal.
