BYOK billing: why we charge $19.99 when competitors charge $200

There's a part of AI-orchestration pricing that vendors would rather you didn't think about.

When you pay $200 a month for a multi-agent platform, you are not paying $200 for the software. You're paying somewhere between $10 and $40 for the orchestration, and another $160 to $190 to have the vendor resell you OpenAI and Anthropic tokens at a markup.

They buy the tokens at the providers' wholesale rate. You pay the retail rate baked into your subscription. The gap is their margin, and it's enormous, because tokens are where the real cost of a multi-agent system actually lives.

That model struck us as broken, so we built a different one.

What BYOK actually means

BYOK is short for bring your own key. Instead of NEXUS buying tokens from OpenAI, Anthropic, and Google and reselling them to you, you plug in your own API keys, issued to you directly by the providers, billed directly to your card by the providers.

When one of your agents calls GPT-4, the cost lands on your OpenAI invoice, not ours. When another calls Claude Opus, it's on your Anthropic bill. A third hits Gemini, and Google charges you. We never touch your token spend. We don't mark it up, and we don't meter it.
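To make the billing path concrete, here is a minimal sketch of how a BYOK setup can pick the customer's own key for each model call. It is an illustration, not NEXUS's actual code: the environment variable names, the model-name prefixes, and the `provider_for` and `key_for` helpers are all assumptions made for the example.

```python
import os

# Hypothetical mapping: provider -> environment variable holding the
# customer's own key (names are assumptions for this sketch).
PROVIDER_KEY_ENV = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}

# Hypothetical model-name prefixes used to decide which provider bills the call.
MODEL_PREFIX_TO_PROVIDER = {
    "gpt-": "openai",
    "claude-": "anthropic",
    "gemini-": "google",
}


def provider_for(model: str) -> str:
    """Return the provider whose invoice a call to `model` lands on."""
    for prefix, provider in MODEL_PREFIX_TO_PROVIDER.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"no provider configured for model {model!r}")


def key_for(model: str, env=os.environ) -> str:
    """Return the customer's own API key for `model`, or fail loudly."""
    var = PROVIDER_KEY_ENV[provider_for(model)]
    key = env.get(var)
    if not key:
        raise KeyError(f"{var} is not set; premium models need your own key")
    return key


# Demo with an explicit dict so the sketch runs without real keys.
demo_env = {"OPENAI_API_KEY": "sk-demo", "ANTHROPIC_API_KEY": "sk-ant-demo"}
print(provider_for("claude-opus"), key_for("gpt-4", demo_env))
```

The point of the design is that the orchestrator only ever looks up a key the customer supplied, so the provider bills the customer directly for every call.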

Our subscription covers the orchestration software and the server fleet that runs the coordination layer (the council, the memory, the routing) and serves the free-tier models.

The math that makes $19.99 work

Take a heavy month of real multi-agent work: 200 orchestrated tasks, around six agents per task, roughly 15,000 input tokens and 3,000 output tokens per agent call. That comes to about 18 million input tokens and 3.6 million output tokens over the month, or 21.6 million tokens in total.
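The volume is easy to check yourself. The sketch below multiplies out the workload figures, assuming each agent makes exactly one call per task (a simplification; the function and constant names are ours for illustration).

```python
# Workload figures from the heavy-month example.
TASKS_PER_MONTH = 200
AGENTS_PER_TASK = 6
INPUT_TOKENS_PER_CALL = 15_000
OUTPUT_TOKENS_PER_CALL = 3_000


def monthly_tokens(tasks, agents_per_task, input_per_call, output_per_call):
    """Return (input_tokens, output_tokens), assuming one call per agent per task."""
    calls = tasks * agents_per_task
    return calls * input_per_call, calls * output_per_call


input_tokens, output_tokens = monthly_tokens(
    TASKS_PER_MONTH, AGENTS_PER_TASK, INPUT_TOKENS_PER_CALL, OUTPUT_TOKENS_PER_CALL
)
print(f"input:  {input_tokens:,}")                   # input:  18,000,000
print(f"output: {output_tokens:,}")                  # output: 3,600,000
print(f"total:  {input_tokens + output_tokens:,}")   # total:  21,600,000
```

Multiply the input and output totals by your providers' current per-token rates to estimate what the month would cost on your own keys.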

On the flat-rate model, the vendor has to price at $200 or more just to cover those API fees and still make margin. They don't know your usage in advance, so they price for the heavy user and quietly overcharge the light one. Everyone loses except the vendor.

On BYOK, the same month costs $19.99 for NEXUS plus whatever the providers bill you directly for those tokens.

Same work, same 100-plus agents, same orchestration quality, and 40 to 60 percent cheaper, with every cost visible on your own provider invoices.

Run a lighter month, say 50 tasks instead of 200, and your API bill drops with it. On flat-rate SaaS you'd still pay $200. On NEXUS you'd pay $19.99 plus maybe $30 in usage, about $50 all in. For a light month, that's roughly a fourfold difference.
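You can run the same comparison with your own numbers. This sketch uses only the figures quoted in this post ($200 flat rate, $19.99 subscription, about $30 of light-month usage); the function names are illustrative, and your actual API spend depends on your providers' rates.

```python
FLAT_RATE_USD = 200.00          # competitor price quoted in this post
NEXUS_SUBSCRIPTION_USD = 19.99  # NEXUS price quoted in this post


def byok_total(api_usage_usd, subscription=NEXUS_SUBSCRIPTION_USD):
    """Total monthly cost on BYOK: subscription plus your own provider bill."""
    return round(subscription + api_usage_usd, 2)


def flat_to_byok_ratio(api_usage_usd, flat_rate=FLAT_RATE_USD):
    """How many times more the flat-rate plan costs for the same month."""
    return flat_rate / byok_total(api_usage_usd)


def break_even_usage(flat_rate=FLAT_RATE_USD, subscription=NEXUS_SUBSCRIPTION_USD):
    """API spend at which BYOK costs the same as the flat-rate plan."""
    return round(flat_rate - subscription, 2)


light_month_usage = 30.00  # the "maybe $30" estimate above
print(f"BYOK total: ${byok_total(light_month_usage):.2f}")            # BYOK total: $49.99
print(f"flat-rate is {flat_to_byok_ratio(light_month_usage):.1f}x")   # flat-rate is 4.0x
print(f"break-even usage: ${break_even_usage():.2f}")                 # break-even usage: $180.01
```

As long as your monthly provider bill stays under the break-even figure, BYOK comes out cheaper than the $200 plan.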

Why the flat-rate model exists at all

Because it's easier to sell. "Pay us $200 and forget about APIs" is a clean pitch. It feels predictable, it hides the vendor's cost structure, and it lets them charge whatever the market will bear.

The catch is that the predictability is fake. The vendor still meters you behind the curtain: rate limits, "fair use" clauses, surprise overage charges when you cross a threshold you were never shown. You thought you bought unlimited. You bought a ceiling with a hidden ladder.

BYOK turns that inside out. You already have a relationship with the provider. You already get the invoices. You already know what a token costs. You just connect that relationship to NEXUS and put orchestration on top of it.

What cheaper orchestration changes

When the orchestration layer costs $20 instead of $200, the way you use it changes too.

You can run several workflows at once

A $200 tool forces you to pick one use case to justify the spend. A $20 tool lets you keep orchestrations running for writing, research, legal review, financial analysis, and customer work in parallel, all on one subscription.

You can experiment without committing

Want to test whether multi-agent actually beats single-agent for your particular task? On flat-rate, you're locked into the full subscription just to find out. On BYOK, you pay $20 for a month and whatever your experiments consume. If they burn $15 of tokens, that's the whole cost.

You can give it to the whole team

A $20-per-seat tool is something every knowledge worker can have. A $200-per-seat tool is an executive line item. We built NEXUS to be the first kind.

Your spend caps stay yours

OpenAI, Anthropic, and Google all let you set monthly spend limits on your keys. Under BYOK, those caps protect you, directly. You set the ceiling, and no vendor gets to move it.

What we actually charge for

Three things, and only three:

  1. The orchestration engine. The hierarchical planner, the PM agents, the council, the scoped-context routing, the shared-memory layer. That's what $19.99 buys.
  2. The coordination fleet. Five owned Ollama servers handle council debate and free-tier model calls. They cost us about $15,000 a month regardless of how many customers we have, and subscription revenue covers that bill.
  3. Free models for everyone. Even Eco users get the owned models with no API key at all. You only need your own keys to call the premium models: GPT-4, Claude Opus, Gemini Pro, and the rest.
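As a rough sanity check on that fixed fleet bill, the sketch below estimates how many subscriptions it takes to cover $15,000 a month. It assumes every subscriber pays $19.99 and ignores every other cost and tier, so treat it as back-of-the-envelope arithmetic rather than our actual financial model.

```python
import math

FLEET_COST_USD = 15_000.00     # monthly fleet cost quoted above
SUBSCRIPTION_USD = 19.99       # assumption: every subscriber pays this price


def break_even_subscribers(fixed_cost, price):
    """Smallest whole number of subscribers whose fees cover the fixed cost."""
    return math.ceil(fixed_cost / price)


print(break_even_subscribers(FLEET_COST_USD, SUBSCRIPTION_USD))  # 751
```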

There are three tiers. At every one, your API spend goes to your API provider. We never touch it.

The honest version of the pitch

A flat-rate competitor will always be able to put one tidy number on a pricing page. That number will always include a markup on tokens you could buy cheaper yourself. We'd rather you pay us for the orchestration, pay the provider for the models, and keep both relationships transparent. You get a better deal. We only win if the software is genuinely worth it. And nobody's hiding anything.

That's why it's $19.99 and not $200.


Next: "Specialist vs generalist agents: why 100 narrow experts beat one big brain", how narrow-scope agents outperform general-purpose ones on real work, and why we bet the architecture on it.
