Billing Modes

Quota supports three billing modes: user pays (the default), developer billing, and hosted token storage.

User Pays (Default)

In user-pays mode, your users add funds to their own Quota wallet and pay for their own AI usage. You set a markup percentage on your app and earn revenue on every request. This is the default and recommended mode.

Users purchase balance in tiered dollar packages ($5 to $50)
Balance is universal: it works across all Quota apps
You set a markup percentage and keep 100% of it (no platform fee)
Payouts go through Stripe Connect (daily, with a 7-day delay)

How it works

// Users buy balance via Stripe checkout (tiered packages)
// Then make AI requests billed to their balance:
await fetch("https://api.usequota.ai/v1/users/user_123/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-quota-your-api-key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-4o",
    messages: [{role: "user", content: "Hello!"}],
  }),
});
// Cost deducted from user's dollar balance (base + your markup)
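
The "base + your markup" arithmetic can be sketched as follows. This is an illustration only: it assumes the markup is a simple percentage of the base cost and that amounts round to the nearest micro-dollar, neither of which is confirmed here. The function name `priceRequest` and the micro-dollar unit are hypothetical, not part of the Quota API.

```javascript
// Assumption: markup is a percentage applied to the base (provider) cost.
// Amounts are integer micro-dollars (1 USD = 1,000,000) to avoid floating-point drift.
function priceRequest(baseCostMicros, markupPercent) {
  if (!Number.isInteger(baseCostMicros) || baseCostMicros < 0) {
    throw new RangeError("baseCostMicros must be a non-negative integer");
  }
  if (!Number.isFinite(markupPercent) || markupPercent < 0) {
    throw new RangeError("markupPercent must be a non-negative number");
  }
  // Rounding rule is an assumption; the real billing system may round differently.
  const markupMicros = Math.round((baseCostMicros * markupPercent) / 100);
  return {
    baseCostMicros,
    markupMicros, // the developer's share under this assumption
    userChargeMicros: baseCostMicros + markupMicros, // deducted from the user's balance
  };
}

// A request with a $0.01 base cost and a 20% markup:
const quote = priceRequest(10_000, 20);
console.log(quote.userChargeMicros / 1e6); // 0.012
```

Under these assumptions the user is charged $0.012, of which $0.002 is the markup.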

Developer Grants

Grants are a feature within the user-pays model. You fund a grant pool and distribute free balance to new users as a promotional onboarding mechanism:

// Grant $0.50 of free balance to a new user
await fetch("https://api.usequota.ai/v1/funding/grant", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-quota-your-api-key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    external_user_id: "user_123",
    amount: 0.50,
    description: "Welcome bonus",
  }),
});

Once the grant is used up, users purchase their own balance to continue.
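
The grant call above does not inspect the response. A minimal sketch of a wrapper that validates the amount and surfaces failed requests might look like this. The helper names `buildGrantRequest` and `grantBalance` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```javascript
// Hypothetical helper: builds the grant request shown above and validates the amount.
function buildGrantRequest(apiKey, externalUserId, amountUsd, description) {
  if (!Number.isFinite(amountUsd) || amountUsd <= 0) {
    throw new RangeError("amountUsd must be a positive number");
  }
  return {
    url: "https://api.usequota.ai/v1/funding/grant",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        external_user_id: externalUserId,
        amount: amountUsd,
        description,
      }),
    },
  };
}

// Sends the grant and throws on non-2xx responses instead of ignoring them.
// fetchImpl is injectable so the function can be exercised without network access.
// Assumption: the endpoint responds with JSON.
async function grantBalance(apiKey, externalUserId, amountUsd, description, fetchImpl = fetch) {
  const { url, init } = buildGrantRequest(apiKey, externalUserId, amountUsd, description);
  const response = await fetchImpl(url, init);
  if (!response.ok) {
    throw new Error(`Grant failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

A call such as `await grantBalance("sk-quota-your-api-key", "user_123", 0.5, "Welcome bonus")` would then reject with an error if the grant is not accepted.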

Developer Billing

In developer billing mode, all AI usage is charged to your developer account. This works well for:

Internal tools
Fixed-price products
Freemium apps where you absorb AI costs

How it works

import OpenAI from "openai";

// All requests use your API key
const client = new OpenAI({
  baseURL: "https://api.usequota.ai/v1",
  apiKey: "sk-quota-your-api-key",
});

// Cost deducted from YOUR balance
await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

Hosted Token Storage

In hosted token storage mode, Quota stores OAuth tokens server-side and your app identifies users with the X-Quota-User header. You never handle tokens directly. This works well for:

Server-rendered apps (Next.js, Nuxt, SvelteKit, etc.)
Apps that want user billing without managing tokens
Quick prototypes that need user-level billing

How it works

// After the user connects via OAuth, Quota stores the token link.
// Your app just sends X-Quota-User with your API key:
await fetch("https://api.usequota.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-quota-your-api-key",
    "X-Quota-User": "user_123",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-4o-mini",
    messages: [{role: "user", content: "Hello!"}],
  }),
});
// Cost deducted from the linked USER's dollar balance

See Hosted Users API for details, Core SDK for a framework-agnostic setup, or Next.js SDK for a turnkey Next.js setup.

Choosing a Mode

Pick the billing mode that matches your product and revenue model.

User Pays (Default)

Users buy their own balance via tiered dollar packages. You earn revenue through markup. Best for:

Consumer apps where users pay for AI usage
Marketplaces and plugin ecosystems
Any app where you want to earn revenue from AI usage

Developer Billing

You pay for all AI usage from your developer balance. Best for:

Internal tools and admin dashboards
Free-tier features where you absorb costs
Fixed-price subscriptions with predictable usage

User Billing (OAuth)

Users connect their own Quota wallet via OAuth and pay from their own balance. Best for:

Consumer apps with "bring your own wallet" models
Platforms where users control their own spending
Third-party integrations and plugin ecosystems

Comparison

Consideration	User Pays (Default)	Developer Billing	User Billing (OAuth)
Setup complexity	Simple	Simple	OAuth flow required
Who pays	End user (dollar packages)	Developer	End user (self-funded)
Developer revenue	100% of markup via Stripe Connect	None	100% of markup via Stripe Connect
Revenue model	Per-use markup	Subscription or free	Per-use markup
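
The three request shapes in the "How it works" examples on this page differ only in the URL and one header. The sketch below gathers them into one helper to make that difference explicit. The function name `buildChatRequest` and the mode labels `"user-pays"`, `"developer"`, and `"hosted"` are hypothetical names chosen for illustration, not Quota identifiers.

```javascript
const QUOTA_API_BASE = "https://api.usequota.ai/v1";

// Hypothetical helper: maps a billing mode to the URL and headers
// used in the corresponding example on this page.
function buildChatRequest({ mode, apiKey, userId, model, messages }) {
  const headers = {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
  let url = `${QUOTA_API_BASE}/chat/completions`;

  switch (mode) {
    case "user-pays":
      // The user ID is part of the path.
      if (!userId) throw new Error("userId is required for user-pays mode");
      url = `${QUOTA_API_BASE}/users/${encodeURIComponent(userId)}/chat/completions`;
      break;
    case "developer":
      // No user identification: usage is charged to the developer account.
      break;
    case "hosted":
      // The user is identified by header; Quota resolves the stored token link.
      if (!userId) throw new Error("userId is required for hosted mode");
      headers["X-Quota-User"] = userId;
      break;
    default:
      throw new Error(`Unknown billing mode: ${mode}`);
  }

  return {
    url,
    init: { method: "POST", headers, body: JSON.stringify({ model, messages }) },
  };
}
```

The result can be passed straight to `fetch(url, init)`. Keeping the mode in one place makes it easier to switch modes later without touching call sites.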

Rate Limits

All billing modes share the same rate limits. The default is 100 requests/minute per API key. Auth endpoints use stricter limits (3 to 5 requests/minute). See Authentication for rate limit headers and details.
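
To stay under the default limit, an app can throttle itself before sending requests. The sketch below is a client-side sliding-window limiter. It is not part of any Quota SDK, the class name `SlidingWindowLimiter` is hypothetical, and the server may count requests with a different windowing scheme, so treat it as a guard rather than a guarantee.

```javascript
// Client-side sliding-window limiter: allows at most `limit` calls per `windowMs`.
// Defaults mirror the documented 100 requests/minute per API key.
class SlidingWindowLimiter {
  constructor(limit = 100, windowMs = 60_000, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, useful for testing
    this.timestamps = []; // send times of calls still inside the window, oldest first
  }

  // Drops timestamps that have left the window.
  prune(currentTime) {
    while (this.timestamps.length > 0 && currentTime - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
  }

  // Returns true and records the call if it fits in the window, false otherwise.
  tryAcquire() {
    const currentTime = this.now();
    this.prune(currentTime);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(currentTime);
    return true;
  }

  // Milliseconds until the next slot frees up (0 if one is free now).
  msUntilAvailable() {
    const currentTime = this.now();
    this.prune(currentTime);
    if (this.timestamps.length < this.limit) return 0;
    return this.timestamps[0] + this.windowMs - currentTime;
  }
}
```

Before each request, call `tryAcquire()`. If it returns `false`, wait `msUntilAvailable()` milliseconds and try again.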