Billing Modes
Quota supports three billing modes: user-pays (default), developer billing, and hosted token storage.
User Pays (Default)
In user-pays mode, your users add funds to their own Quota wallet and pay for their own AI usage. You set a markup percentage on your app and earn revenue on every request. This is the default and recommended mode.
- Users purchase balance in tiered dollar packages ($5-$50)
- Balance is universal: it works across all Quota apps
- You set a markup % and keep 100% of it (no platform fee)
- Payouts via Stripe Connect (daily, 7-day delay)
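The markup bullet above can be made concrete with a small arithmetic sketch. It assumes the markup is a percentage applied on top of the base model cost, which this page does not spell out; `priceRequest` and the sample figures are hypothetical and not part of the Quota API.

```javascript
// Hypothetical helper for illustration only; not part of the Quota API.
// Assumption: the markup is a percentage applied on top of the base model cost.
function priceRequest(baseCostUsd, markupPercent) {
  const markupUsd = baseCostUsd * (markupPercent / 100);
  return {
    baseCostUsd,
    markupUsd, // what you keep (the list above states there is no platform fee)
    userChargeUsd: baseCostUsd + markupUsd, // what the user's balance is charged
  };
}

// Example: a request with a $0.04 base cost and a 25% markup.
const quote = priceRequest(0.04, 25);
console.log(quote);
```

With these sample numbers the user would be charged about $0.05, of which about $0.01 is your markup.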
How it works
// Users buy balance via Stripe checkout (tiered packages)
// Then make AI requests billed to their balance:
await fetch("https://api.usequota.ai/v1/users/user_123/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-quota-your-api-key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-4o",
    messages: [{role: "user", content: "Hello!"}],
  }),
});
// Cost deducted from user's dollar balance (base + your markup)
Developer Grants
Grants are a feature within the user-pays model. You fund a grant pool and distribute free balance to new users as a promotional onboarding mechanism:
// Grant $0.50 of free balance to a new user
await fetch("https://api.usequota.ai/v1/funding/grant", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-quota-your-api-key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    external_user_id: "user_123",
    amount: 0.50,
    description: "Welcome bonus",
  }),
});
Once the grant is used up, users purchase their own balance to continue.
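If you issue grants from several places (signup, referrals, support credits), it may help to build the request in one spot and validate the amount before sending. The sketch below reuses the endpoint and body fields from the grant request above; the helper name `buildGrantRequest`, the local validation rule, and the `QUOTA_API_KEY` environment variable are assumptions for illustration, not part of a Quota SDK.

```javascript
// Hypothetical helper for illustration; not part of any Quota SDK.
// The URL, headers, and body fields mirror the grant request shown above.
function buildGrantRequest(apiKey, externalUserId, amountUsd, description) {
  // Assumption: grants must be positive dollar amounts; reject bad input locally.
  if (typeof amountUsd !== "number" || !Number.isFinite(amountUsd) || amountUsd <= 0) {
    throw new RangeError("amountUsd must be a positive number");
  }
  return {
    url: "https://api.usequota.ai/v1/funding/grant",
    options: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        external_user_id: externalUserId,
        amount: amountUsd,
        description,
      }),
    },
  };
}

// Usage at signup. The response body shape is not documented on this page,
// so only the HTTP status is checked:
// const { url, options } = buildGrantRequest(process.env.QUOTA_API_KEY, "user_123", 0.5, "Welcome bonus");
// const res = await fetch(url, options);
// if (!res.ok) throw new Error(`Grant failed with HTTP ${res.status}`);
```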
Developer Billing
In developer billing mode, all AI usage is charged to your developer account. This works well for:
- Internal tools
- Fixed-price products
- Freemium apps where you absorb AI costs
How it works
import OpenAI from "openai";
// All requests use your API key
const client = new OpenAI({
  baseURL: "https://api.usequota.ai/v1",
  apiKey: "sk-quota-your-api-key",
});
// Cost deducted from YOUR balance
await client.chat.completions.create({...});
Hosted Token Storage
In hosted token storage mode, Quota stores OAuth tokens server-side and your app identifies users with the X-Quota-User header. You never handle tokens directly. This works well for:
- Server-rendered apps (Next.js, Nuxt, SvelteKit, etc.)
- Apps that want user billing without managing tokens
- Quick prototypes that need user-level billing
How it works
// After the user connects via OAuth, Quota stores the token link.
// Your app just sends X-Quota-User with your API key:
await fetch("https://api.usequota.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-quota-your-api-key",
    "X-Quota-User": "user_123",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-4o-mini",
    messages: [{role: "user", content: "Hello!"}],
  }),
});
// Cost deducted from the linked USER's dollar balance
See Hosted Users API for details, Core SDK for a framework-agnostic setup, or Next.js SDK for a turnkey Next.js setup.
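Judging by the examples on this page, the hosted request differs from a developer-billed one only by the `X-Quota-User` header. The sketch below leans on that inference, which the page itself does not state as a rule: it assumes that omitting the header bills your developer account and including it bills the linked user. `quotaHeaders` is a hypothetical helper, not part of any Quota SDK.

```javascript
// Hypothetical helper; not part of any Quota SDK.
// Assumption (inferred from the examples on this page): omitting X-Quota-User
// bills the developer account, while including it bills the linked user.
function quotaHeaders(apiKey, userId) {
  const headers = {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
  if (userId) {
    headers["X-Quota-User"] = userId; // bill the user linked to this ID
  }
  return headers;
}

// Usage:
// await fetch("https://api.usequota.ai/v1/chat/completions", {
//   method: "POST",
//   headers: quotaHeaders("sk-quota-your-api-key", "user_123"),
//   body: JSON.stringify({ model: "gpt-4o-mini", messages: [{ role: "user", content: "Hello!" }] }),
// });
```

Keeping the header logic in one function makes it harder to bill the wrong party by accidentally dropping the user ID in a single call site.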
Choosing a Mode
Pick the billing mode that matches your product and revenue model.
User Pays (Default)
Users buy their own balance via tiered dollar packages. You earn revenue through markup. Best for:
- Consumer apps where users pay for AI usage
- Marketplaces and plugin ecosystems
- Any app where you want to earn revenue from AI usage
Developer Billing
You pay for all AI usage from your developer balance. Best for:
- Internal tools and admin dashboards
- Free-tier features where you absorb costs
- Fixed-price subscriptions with predictable usage
User Billing (OAuth)
Users connect their own Quota wallet via OAuth and pay from their own balance. Best for:
- Consumer apps with "bring your own wallet" models
- Platforms where users control their own spending
- Third-party integrations and plugin ecosystems
Comparison
| Consideration | User Pays (Default) | Developer Billing | User Billing (OAuth) |
|---|---|---|---|
| Setup complexity | Simple | Simple | OAuth flow required |
| Who pays | End user (dollar packages) | Developer | End user (self-funded) |
| Developer revenue | 100% of markup via Stripe Connect | None | 100% of markup via Stripe Connect |
| Revenue model | Per-use markup | Subscription or free | Per-use markup |
Rate Limits
All billing modes share the same rate limits. The default is 100 requests/minute per API key; auth endpoints use stricter limits (3-5 requests/minute). See Authentication for rate limit headers and details.
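To stay under the default limit from the client side, one option is a small sliding-window guard in front of your requests. The following is a minimal sketch, not Quota's server-side algorithm (which this page does not describe); `createRateLimiter` and its parameters are hypothetical names, and the injectable `now` clock exists only to make the logic easy to test.

```javascript
// Hypothetical client-side limiter; not part of any Quota SDK.
// Tracks send times in a sliding window and reports how long to wait.
function createRateLimiter(maxRequests = 100, windowMs = 60000, now = Date.now) {
  const timestamps = []; // send times inside the current window, oldest first

  return {
    // Returns 0 if a request may be sent now (and records it),
    // otherwise the number of milliseconds to wait before trying again.
    tryAcquire() {
      const t = now();
      // Drop timestamps that have aged out of the window.
      while (timestamps.length > 0 && t - timestamps[0] >= windowMs) {
        timestamps.shift();
      }
      if (timestamps.length < maxRequests) {
        timestamps.push(t);
        return 0;
      }
      // Window is full: wait until the oldest request leaves it.
      return windowMs - (t - timestamps[0]);
    },
  };
}

// Usage: check before each request and wait if needed.
// const limiter = createRateLimiter();
// const waitMs = limiter.tryAcquire();
// if (waitMs > 0) await new Promise((resolve) => setTimeout(resolve, waitMs));
```

A guard like this only covers a single process; if several servers share one API key, each would need a smaller budget or a shared counter.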