How it works
cLLMHub is a two-sided marketplace. Hosts publish models and set prices. Developers load credits and pay per token. Hosts keep 85%.
For hosts — turn your GPU into a paid API
Point at your model
Run Ollama, vLLM, llama.cpp, or MLX locally — on one machine or many. Any backend that speaks the OpenAI chat completions format works.
Publish and price
Run cllmhub publish. The CLI discovers your models and registers them with the catalog. From the dashboard, set your price per million input and output tokens.
Earn as developers use it
When a developer calls your model, you keep 85% of every dollar. Track earnings live from the dashboard and request a PayPal payout whenever you want.
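To make the split concrete, here is a small sketch of the billing arithmetic. Only the 85% host share comes from this page; the per-million-token prices and token counts below are made-up illustration values, not cLLMHub defaults:

```python
# Sketch of the 85/15 revenue split. HOST_SHARE is from the page;
# the prices and token counts are hypothetical illustration values.

HOST_SHARE = 0.85  # hosts keep 85% of every dollar


def request_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """USD cost of one request, given per-million-token prices."""
    return (input_tokens / 1e6) * price_in_per_m + (
        output_tokens / 1e6
    ) * price_out_per_m


# Example: 200k input + 100k output tokens at $0.50/M in, $1.50/M out.
cost = request_cost(200_000, 100_000, price_in_per_m=0.50, price_out_per_m=1.50)
host_earns = cost * HOST_SHARE
platform_fee = cost - host_earns

print(f"developer pays ${cost:.4f}")        # $0.2500
print(f"host earns     ${host_earns:.4f}")  # $0.2125
print(f"platform fee   ${platform_fee:.4f}")  # $0.0375
```

At these example prices, a quarter-dollar request pays the host 21.25 cents.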
For developers — pay per token, no subscription
Load credits
Add USD to your account via PayPal — no subscription, no minimum. Your balance is just dollars; spend it on whichever models you want.
Create an API key
Generate a key from the dashboard. Optionally restrict it by model, IP, daily request cap, or maximum price per million tokens.
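A scoped key might carry limits along these lines. This JSON is purely illustrative: the field names and values are hypothetical, not the actual dashboard or API schema, but they map one-to-one onto the restrictions listed above:

```json
{
  "name": "ci-pipeline-key",
  "models": ["llama-3.1-8b-instruct"],
  "allowed_ips": ["203.0.113.7"],
  "daily_request_cap": 10000,
  "max_price_per_million_tokens_usd": 2.00
}
```

A price ceiling like the last field is useful when hosts can change prices: requests that would exceed it simply fail instead of silently draining your balance.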
Call any model
Point any OpenAI SDK at cllmhub.com/v1 and send requests. You only pay for the tokens you use, at the price the host has posted.
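Because the endpoint speaks the OpenAI chat completions format, any OpenAI-compatible client should work. Below is a dependency-free sketch using only Python's standard library; the API key and model name are placeholders, and the request is constructed but deliberately not sent:

```python
import json
import urllib.request

# Placeholders: substitute a real key and a model from the catalog.
API_KEY = "sk-your-cllmhub-key"
MODEL = "llama-3.1-8b-instruct"

# OpenAI-style chat completions payload.
payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = urllib.request.Request(
    "https://cllmhub.com/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# To actually send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

With the official openai Python package, the equivalent is to construct the client with `base_url="https://cllmhub.com/v1"` and your cLLMHub key as `api_key` — no other code changes.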
One simple fee
Hosts keep 85% of every dollar earned. cLLMHub takes a flat 15% platform fee to cover routing, billing, payouts, and infrastructure. No tiers, no hidden charges. Self-consumption — when you call your own model — is always free.